R-universe search: rules

tidymodels

rules:Model Wrappers for Rule-Based Models

Bindings for additional models for use with the 'parsnip' package. Models include prediction rule ensembles (Friedman and Popescu, 2008) <doi:10.1214/07-AOAS148>, C5.0 rules (Quinlan, 1992 ISBN: 1558602380), and Cubist (Kuhn and Johnson, 2013) <doi:10.1007/978-1-4614-6849-3>.

Maintained by Emil Hvitfeldt. Last updated 5 months ago.

57.2 match 40 stars 9.52 score 20k scripts 1 dependents

mhahsler

arules:Mining Association Rules and Frequent Itemsets

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.

Maintained by Michael Hahsler. Last updated 1 months ago.

arules association-rules frequent-itemsets

30.6 match 194 stars 13.99 score 3.3k scripts 28 dependents

marjoleinf

pre:Prediction Rule Ensembles

Derives prediction rule ensembles (PREs). Largely follows the procedure for deriving PREs as described in Friedman & Popescu (2008; <DOI:10.1214/07-AOAS148>), with adjustments and improvements. The main function pre() derives prediction rule ensembles consisting of rules and/or linear terms for continuous, binary, count, multinomial, and multivariate continuous responses. Function gpe() derives generalized prediction ensembles, consisting of rules, hinge and linear functions of the predictor variables.

Maintained by Marjolein Fokkema. Last updated 9 months ago.

38.2 match 58 stars 8.55 score 98 scripts 1 dependents

assuom44

arlclustering:Exploring Social Network Structures Through Friendship-Driven Community Detection with Association Rules Mining

Implements an innovative approach to community detection in social networks using Association Rules Learning. The package provides tools for processing graph and rules objects, generating association rules, and detecting communities based on node interactions. Designed to facilitate advanced research in Social Network Analysis, this package leverages association rules learning for enhanced community detection. This approach is described in El-Moussaoui et al. (2021) <doi:10.1007/978-3-030-66840-2_3>.

Maintained by Mohamed El-Moussaoui. Last updated 6 months ago.

38.1 match 6.45 score 50 scripts

mhahsler

arulesViz:Visualizing Association Rules and Frequent Itemsets

Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration. Michael Hahsler (2017) <doi:10.32614/RJ-2017-047>.

Maintained by Michael Hahsler. Last updated 7 months ago.

arules association-rules frequent-itemsets interactive-visualizations visualization

21.8 match 54 stars 11.00 score 1.7k scripts 2 dependents

echasnovski

ruler:Tidy Data Validation Reports

Tools for creating data validation pipelines and tidy reports. This package offers a framework for exploring and validating data frame like objects using 'dplyr' grammar of data manipulation.

Maintained by Evgeni Chasnovski. Last updated 2 years ago.

35.7 match 31 stars 6.70 score 36 scripts

janusza

RoughSets:Data Analysis Using Rough Set and Fuzzy Rough Set Theories

Implementations of algorithms for data analysis based on the rough set theory (RST) and the fuzzy rough set theory (FRST). We not only provide implementations for the basic concepts of RST and FRST but also popular algorithms that derive from those theories. The methods included in the package can be divided into several categories based on their functionality: discretization, feature selection, instance selection, rule induction and classification based on nearest neighbors. RST was introduced by Zdzisław Pawlak in 1982 as a sophisticated mathematical tool to model and process imprecise or incomplete information. By using the indiscernibility relation for objects/instances, RST does not require additional parameters to analyze the data. FRST is an extension of RST. The FRST combines concepts of vagueness and indiscernibility that are expressed with fuzzy sets (as proposed by Zadeh, in 1965) and RST.

Maintained by Christoph Bergmeir. Last updated 5 years ago.

cpp

33.7 match 37 stars 5.61 score 185 scripts

beerda

lfl:Linguistic Fuzzy Logic

Various algorithms related to linguistic fuzzy logic: mining for linguistic fuzzy association rules, composition of fuzzy relations, performing perception-based logical deduction (PbLD), and forecasting time-series using fuzzy rule-based ensemble (FRBE). The package also contains basic fuzzy-related algebraic functions capable of handling missing values in different styles (Bochvar, Sobocinski, Kleene etc.), computation of Sugeno integrals and fuzzy transform.

Maintained by Michal Burda. Last updated 5 months ago.

association-rules forecast-model fuzzy-logic inference-rules cpp openmp

35.1 match 8 stars 5.35 score 28 scripts

bioc

TFARM:Transcription Factors Association Rules Miner

It searches for relevant associations of transcription factors with a transcription factor target, in specific genomic regions. It also allows to evaluate the Importance Index distribution of transcription factors (and combinations of transcription factors) in association rules.

Maintained by Liuba Nausicaa Martino. Last updated 5 months ago.

biologicalquestion infrastructure statisticalmethod transcription

46.8 match 4.00 score 2 scripts

data-cleaning

validatetools:Checking and Simplifying Validation Rule Sets

Rule sets with validation rules may contain redundancies or contradictions. Functions for finding redundancies and problematic rules are provided, given a set a rules formulated with 'validate'.

Maintained by Edwin de Jonge. Last updated 9 months ago.

data-cleaning rules validation

41.9 match 15 stars 4.47 score 39 scripts

data-cleaning

validate:Data Validation Infrastructure

Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.

Maintained by Mark van der Loo. Last updated 15 days ago.

data-cleaning validation

14.7 match 418 stars 12.50 score 448 scripts 9 dependents

davisvaughan

almanac:Tools for Working with Recurrence Rules

Provides tools for defining recurrence rules and recurrence sets. Recurrence rules are a programmatic way to define a recurring event, like the first Monday of December. Multiple recurrence rules can be combined into larger recurrence sets. A full holiday and calendar interface is also provided that can generate holidays within a particular year, can detect if a date is a holiday, can respect holiday observance rules, and allows for custom holidays.

Maintained by Davis Vaughan. Last updated 2 years ago.

calendars holidays recurrence-rules

20.4 match 73 stars 8.40 score 65 scripts 1 dependents

cran

gaussquad:Collection of Functions for Gaussian Quadrature

A collection of functions to perform Gaussian quadrature with different weight functions corresponding to the orthogonal polynomials in package orthopolynom. Examples verify the orthogonality and inner products of the polynomials.

Maintained by Frederick Novomestky. Last updated 3 years ago.

74.8 match 2.18 score 5 dependents

fk83

scoringRules:Scoring Rules for Parametric and Simulated Distribution Forecasts

Dictionary-like reference for computing scoring rules in a wide range of situations. Covers both parametric forecast distributions (such as mixtures of Gaussians) and distributions generated via simulation. Further details can be found in the package vignettes <doi:10.18637/jss.v090.i12>, <doi:10.18637/jss.v110.i08>.

Maintained by Fabian Krueger. Last updated 6 months ago.

openblas cpp

14.2 match 59 stars 11.33 score 408 scripts 13 dependents

data-cleaning

errorlocate:Locate Errors with Validation Rules

Errors in data can be located and removed using validation rules from package 'validate'. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, chapter 7.

Maintained by Edwin de Jonge. Last updated 9 months ago.

data-cleaning errors invalidation

24.8 match 22 stars 6.11 score 59 scripts

bayes-rules

bayesrules:Datasets and Supplemental Functions from Bayes Rules! Book

Provides datasets and functions used for analysis and visualizations in the Bayes Rules! book (<https://www.bayesrulesbook.com>). The package contains a set of functions that summarize and plot Bayesian models from some conjugate families and another set of functions for evaluation of some Bayesian models.

Maintained by Mine Dogucu. Last updated 3 years ago.

bayesian-statistics data

18.4 match 72 stars 8.06 score 466 scripts

r-lib

cli:Helpers for Developing Command Line Interfaces

A suite of tools to build attractive command line interfaces ('CLIs'), from semantic elements: headings, lists, alerts, paragraphs, etc. Supports custom themes via a 'CSS'-like language. It also contains a number of lower level 'CLI' elements: rules, boxes, trees, and 'Unicode' symbols with 'ASCII' alternatives. It support ANSI colors and text styles as well.

Maintained by Gábor Csárdi. Last updated 2 days ago.

cli

7.6 match 664 stars 19.34 score 1.4k scripts 14k dependents

dexter-psychometrics

dexter:Data Management and Analysis of Tests

A system for the management, assessment, and psychometric analysis of data from educational and psychological tests.

Maintained by Jesse Koops. Last updated 8 days ago.

openblas cpp openmp

16.0 match 8 stars 8.97 score 135 scripts 2 dependents

joshuaulrich

TTR:Technical Trading Rules

A collection of over 50 technical indicators for creating technical trading rules. The package also provides fast implementations of common rolling-window functions, and several volatility calculations.

Maintained by Joshua Ulrich. Last updated 1 years ago.

algorithmic-trading finance technical-analysis

9.2 match 338 stars 15.11 score 2.8k scripts 359 dependents

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 months ago.

data-science data-visualization machine-learning machine-learning-library visualization

19.6 match 145 stars 7.09 score 50 scripts 2 dependents

moore-institute-4-plastic-pollution-res

One4All:Validate, Share, and Download Data

Designed to enhance data validation and management processes by employing a set of functions that read a set of rules from a 'CSV' or 'Excel' file and apply them to a dataset. Funded by the National Renewable Energy Laboratory and Possibility Lab, maintained by the Moore Institute for Plastic Pollution Research.

Maintained by Hannah Sherrod. Last updated 8 months ago.

21.1 match 3 stars 6.33 score 15 scripts

mhahsler

arulesCBA:Classification Based on Association Rules

Provides the infrastructure for association rule-based classification including the algorithms CBA, CMAR, CPAR, C4.5, FOIL, PART, PRM, RCAR, and RIPPER to build associative classifiers. Hahsler et al (2019) <doi:10.32614/RJ-2019-048>.

Maintained by Michael Hahsler. Last updated 7 months ago.

association-rules classification

23.4 match 3 stars 5.49 score 47 scripts 1 dependents

bradleyjeck

epanet2toolkit:Call 'EPANET' Functions to Simulate Pipe Networks

Enables simulation of water piping networks using 'EPANET'. The package provides functions from the 'EPANET' programmer's toolkit as R functions so that basic or customized simulations can be carried out from R. The package uses 'EPANET' version 2.2 from Open Water Analytics <https://github.com/OpenWaterAnalytics/EPANET/releases/tag/v2.2>.

Maintained by Bradley Eck. Last updated 3 months ago.

epanet epanet-api simulation water water-distribution-networks

24.4 match 15 stars 5.17 score 66 scripts

beerda

rmake:Makefile Generator for R Analytical Projects

Creates and maintains a build process for complex analytic tasks in R. Package allows to easily generate Makefile for the (GNU) 'make' tool, which drives the build process by (in parallel) executing build commands in order to update results accordingly to given dependencies on changed data or updated source files.

Maintained by Michal Burda. Last updated 3 years ago.

makefile rmake

29.9 match 1 stars 4.11 score 26 scripts

insightsengineering

chevron:Standard TLGs for Clinical Trials Reporting

Provide standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. In addition, it also provides comprehensive data checks and script generation functionality.

Maintained by Joe Zhu. Last updated 27 days ago.

clinical-trials graphs listings nest reporting tables

14.6 match 12 stars 8.24 score 12 scripts

insightsengineering

dunlin:Preprocessing Tools for Clinical Trial Data

A collection of functions to preprocess data and organize them in a format amenable to use by chevron.

Maintained by Joe Zhu. Last updated 27 days ago.

15.6 match 4 stars 7.38 score 30 scripts 1 dependents

merck

Rspc:Nelson Rules for Control Charts

Description: Rspc is an implementation of nelson rules for control charts in R. The Rspc package implements some Statistical Process Control methods, namely Levey-Jennings type of I (individuals) chart, Shewhart C (count) chart and Nelson rules. Typical workflow is taking the time series, specify the control limits, and list of Nelson rules you want to evaluate. There are several options how to modify the rules (one sided limits, numerical parameters of rules, etc.). Package is also capable of calculating the control limits from the data (so far only for i-chart and c-chart are implemented).

Maintained by Stanislav Matousek. Last updated 7 years ago.

22.2 match 1 stars 5.04 score 24 scripts

r-lib

styler:Non-Invasive Pretty Printing of R Code

Pretty-prints R code without changing the user's formatting intent.

Maintained by Lorenz Walthert. Last updated 1 months ago.

pretty-print

6.5 match 754 stars 16.15 score 940 scripts 62 dependents

cran

stoppingrule:Create and Evaluate Stopping Rules for Safety Monitoring

Provides functions for creating, displaying, and evaluating stopping rules for safety monitoring in clinical studies.

Maintained by Michael J. Martens. Last updated 1 months ago.

44.9 match 2.30 score

bioc

TVTB:TVTB: The VCF Tool Box

The package provides S4 classes and methods to filter, summarise and visualise genetic variation data stored in VCF files. In particular, the package extends the FilterRules class (S4Vectors package) to define news classes of filter rules applicable to the various slots of VCF objects. Functionalities are integrated and demonstrated in a Shiny web-application, the Shiny Variant Explorer (tSVE).

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software genetics geneticvariability genomicvariation datarepresentation gui dnaseq wholegenome visualization multiplecomparison dataimport variantannotation sequencing coverage alignment sequencematching

17.5 match 2 stars 5.76 score 16 scripts

easystats

effectsize:Indices of Effect Size

Provide utilities to work with indices of effect size for a wide variety of models and hypothesis tests (see list of supported models using the function 'insight::supported_models()'), allowing computation of and conversion between indices such as Cohen's d, r, odds, etc. References: Ben-Shachar et al. (2020) <doi:10.21105/joss.02815>.

Maintained by Mattan S. Ben-Shachar. Last updated 2 months ago.

anova cohens-d compute conversion correlation effect-size effectsize hacktoberfest hedges-g interpretation standardization standardized statistics

6.0 match 344 stars 16.38 score 1.8k scripts 29 dependents

openpharma

crmPack:Object-Oriented Implementation of CRM Designs

Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to setup a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules. Further details are presented in Sabanes Bove et al. (2019) <doi:10.18637/jss.v089.i10>.

Maintained by Daniel Sabanes Bove. Last updated 2 months ago.

jags cpp

12.6 match 21 stars 7.79 score 208 scripts

pacificcommunity

AMPLE:Shiny Apps to Support Capacity Building on Harvest Control Rules

Three Shiny apps are provided that introduce Harvest Control Rules (HCR) for fisheries management. 'Introduction to HCRs' provides a simple overview to how HCRs work. Users are able to select their own HCR and step through its performance, year by year. Biological variability and estimation uncertainty are introduced. 'Measuring performance' builds on the previous app and introduces the idea of using performance indicators to measure HCR performance. 'Comparing performance' allows multiple HCRs to be created and tested, and their performance compared so that the preferred HCR can be selected.

Maintained by Finlay Scott. Last updated 1 years ago.

ofp sam shiny

17.6 match 5.56 score 24 scripts

rapporter

pander:An R 'Pandoc' Writer

Contains some functions catching all messages, 'stdout' and other useful information while evaluating R code and other helpers to return user specified text elements (like: header, paragraph, table, image, lists etc.) in 'pandoc' markdown or several type of R objects similarly automatically transformed to markdown format. Also capable of exporting/converting (the resulting) complex 'pandoc' documents to e.g. HTML, 'PDF', 'docx' or 'odt'. This latter reporting feature is supported in brew syntax or with a custom reference class with a smarty caching 'backend'.

Maintained by Gergely Daróczi. Last updated 19 days ago.

literate-programming markdown pandoc pandoc-markdown reproducible-research rmarkdown cpp

5.9 match 297 stars 16.60 score 7.6k scripts 108 dependents

nacnudus

tidyxl:Read Untidy Excel Files

Imports non-tabular from Excel files into R. Exposes cell content, position and formatting in a tidy structure for further manipulation. Tokenizes Excel formulas. Supports '.xlsx' and '.xlsm' via the embedded 'RapidXML' C++ library <https://rapidxml.sourceforge.net>. Does not support '.xlsb' or '.xls'.

Maintained by Duncan Garmonsway. Last updated 1 years ago.

excelreader rcpp spreadsheet tidy cpp

9.0 match 251 stars 10.69 score 382 scripts 13 dependents

firefly-cpp

niarules:Numerical Association Rule Mining using Population-Based Nature-Inspired Algorithms

Framework is devoted to mining numerical association rules through the utilization of nature-inspired algorithms for optimization. Drawing inspiration from the 'NiaARM' 'Python' and the 'NiaARM' 'Julia' packages, this repository introduces the capability to perform numerical association rule mining in the R programming language. Fister Jr., Iglesias, Galvez, Del Ser, Osaba and Fister (2018) <doi:10.1007/978-3-030-03493-1_9>.

Maintained by Iztok Jr. Fister. Last updated 15 days ago.

association-rules metaheuristics optimization

25.9 match 1 stars 3.70 score 2 scripts

cloudyr

googleComputeEngineR:R Interface with Google Compute Engine

Interact with the 'Google Compute Engine' API in R. Lets you create, start and stop instances in the 'Google Cloud'. Support for preconfigured instances, with templates for common R needs.

Maintained by Mark Edmondson. Last updated 4 days ago.

api cloud-computing cloudyr google-cloud googleauthr launching-virtual-machines

9.6 match 152 stars 9.73 score 235 scripts

spatstat

spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.

Maintained by Adrian Baddeley. Last updated 3 days ago.

cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics

9.2 match 1 stars 10.18 score 67 scripts 149 dependents

spatstat

spatstat.random:Random Generation Functionality for the 'spatstat' Family

Functionality for random generation of spatial data in the 'spatstat' family of packages. Generates random spatial patterns of points according to many simple rules (complete spatial randomness, Poisson, binomial, random grid, systematic, cell), randomised alteration of patterns (thinning, random shift, jittering), simulated realisations of random point processes including simple sequential inhibition, Matern inhibition models, Neyman-Scott cluster processes (using direct, Brix-Kendall, or hybrid algorithms), log-Gaussian Cox processes, product shot noise cluster processes and Gibbs point processes (using Metropolis-Hastings birth-death-shift algorithm, alternating Gibbs sampler, or coupling-from-the-past perfect simulation). Also generates random spatial patterns of line segments, random tessellations, and random images (random noise, random mosaics). Excludes random generation on a linear network, which is covered by the separate package 'spatstat.linnet'.

Maintained by Adrian Baddeley. Last updated 18 hours ago.

point-processes random-generation simulation spatial-sampling spatial-simulation cpp

8.1 match 5 stars 10.85 score 84 scripts 175 dependents

topepo

Cubist:Rule- And Instance-Based Regression Modeling

Regression modeling using rules with added instance-based corrections.

Maintained by Max Kuhn. Last updated 9 months ago.

6.9 match 40 stars 12.38 score 2.8k scripts 18 dependents

yizhenxu

TGST:Targeted Gold Standard Testing

Functions for implementing the targeted gold standard (GS) testing. You provide the true disease or treatment failure status and the risk score, tell 'TGST' the availability of GS tests and which method to use, and it returns the optimal tripartite rules. Please refer to Liu et al. (2013) <doi:10.1080/01621459.2013.810149> for more details.

Maintained by Yizhen Xu. Last updated 4 years ago.

22.4 match 3.70 score

ralmond

EIEvent:Evidence Identification Event Processing Engine

Extracts observables from a sequence of events. Uses a prolog-like rule language to do the extraction, written in JSON.

Maintained by Russell Almond. Last updated 1 years ago.

assessment-scoring evidence-centered-design

40.6 match 2.00 score 2 scripts

blue-matter

SAMtool:Stock Assessment Methods Toolkit

Simulation tools for closed-loop simulation are provided for the 'MSEtool' operating model to inform data-rich fisheries. 'SAMtool' provides a conditioning model, assessment models of varying complexity with standardized reporting, model-based management procedures, and diagnostic tools for evaluating assessments inside closed-loop simulation.

Maintained by Quang Huynh. Last updated 23 days ago.

cpp

12.3 match 3 stars 6.49 score 36 scripts 1 dependents

topepo

C50:C5.0 Decision Trees and Rule-Based Models

C5.0 decision trees and rule-based models for pattern recognition that extend the work of Quinlan (1993, ISBN:1-55860-238-0).

Maintained by Max Kuhn. Last updated 2 years ago.

6.4 match 50 stars 11.99 score 1.3k scripts 13 dependents

bioc

snpStats:SnpMatrix and XSnpMatrix classes and methods

Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.

Maintained by David Clayton. Last updated 5 months ago.

microarray snp geneticvariability zlib

8.0 match 9.48 score 674 scripts 20 dependents

matsuurakentaro

RLoptimal:Optimal Adaptive Allocation Using Deep Reinforcement Learning

An implementation to compute an optimal adaptive allocation rule using deep reinforcement learning in a dose-response study (Matsuura et al. (2022) <doi:10.1002/sim.9247>). The adaptive allocation rule can directly optimize a performance metric, such as power, accuracy of the estimated target dose, or mean absolute error over the estimated dose-response curve.

Maintained by Kentaro Matsuura. Last updated 2 months ago.

12.7 match 4 stars 5.95 score 21 scripts

i02momuj

RKEEL:Using 'KEEL' in R Code

'KEEL' is a popular 'Java' software for a large number of different knowledge data discovery tasks. This package takes the advantages of 'KEEL' and R, allowing to use 'KEEL' algorithms in simple R code. The implemented R code layer between R and 'KEEL' makes easy both using 'KEEL' algorithms in R as implementing new algorithms for 'RKEEL' in a very simple way. It includes more than 100 algorithms for classification, regression, preprocess, association rules and imbalance learning, which allows a more complete experimentation process. For more information about 'KEEL', see <http://www.keel.es/>.

Maintained by Jose M. Moyano. Last updated 2 years ago.

openjdk

31.3 match 2 stars 2.41 score 130 scripts

cran

matRiks:Generates Raven-Like Matrices According to Rules

Generates Raven like matrices according to different rules and the response list associated to the matrix. The package can generate matrices composed of 4 or 9 cells, along with a response list of 11 elements (the correct response + 10 incorrect responses). The matrices can be generated according to both logical rules (i.e., the relationships between the elements in the matrix are manipulated to create the matrix) and visual-spatial rules (i.e., the visual or spatial characteristics of the elements are manipulated to generate the matrix). The graphical elements of this package are based on the 'DescTools' package. This package has been developed within the PRIN2020 Project (Prot. 20209WKCLL) titled "Computerized, Adaptive and Personalized Assessment of Executive Functions and Fluid Intelligence" and founded by the Italian Ministry of Education and Research.

Maintained by Andrea Brancaccio. Last updated 1 years ago.

19.6 match 3.60 score

tlverse

tmle3mopttx:Targeted Maximum Likelihood Estimation of the Mean under Optimal Individualized Treatment

This package estimates the optimal individualized treatment rule for the categorical treatment using Super Learner (sl3). In order to avoid nested cross-validation, it uses split-specific estimates of Q and g to estimate the rule as described by Coyle et al. In addition, it provides the Targeted Maximum Likelihood estimates of the mean performance using CV-TMLE under such estimated rules. This is an adapter package for use with the tmle3 framework and the tlverse software ecosystem for Targeted Learning.

Maintained by Ivana Malenica. Last updated 3 years ago.

categorical-treatment causal-inference heterogeneous-effects machine-learning optimal-individualized-treatment targeted-learning variable-importance

16.5 match 12 stars 4.25 score 49 scripts 1 dependents

malaga-fca-group

fcaR:Formal Concept Analysis

Provides tools to perform fuzzy formal concept analysis, presented in Wille (1982) <doi:10.1007/978-3-642-01815-2_23> and in Ganter and Obiedkov (2016) <doi:10.1007/978-3-662-49291-8>. It provides functions to load and save a formal context, extract its concept lattice and implications. In addition, one can use the implications to compute semantic closures of fuzzy sets and, thus, build recommendation systems.

Maintained by Domingo Lopez Rodriguez. Last updated 2 years ago.

formal-concept-analysis cpp

11.4 match 6 stars 6.02 score 70 scripts

skranz

gtree:gtree basic functionality to model and solve games

gtree basic functionality to model and solve games

Maintained by Sebastian Kranz. Last updated 4 years ago.

economic-experiments economics gambit game-theory nash-equilibrium

18.1 match 18 stars 3.79 score 23 scripts 1 dependents

davzim

dataverifyr:A Lightweight, Flexible, and Fast Data Validation Package that Can Handle All Sizes of Data

Allows you to define rules which can be used to verify a given dataset. The package acts as a thin wrapper around more powerful data packages such as 'dplyr', 'data.table', 'arrow', and 'DBI' ('SQL'), which do the heavy lifting.

Maintained by David Zimmermann-Kollenda. Last updated 1 years ago.

verification

16.6 match 27 stars 4.13 score 7 scripts

pnovack-gottshall

ecospace:Simulating Community Assembly and Ecological Diversification Using Ecospace Frameworks

Implements stochastic simulations of community assembly (ecological diversification) using customizable ecospace frameworks (functional trait spaces). Provides a wrapper to calculate common ecological disparity and functional ecology statistical dynamics as a function of species richness. Functions are written so they will work in a parallel-computing environment.

Maintained by Phil Novack-Gottshall. Last updated 5 years ago.

14.8 match 5 stars 4.51 score 13 scripts

beerda

nuggets:Extensible Data Pattern Searching Framework

Extensible framework for subgroup discovery (Atzmueller (2015) <doi:10.1002/widm.1144>), contrast patterns (Chen (2022) <doi:10.48550/arXiv.2209.13556>), emerging patterns (Dong (1999) <doi:10.1145/312129.312191>), association rules (Agrawal (1994) <https://www.vldb.org/conf/1994/P487.PDF>) and conditional correlations (Hájek (1978) <doi:10.1007/978-3-642-66943-9>). Both crisp (Boolean, binary) and fuzzy data are supported. It generates conditions in the form of elementary conjunctions, evaluates them on a dataset and checks the induced sub-data for interesting statistical properties. A user-defined function may be defined to evaluate on each generated condition to search for custom patterns.

Maintained by Michal Burda. Last updated 7 days ago.

association-rule-mining contrast-pattern-mining data-mining fuzzy knowledge-discovery pattern-recognition cpp openmp

12.4 match 2 stars 5.38 score 10 scripts

john-harrold

ruminate:A Pharmacometrics Data Transformation and Analysis Tool

Exploration of pharmacometrics data involves both general tools (transformation and plotting) and specific techniques (non-compartmental analysis). This kind of exploration is generally accomplished by utilizing different packages. The purpose of 'ruminate' is to create a 'shiny' interface to make these tools more broadly available while creating reproducible results.

Maintained by John Harrold. Last updated 9 days ago.

9.3 match 2 stars 7.06 score 84 scripts

nourmarzouka

multiclassPairs:Build MultiClass Pair-Based Classifiers using TSPs or RF

A toolbox to train a single sample classifier that uses in-sample feature relationships. The relationships are represented as feature1 < feature2 (e.g. gene1 < gene2). We provide two options to go with. First is based on 'switchBox' package which uses Top-score pairs algorithm. Second is a novel implementation based on random forest algorithm. For simple problems we recommend to use one-vs-rest using TSP option due to its simplicity and for being easy to interpret. For complex problems RF performs better. Both lines filter the features first then combine the filtered features to make the list of all the possible rules (i.e. rule1: feature1 < feature2, rule2: feature1 < feature3, etc...). Then the list of rules will be filtered and the most important and informative rules will be kept. The informative rules will be assembled in an one-vs-rest model or in an RF model. We provide a detailed description with each function in this package to explain the filtration and training methodology in each line. Reference: Marzouka & Eriksson (2021) <doi:10.1093/bioinformatics/btab088>.

Maintained by Nour-al-dain Marzouka. Last updated 2 years ago.

classification

12.9 match 12 stars 4.82 score 11 scripts

r-cas

Ryacas:R Interface to the 'Yacas' Computer Algebra System

Interface to the 'yacas' computer algebra system (<http://www.yacas.org/>).

Maintained by Mikkel Meyer Andersen. Last updated 2 years ago.

cpp

6.1 match 40 stars 10.15 score 167 scripts 14 dependents

alexkowa

EnvStats:Package for Environmental Statistics, Including US EPA Guidance

Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).

Maintained by Alexander Kowarik. Last updated 20 days ago.

4.8 match 26 stars 12.80 score 2.4k scripts 46 dependents

ropensci

datapack:A Flexible Container to Transport and Manipulate Data and Associated Resources

Provides a flexible container to transport and manipulate complex sets of data. These data may consist of multiple data files and associated meta data and ancillary files. Individual data objects have associated system level meta data, and data files are linked together using the OAI-ORE standard resource map which describes the relationships between the files. The OAI- ORE standard is described at <https://www.openarchives.org/ore/>. Data packages can be serialized and transported as structured files that have been created following the BagIt specification. The BagIt specification is described at <https://tools.ietf.org/html/draft-kunze-bagit-08>.

Maintained by Matthew B. Jones. Last updated 3 years ago.

7.2 match 44 stars 8.56 score 195 scripts 4 dependents

mariechion

mi4p:Multiple Imputation for Proteomics

A framework for multiple imputation for proteomics is proposed by Marie Chion, Christine Carapito and Frederic Bertrand (2021) <doi:10.1371/journal.pcbi.1010420>. It is dedicated to dealing with multiple imputation for proteomics.

Maintained by Frederic Bertrand. Last updated 6 months ago.

12.3 match 6 stars 4.91 score 27 scripts

bupaverse

processcheckR:Rule-Based Conformance Checking of Business Process Event Data

Check compliance of event-data from (business) processes with respect to specified rules. Rules supported are of three types: frequency (activities that should (not) happen x number of times), order (succession between activities) and exclusiveness (and and exclusive choice between activities).

Maintained by Gert Janssenswillen. Last updated 2 years ago.

10.5 match 4 stars 5.58 score 32 scripts 1 dependents

statisticsnorway

GaussSuppression:Tabular Data Suppression using Gaussian Elimination

A statistical disclosure control tool to protect tables by suppression using the Gaussian elimination secondary suppression algorithm (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6>. A suggestion is to start by working with functions SuppressSmallCounts() and SuppressDominantCells(). These functions use primary suppression functions for the minimum frequency rule and the dominance rule, respectively. Novel functionality for suppression of disclosive cells is also included. General primary suppression functions can be supplied as input to the general working horse function, GaussSuppressionFromData(). Suppressed frequencies can be replaced by synthetic decimal numbers as described in Langsrud (2019) <doi:10.1007/s11222-018-9848-9>.

Maintained by Øyvind Langsrud. Last updated 5 days ago.

8.8 match 2 stars 6.61 score 50 scripts

tidymodels

parsnip:A Common API to Modeling and Analysis Functions

A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc).

Maintained by Max Kuhn. Last updated 7 days ago.

3.5 match 612 stars 16.37 score 3.4k scripts 69 dependents

rubenfcasal

npsp:Nonparametric Spatial Statistics

Multidimensional nonparametric spatial (spatio-temporal) geostatistics. S3 classes and methods for multidimensional: linear binning, local polynomial kernel regression (spatial trend estimation), density and variogram estimation. Nonparametric methods for simultaneous inference on both spatial trend and variogram functions (for spatial processes). Nonparametric residual kriging (spatial prediction). For details on these methods see, for example, Fernandez-Casal and Francisco-Fernandez (2014) <doi:10.1007/s00477-013-0817-8> or Castillo-Paez et al. (2019) <doi:10.1016/j.csda.2019.01.017>.

Maintained by Ruben Fernandez-Casal. Last updated 4 months ago.

geostatistics spatial-data-analysis statistics fortran openblas

10.1 match 4 stars 5.71 score 64 scripts

felixthestudent

cellpypes:Cell Type Pipes for Single-Cell RNA Sequencing Data

Annotate single-cell RNA sequencing data manually based on marker gene thresholds. Find cell type rules (gene+threshold) through exploration, use the popular piping operator '%>%' to reconstruct complex cell type hierarchies. 'cellpypes' models technical noise to find positive and negative cells for a given expression threshold and returns cell type labels or pseudobulks. Cite this package as Frauhammer (2022) <doi:10.5281/zenodo.6555728> and visit <https://github.com/FelixTheStudent/cellpypes> for tutorials and newest features.

Maintained by Felix Frauhammer. Last updated 1 years ago.

celltype-annotation classification-algorithm scrnaseq single-cell-rna-seq

12.9 match 51 stars 4.41 score 8 scripts

john-harrold

ubiquity:PKPD, PBPK, and Systems Pharmacology Modeling Tools

Complete work flow for the analysis of pharmacokinetic pharmacodynamic (PKPD), physiologically-based pharmacokinetic (PBPK) and systems pharmacology models including: creation of ordinary differential equation-based models, pooled parameter estimation, individual/population based simulations, rule-based simulations for clinical trial design and modeling assays, deployment with a customizable 'Shiny' app, and non-compartmental analysis. System-specific analysis templates can be generated and each element includes integrated reporting with 'PowerPoint' and 'Word'.

Maintained by John Harrold. Last updated 20 days ago.

modeling pkpd

7.7 match 13 stars 7.14 score 33 scripts

jrnold

ggthemes:Extra Themes, Scales and Geoms for 'ggplot2'

Some extra themes, geoms, and scales for 'ggplot2'. Provides 'ggplot2' themes and scales that replicate the look of plots by Edward Tufte, Stephen Few, 'Fivethirtyeight', 'The Economist', 'Stata', 'Excel', and 'The Wall Street Journal', among others. Provides 'geoms' for Tufte's box plot and range frame.

Maintained by Jeffrey B. Arnold. Last updated 1 years ago.

data-visualisation ggplot2 ggplot2-themes plot plotting theme visualization

3.4 match 1.3k stars 16.17 score 40k scripts 102 dependents

kliegr

arc:Association Rule Classification

Implements the Classification-based on Association Rules (CBA) algorithm for association rule classification. The package, also described in Hahsler et al. (2019) <doi:10.32614/RJ-2019-048>, contains several convenience methods that allow to automatically set CBA parameters (minimum confidence, minimum support) and it also natively handles numeric attributes by integrating a pre-discretization step. The rule generation phase is handled by the 'arules' package. To further decrease the size of the CBA models produced by the 'arc' package, postprocessing by the 'qCBA' package is suggested.

Maintained by Tomas Kliegr. Last updated 6 months ago.

10.4 match 7 stars 5.09 score 39 scripts 1 dependents

michaelklein916

crso:Cancer Rule Set Optimization ('crso')

An algorithm for identifying candidate driver combinations in cancer. CRSO is based on a theoretical model of cancer in which a cancer rule is defined to be a collection of two or more events (i.e., alterations) that are minimally sufficient to cause cancer. A cancer rule set is a set of cancer rules that collectively are assumed to account for all of ways to cause cancer in the population. In CRSO every event is designated explicitly as a passenger or driver within each patient. Each event is associated with a patient-specific, event-specific passenger penalty, reflecting how unlikely the event would have happened by chance, i.e., as a passenger. CRSO evaluates each rule set by assigning all samples to a rule in the rule set, or to the null rule, and then calculating the total statistical penalty from all unassigned event. CRSO uses a three phase procedure find the best rule set of fixed size K for a range of Ks. A core rule set is then identified from among the best rule sets of size K as the rule set that best balances rule set size and statistical penalty. Users should consult the 'crso' vignette for an example walk through of a full CRSO run. The full description, of the CRSO algorithm is presented in: Klein MI, Cannataro V, Townsend J, Stern DF and Zhao H. "Identifying combinations of cancer driver in individual patients." BioRxiv 674234 [Preprint]. June 19, 2019. <doi:10.1101/674234>. Please cite this article if you use 'crso'.

Maintained by Michael Klein. Last updated 6 years ago.

22.7 match 2.32 score 21 scripts

dmurdoch

parseLatex:Parse 'LaTeX' Code

Exports an enhanced version of the tools::parseLatex() function to handle 'LaTeX' syntax more accurately. Also includes numerous functions for searching and modifying 'LaTeX' source.

Maintained by Duncan Murdoch. Last updated 4 days ago.

11.3 match 1 stars 4.65 score 8 scripts

r-forge

survey:Analysis of Complex Survey Samples

Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.

Maintained by "Thomas Lumley". Last updated 6 months ago.

cpp

3.8 match 1 stars 13.93 score 13k scripts 235 dependents

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 years ago.

4.1 match 29 stars 12.34 score 6.6k scripts 931 dependents

dfalbel

rslp:A Stemming Algorithm for the Portuguese Language

Implements the "Stemming Algorithm for the Portuguese Language" <DOI:10.1109/SPIRE.2001.10024>.

Maintained by Daniel Falbel. Last updated 5 years ago.

12.3 match 21 stars 4.10 score 12 scripts

jaredhuling

personalized:Estimation and Validation Methods for Subgroup Identification and Personalized Medicine

Provides functions for fitting and validation of models for subgroup identification and personalized medicine / precision medicine under the general subgroup identification framework of Chen et al. (2017) <doi:10.1111/biom.12676>. This package is intended for use for both randomized controlled trials and observational studies and is described in detail in Huling and Yu (2021) <doi:10.18637/jss.v098.i05>.

Maintained by Jared Huling. Last updated 3 years ago.

causal-inference heterogeneity-of-treatment-effect individualized-treatment-rules personalized-medicine precision-medicine subgroup-identification treatment-effects treatment-scoring

6.7 match 32 stars 7.38 score 125 scripts 1 dependents

vonjd

OneR:One Rule Machine Learning Classification Algorithm with Enhancements

Implements the One Rule (OneR) Machine Learning classification algorithm (Holte, R.C. (1993) <doi:10.1023/A:1022631118932>) with enhancements for sophisticated handling of numeric data and missing values together with extensive diagnostic functions. It is useful as a baseline for machine learning models and the rules are often helpful heuristics.

Maintained by Holger von Jouanne-Diedrich. Last updated 7 years ago.

5.6 match 40 stars 8.61 score 296 scripts 2 dependents

traminer

TraMineR:Trajectory Miner: a Sequence Analysis Toolkit

Set of sequence analysis tools for manipulating, describing and rendering categorical sequences, and more generally mining sequence data in the field of social sciences. Although this sequence analysis package is primarily intended for state or event sequences that describe time use or life courses such as family formation histories or professional careers, its features also apply to many other kinds of categorical sequence data. It accepts many different sequence representations as input and provides tools for converting sequences from one format to another. It offers several functions for describing and rendering sequences, for computing distances between sequences with different metrics (among which optimal matching), original dissimilarity-based analysis tools, and functions for extracting the most frequent event subsequences and identifying the most discriminating ones among them. A user's guide can be found on the TraMineR web page.

Maintained by Gilbert Ritschard. Last updated 3 months ago.

cpp

5.9 match 11 stars 8.24 score 534 scripts 13 dependents

softwaredeng

inTrees:Interpret Tree Ensembles

For tree ensembles such as random forests, regularized random forests and gradient boosted trees, this package provides functions for: extracting, measuring and pruning rules; selecting a compact rule set; summarizing rules into a learner; calculating frequent variable interactions; formatting rules in latex code. Reference: Interpreting tree ensembles with inTrees (Houtao Deng, 2019, <doi:10.1007/s41060-018-0144-8>).

Maintained by Houtao Deng. Last updated 11 months ago.

8.2 match 39 stars 5.85 score 72 scripts

cran

frbs:Fuzzy Rule-Based Systems for Classification and Regression Tasks

An implementation of various learning algorithms based on fuzzy rule-based systems (FRBSs) for dealing with classification and regression tasks. Moreover, it allows to construct an FRBS model defined by human experts. FRBSs are based on the concept of fuzzy sets, proposed by Zadeh in 1965, which aims at representing the reasoning of human experts in a set of IF-THEN rules, to handle real-life problems in, e.g., control, prediction and inference, data mining, bioinformatics data processing, and robotics. FRBSs are also known as fuzzy inference systems and fuzzy models. During the modeling of an FRBS, there are two important steps that need to be conducted: structure identification and parameter estimation. Nowadays, there exists a wide variety of algorithms to generate fuzzy IF-THEN rules automatically from numerical data, covering both steps. Approaches that have been used in the past are, e.g., heuristic procedures, neuro-fuzzy techniques, clustering methods, genetic algorithms, squares methods, etc. Furthermore, in this version we provide a universal framework named 'frbsPMML', which is adopted from the Predictive Model Markup Language (PMML), for representing FRBS models. PMML is an XML-based language to provide a standard for describing models produced by data mining and machine learning algorithms. Therefore, we are allowed to export and import an FRBS model to/from 'frbsPMML'. Finally, this package aims to implement the most widely used standard procedures, thus offering a standard package for FRBS modeling to the R community.

Maintained by Christoph Bergmeir. Last updated 5 years ago.

11.3 match 12 stars 4.18 score 1 dependents

msberends

AMR:Antimicrobial Resistance Data Analysis

Functions to simplify and standardise antimicrobial resistance (AMR) data analysis and to work with microbial and antimicrobial properties by using evidence-based methods, as described in <doi:10.18637/jss.v104.i03>.

Maintained by Matthijs S. Berends. Last updated 4 hours ago.

amr antimicrobial-data epidemiology microbiology software

3.9 match 92 stars 11.88 score 182 scripts 6 dependents

data-cleaning

dcmodify:Modify Data Using Externally Defined Modification Rules

Data cleaning scripts typically contain a lot of 'if this change that' type of statements. Such statements are typically condensed expert knowledge. With this package, such 'data modifying rules' are taken out of the code and become in stead parameters to the work flow. This allows one to maintain, document, and reason about data modification rules as separate entities.

Maintained by Mark van der Loo. Last updated 9 months ago.

7.2 match 10 stars 6.24 score 58 scripts

bioc

crisprScore:On-Target and Off-Target Scoring Algorithms for CRISPR gRNAs

Provides R wrappers of several on-target and off-target scoring methods for CRISPR guide RNAs (gRNAs). The following nucleases are supported: SpCas9, AsCas12a, enAsCas12a, and RfxCas13d (CasRx). The available on-target cutting efficiency scoring methods are RuleSet1, Azimuth, DeepHF, DeepCpf1, enPAM+GB, and CRISPRscan. Both the CFD and MIT scoring methods are available for off-target specificity prediction. The package also provides a Lindel-derived score to predict the probability of a gRNA to produce indels inducing a frameshift for the Cas9 nuclease. Note that DeepHF, DeepCpf1 and enPAM+GB are not available on Windows machines.

Maintained by Jean-Philippe Fortin. Last updated 5 months ago.

crispr functionalgenomics functionalprediction bioconductor bioconductor-package crispr-cas9 crispr-design crispr-target genomics grna grna-sequence grna-sequences scoring-algorithm sgrna sgrna-design

6.0 match 16 stars 7.52 score 19 scripts 4 dependents

jiefei-wang

aws.ecx:Communicating with AWS EC2 and ECS using AWS REST APIs

Providing the functions for communicating with Amazon Web Services(AWS) Elastic Compute Cloud(EC2) and Elastic Container Service(ECS). The functions will have the prefix 'ecs_' or 'ec2_' depending on the class of the API. The request will be sent via the REST API and the parameters are given by the function argument. The credentials can be set via 'aws_set_credentials'. The EC2 documentation can be found at <https://docs.aws.amazon.com/AWSEC2/latest/APIReference/Welcome.html> and ECS can be found at <https://docs.aws.amazon.com/AmazonECS/latest/APIReference/Welcome.html>.

Maintained by Jiefei Wang. Last updated 3 years ago.

ec2 ecs ecs-functions

10.7 match 1 stars 4.18 score 2 scripts

matsuurakentaro

RLescalation:Optimal Dose Escalation Using Deep Reinforcement Learning

An implementation to compute an optimal dose escalation rule using deep reinforcement learning in phase I oncology trials (Matsuura et al. (2023) <doi:10.1080/10543406.2023.2170402>). The dose escalation rule can directly optimize the percentages of correct selection (PCS) of the maximum tolerated dose (MTD).

Maintained by Kentaro Matsuura. Last updated 1 months ago.

10.6 match 4.18 score

etiennebacher

astgrepr:Parse and Manipulate R Code

Parsing R code is key to build tools such as linters and stylers. This package provides a binding to the Rust crate 'ast-grep' so that one can parse and explore R code.

Maintained by Etienne Bacher. Last updated 1 months ago.

rust cargo

7.8 match 22 stars 5.66 score 4 scripts 1 dependents

anhoej

qicharts2:Quality Improvement Charts

Functions for making run charts, Shewhart control charts and Pareto charts for continuous quality improvement. Included control charts are: I, MR, Xbar, S, T, C, U, U', P, P', and G charts. Non-random variation in the form of minor to moderate persistent shifts in data over time is identified by the Anhoej rules for unusually long runs and unusually few crossing [Anhoej, Olesen (2014) <doi:10.1371/journal.pone.0113825>]. Non-random variation in the form of larger, possibly transient, shifts is identified by Shewhart's 3-sigma rule [Mohammed, Worthington, Woodall (2008) <doi:10.1136/qshc.2004.012047>].

Maintained by Jacob Anhoej. Last updated 1 months ago.

4.9 match 39 stars 9.04 score 122 scripts 2 dependents

mhahsler

arulesNBMiner:Mining NB-Frequent Itemsets and NB-Precise Rules

NBMiner is an implementation of the model-based mining algorithm for mining NB-frequent itemsets and NB-precise rules. Michael Hahsler (2006) <doi:10.1007/s10618-005-0026-2>.

Maintained by Michael Hahsler. Last updated 3 years ago.

association-rules openjdk

12.6 match 6 stars 3.48 score 10 scripts

functionaldata

fdapace:Functional Data Analysis and Empirical Dynamics

A versatile package that provides implementation of various methods of Functional Data Analysis (FDA) and Empirical Dynamics. The core of this package is Functional Principal Component Analysis (FPCA), a key technique for functional data analysis, for sparsely or densely sampled random trajectories and time courses, via the Principal Analysis by Conditional Estimation (PACE) algorithm. This core algorithm yields covariance and mean functions, eigenfunctions and principal component (scores), for both functional data and derivatives, for both dense (functional) and sparse (longitudinal) sampling designs. For sparse designs, it provides fitted continuous trajectories with confidence bands, even for subjects with very few longitudinal observations. PACE is a viable and flexible alternative to random effects modeling of longitudinal data. There is also a Matlab version (PACE) that contains some methods not available on fdapace and vice versa. Updates to fdapace were supported by grants from NIH Echo and NSF DMS-1712864 and DMS-2014626. Please cite our package if you use it (You may run the command citation("fdapace") to get the citation format and bibtex entry). References: Wang, J.L., Chiou, J., Müller, H.G. (2016) <doi:10.1146/annurev-statistics-041715-033624>; Chen, K., Zhang, X., Petersen, A., Müller, H.G. (2017) <doi:10.1007/s12561-015-9137-5>.

Maintained by Yidong Zhou. Last updated 9 months ago.

cpp

3.7 match 31 stars 11.54 score 474 scripts 25 dependents

billdenney

PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis

Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.

Maintained by Bill Denney. Last updated 19 days ago.

nca noncompartmental-analysis pharmacokinetics

3.3 match 73 stars 12.61 score 214 scripts 4 dependents

inbo

checklist:A Thorough and Strict Set of Checks for R Packages and Source Code

An opinionated set of rules for R packages and R source code projects.

Maintained by Thierry Onkelinx. Last updated 29 days ago.

checklist continuous-integration continuous-testing quality-assurance

5.8 match 19 stars 7.24 score 21 scripts 2 dependents

jamesramsay5

fda:Functional Data Analysis

These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files working many examples including all but one of the 76 figures in this latter book. Matlab versions are available by ftp from <https://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/>.

Maintained by James Ramsay. Last updated 4 months ago.

3.4 match 3 stars 12.29 score 2.0k scripts 143 dependents

agvico

SDEFSR:Subgroup Discovery with Evolutionary Fuzzy Systems

Implementation of evolutionary fuzzy systems for the data mining task called "subgroup discovery". In particular, the algorithms presented in this package are: M. J. del Jesus, P. Gonzalez, F. Herrera, M. Mesonero (2007) <doi:10.1109/TFUZZ.2006.890662> M. J. del Jesus, P. Gonzalez, F. Herrera (2007) <doi:10.1109/MCDM.2007.369416> C. J. Carmona, P. Gonzalez, M. J. del Jesus, F. Herrera (2010) <doi:10.1109/TFUZZ.2010.2060200> C. J. Carmona, V. Ruiz-Rodado, M. J. del Jesus, A. Weber, M. Grootveld, P. González, D. Elizondo (2015) <doi:10.1016/j.ins.2014.11.030> It also provide a Shiny App to ease the analysis. The algorithms work with data sets provided in KEEL, ARFF and CSV format and also with data.frame objects.

Maintained by Angel M. Garcia. Last updated 4 years ago.

16.5 match 2.53 score 34 scripts

rudeboybert

fivethirtyeight:Data and Code Behind the Stories and Interactives at 'FiveThirtyEight'

Datasets and code published by the data journalism website 'FiveThirtyEight' available at <https://github.com/fivethirtyeight/data>. Note that while we received guidance from editors at 'FiveThirtyEight', this package is not officially published by 'FiveThirtyEight'.

Maintained by Albert Y. Kim. Last updated 2 years ago.

data-science datajournalism fivethirtyeight statistics

3.8 match 453 stars 10.98 score 1.7k scripts

red-list-ecosystem

redlistr:Tools for the IUCN Red List of Ecosystems and Species

A toolbox created by members of the International Union for Conservation of Nature (IUCN) Red List of Ecosystems Committee for Scientific Standards. Primarily, it is a set of tools suitable for calculating the metrics required for making assessments of species and ecosystems against the IUCN Red List of Threatened Species and the IUCN Red List of Ecosystems categories and criteria. See the IUCN website for detailed guidelines, the criteria, publications and other information.

Maintained by Calvin Lee. Last updated 1 years ago.

6.4 match 32 stars 6.35 score 35 scripts

benrwoodard

adobeanalyticsr:R Client for 'Adobe Analytics' API 2.0

Connect to the 'Adobe Analytics' API v2.0 <https://github.com/AdobeDocs/analytics-2.0-apis> which powers 'Analysis Workspace'. The package was developed with the analyst in mind, and it will continue to be developed with the guiding principles of iterative, repeatable, timely analysis.

Maintained by Ben Woodard. Last updated 2 months ago.

5.8 match 18 stars 7.02 score 39 scripts

yayayaoyaoyao

RARtrials:Response-Adaptive Randomization in Clinical Trials

Some response-adaptive randomization methods commonly found in literature are included in this package. These methods include the randomized play-the-winner rule for binary endpoint (Wei and Durham (1978) <doi:10.2307/2286290>), the doubly adaptive biased coin design with minimal variance strategy for binary endpoint (Atkinson and Biswas (2013) <doi:10.1201/b16101>, Rosenberger and Lachin (2015) <doi:10.1002/9781118742112>) and maximal power strategy targeting Neyman allocation for binary endpoint (Tymofyeyev, Rosenberger, and Hu (2007) <doi:10.1198/016214506000000906>) and RSIHR allocation with each letter representing the first character of the names of the individuals who first proposed this rule (Youngsook and Hu (2010) <doi:10.1198/sbr.2009.0056>, Bello and Sabo (2016) <doi:10.1080/00949655.2015.1114116>), A-optimal Allocation for continuous endpoint (Sverdlov and Rosenberger (2013) <doi:10.1080/15598608.2013.783726>), Aa-optimal Allocation for continuous endpoint (Sverdlov and Rosenberger (2013) <doi:10.1080/15598608.2013.783726>), generalized RSIHR allocation for continuous endpoint (Atkinson and Biswas (2013) <doi:10.1201/b16101>), Bayesian response-adaptive randomization with a control group using the Thall \& Wathen method for binary and continuous endpoints (Thall and Wathen (2007) <doi:10.1016/j.ejca.2007.01.006>) and the forward-looking Gittins index rule for binary and continuous endpoints (Villar, Wason, and Bowden (2015) <doi:10.1111/biom.12337>, Williamson and Villar (2019) <doi:10.1111/biom.13119>).

Maintained by Chuyao Xu. Last updated 3 months ago.

8.7 match 4.65 score

mclements

rstpm2:Smooth Survival Models, Including Generalized Survival Models

R implementation of generalized survival models (GSMs), smooth accelerated failure time (AFT) models and Markov multi-state models. For the GSMs, g(S(t|x))=eta(t,x) for a link function g, survival S at time t with covariates x and a linear predictor eta(t,x). The main assumption is that the time effect(s) are smooth <doi:10.1177/0962280216664760>. For fully parametric models with natural splines, this re-implements Stata's 'stpm2' function, which are flexible parametric survival models developed by Royston and colleagues. We have extended the parametric models to include any smooth parametric smoothers for time. We have also extended the model to include any smooth penalized smoothers from the 'mgcv' package, using penalized likelihood. These models include left truncation, right censoring, interval censoring, gamma frailties and normal random effects <doi:10.1002/sim.7451>, and copulas. For the smooth AFTs, S(t|x) = S_0(t*eta(t,x)), where the baseline survival function S_0(t)=exp(-exp(eta_0(t))) is modelled for natural splines for eta_0, and the time-dependent cumulative acceleration factor eta(t,x)=\int_0^t exp(eta_1(u,x)) du for log acceleration factor eta_1(u,x). The Markov multi-state models allow for a range of models with smooth transitions to predict transition probabilities, length of stay, utilities and costs, with differences, ratios and standardisation.

Maintained by Mark Clements. Last updated 5 months ago.

fortran openblas cpp

3.6 match 28 stars 11.01 score 137 scripts 50 dependents

harrelfe

Hmisc:Harrell Miscellaneous

Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.

Maintained by Frank E Harrell Jr. Last updated 3 days ago.

fortran

2.3 match 210 stars 17.61 score 17k scripts 750 dependents

transbioinfolab

ranktreeEnsemble:Ensemble Models of Rank-Based Trees with Extracted Decision Rules

Fast computing an ensemble of rank-based trees via boosting or random forest on binary and multi-class problems. It converts continuous gene expression profiles into ranked gene pairs, for which the variable importance indices are computed and adopted for dimension reduction. Decision rules can be extracted from trees.

Maintained by Min Lu. Last updated 10 months ago.

cpp

14.5 match 2.70 score 4 scripts

openintrostat

openintro:Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs

Supplemental functions and data for 'OpenIntro' resources, which includes open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of Windows operating system.

Maintained by Mine Çetinkaya-Rundel. Last updated 3 months ago.

data openintro

3.4 match 240 stars 11.39 score 6.0k scripts

stephenmilborrow

rpart.plot:Plot 'rpart' Models: An Enhanced Version of 'plot.rpart'

Plot 'rpart' models. Extends plot.rpart() and text.rpart() in the 'rpart' package.

Maintained by Stephen Milborrow. Last updated 1 years ago.

4.0 match 5 stars 9.64 score 12k scripts 42 dependents

veseshan

clinfun:Clinical Trial Design and Data Analysis Functions

Utilities to make your clinical collaborations easier if not fun. It contains functions for designing studies such as Simon 2-stage and group sequential designs and for data analysis such as Jonckheere-Terpstra test and estimating survival quantiles.

Maintained by Venkatraman E. Seshan. Last updated 1 years ago.

fortran

4.9 match 5 stars 7.86 score 124 scripts 8 dependents

michaellli

evalITR:Evaluating Individualized Treatment Rules

Provides various statistical methods for evaluating Individualized Treatment Rules under randomized data. The provided metrics include Population Average Value (PAV), Population Average Prescription Effect (PAPE), Area Under Prescription Effect Curve (AUPEC). It also provides the tools to analyze Individualized Treatment Rules under budget constraints. Detailed reference in Imai and Li (2019) <arXiv:1905.05389>.

Maintained by Michael Lingzhi Li. Last updated 2 years ago.

5.7 match 14 stars 6.78 score 36 scripts

oxfordihtm

codeditr:Implementing Cause-of-Death Data Checks Based on the WHO CoDEdit Tool

The World Health Organization's CoDEdit electronic tool is intended to help producers of cause-of-death statistics in strengthening their capacity to perform routine checks on their data. This package ports the original tool built using Microsoft Access into R so as to leverage the utility and function of the original tool into a usable application program interface that can be used for building more universal tools or for creating programmatic scientific workflows aimed at routine, automated, and large-scale monitoring of cause-of-death data.

Maintained by Ernest Guevarra. Last updated 4 months ago.

cod icd

8.3 match 3 stars 4.65 score 6 scripts

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 years ago.

4.7 match 3 stars 8.20 score 7.8k scripts 11 dependents

epiforecasts

scoringutils:Utilities for Scoring and Assessing Predictions

Facilitate the evaluation of forecasts in a convenient framework based on data.table. It allows user to to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.

Maintained by Nikos Bosse. Last updated 16 days ago.

forecast-evaluation forecasting

3.3 match 52 stars 11.37 score 326 scripts 7 dependents

davidcarayon

IDEATools:Individual and Group Farm Sustainability Assessments using the IDEA4 Method

Collection of tools to automate the processing of data collected though the IDEA4 method (see Zahm et al. (2018) <doi:10.1051/cagri/2019004> ). Starting from the original data collecting files this packages provides functions to compute IDEA indicators, draw modern and aesthetic plots, and produce a wide range of reporting materials.

Maintained by David Carayon. Last updated 9 months ago.

8.3 match 1 stars 4.59 score 26 scripts

coffeemuggler

sandbox:Probabilistic Numerical Modelling of Sediment Properties

A flexible framework for definition and application of time/depth- based rules for sets of parameters for single grains that can be used to create artificial sediment profiles. Such profiles can be used for virtual sample preparation and synthetic, for instance, luminescence measurements.

Maintained by Michael Dietze. Last updated 8 months ago.

9.2 match 1 stars 4.09 score 248 scripts

bioc

rsbml:R support for SBML, using libsbml

Links R to libsbml for SBML parsing, validating output, provides an S4 SBML DOM, converts SBML to R graph objects. Optionally links to the SBML ODE Solver Library (SOSLib) for simulating models.

Maintained by Michael Lawrence. Last updated 21 days ago.

graphandnetwork pathways network libsbml cpp

8.0 match 4.71 score 19 scripts 1 dependents

pharmaverse

admiral:ADaM in R Asset Library

A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).

Maintained by Ben Straub. Last updated 11 hours ago.

cdisc clinical-trials open-source

2.7 match 238 stars 13.95 score 486 scripts 4 dependents

kliegr

qCBA:Postprocessing of Rule Classification Models Learnt on Quantized Data

Implements the Quantitative Classification-based on Association Rules (QCBA) algorithm (<doi:10.1007/s10489-022-04370-x>). QCBA postprocesses rule classification models making them typically smaller and in some cases more accurate. Supported are 'CBA' implementations from 'rCBA', 'arulesCBA' and 'arc' packages, and 'CPAR', 'CMAR', 'FOIL2' and 'PRM' implementations from 'arulesCBA' package and 'SBRL' implementation from the 'sbrl' package. The result of the post-processing is an ordered CBA-like rule list.

Maintained by Tomáš Kliegr. Last updated 6 months ago.

openjdk

8.7 match 11 stars 4.30 score 12 scripts

lme4

lme4:Linear Mixed-Effects Models using 'Eigen' and S4

Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".

Maintained by Ben Bolker. Last updated 6 days ago.

cpp

1.8 match 647 stars 20.69 score 35k scripts 1.5k dependents

drizopoulos

cvGEE:Cross-Validated Predictions from GEE

Calculates predictions from generalized estimating equations and internally cross-validates them using the logarithmic, quadratic and spherical proper scoring rules; Kung-Yee Liang and Scott L. Zeger (1986) <doi:10.1093/biomet/73.1.13>.

Maintained by Dimitris Rizopoulos. Last updated 6 years ago.

8.9 match 3 stars 4.18 score

nikolett0203

RulesTools:Preparing, Analyzing, and Visualizing Association Rules

Streamlines data preprocessing, analysis, and visualization for association rule mining. Designed to work with the 'arules' package, features include discretizing data frames, generating rule set intersections, and visualizing rules with heatmaps and Euler diagrams. 'RulesTools' also includes a dataset on Brook trout detection from Nolan et al. (2022) <doi:10.1007/s13412-022-00800-x>.

Maintained by Nikolett Toth. Last updated 2 months ago.

9.4 match 3.93 score

data-cleaning

validatesuggest:Generate Suggestions for Validation Rules

Generate suggestions for validation rules from a reference data set, which can be used as a starting point for domain specific rules to be checked with package 'validate'.

Maintained by Edwin de Jonge. Last updated 1 years ago.

data-cleaning validation

8.2 match 5 stars 4.40 score 5 scripts

ohdsi

CohortGenerator:Cohort Generation for the OMOP Common Data Model

Generate cohorts and subsets using an Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) Database. Cohorts are defined using 'CIRCE' (<https://github.com/ohdsi/circe-be>) or SQL compatible with 'SqlRender' (<https://github.com/OHDSI/SqlRender>).

Maintained by Anthony Sena. Last updated 6 months ago.

hades openjdk

4.5 match 13 stars 7.91 score 165 scripts

talegari

tidyrules:Utilities to Retrieve Rulelists from Model Fits, Filter, Prune, Reorder and Predict on Unseen Data

Provides a framework to work with decision rules. Rules can be extracted from supported models, augmented with (custom) metrics using validation data, manipulated using standard dataframe operations, reordered and pruned based on a metric, predict on unseen (test) data. Utilities include; Creating a rulelist manually, Exporting a rulelist as a SQL case statement and so on. The package offers two classes; rulelist and ruleset based on dataframe.

Maintained by Srikanth Komala Sheshachala. Last updated 1 months ago.

7.5 match 11 stars 4.75 score 17 scripts

cran

timeDate:Rmetrics - Chronological and Calendar Objects

The 'timeDate' class fulfils the conventions of the ISO 8601 standard as well as of the ANSI C and POSIX standards. Beyond these standards it provides the "Financial Center" concept which allows to handle data records collected in different time zones and mix them up to have always the proper time stamps with respect to your personal financial center, or alternatively to the GMT reference time. It can thus also handle time stamps from historical data records from the same time zone, even if the financial centers changed day light saving times at different calendar dates.

Maintained by Georgi N. Boshnakov. Last updated 6 months ago.

3.8 match 1 stars 9.25 score 944 scripts 708 dependents

andyliaw-mrk

locfit:Local Regression, Likelihood and Density Estimation

Local regression, likelihood and density estimation methods as described in the 1999 book by Loader.

Maintained by Andy Liaw. Last updated 14 days ago.

3.6 match 1 stars 9.40 score 428 scripts 606 dependents

dyfanjones

sagemaker.common:R6sagemaker lower level api calls

`R6sagemaker` lower level api calls.

Maintained by Dyfan Jones. Last updated 3 years ago.

amazon-sagemaker aws sagemaker sdk

12.1 match 2.78 score 4 dependents

rstudio

shinyvalidate:Input Validation for Shiny Apps

Improves the user experience of Shiny apps by helping to provide feedback when required inputs are missing, or input values are not valid.

Maintained by Carson Sievert. Last updated 1 years ago.

shiny ui validation

3.7 match 112 stars 9.10 score 316 scripts 13 dependents

medianasoft

MedianaDesigner:Power and Sample Size Calculations for Clinical Trials

Efficient simulation-based power and sample size calculations are supported for a broad class of late-stage clinical trials. The following modules are included in the package: Adaptive designs with data-driven sample size or event count re-estimation, Adaptive designs with data-driven treatment selection, Adaptive designs with data-driven population selection, Optimal selection of a futility stopping rule, Event prediction in event-driven trials, Adaptive trials with response-adaptive randomization (experimental module), Traditional trials with multiple objectives (experimental module). Traditional trials with cluster-randomized designs (experimental module).

Maintained by Alex Dmitrienko. Last updated 2 years ago.

cpp

8.8 match 20 stars 3.79 score 31 scripts

ai4ci

interfacer:Define and Enforce Contracts for Dataframes as Function Parameters

A dataframe validation framework for package builders who use dataframes as function parameters. It performs checks on column names, coerces data-types, and checks grouping to make sure user inputs conform to a specification provided by the package author. It provides a mechanism for package authors to automatically document supported dataframe inputs and selectively dispatch to functions depending on the format of a dataframe much like S3 does for classes. It also contains some developer tools to make working with and documenting dataframe specifications easier. It helps package developers to improve their documentation and simplifies parameter validation where dataframes are used as function parameters.

Maintained by Robert Challen. Last updated 1 months ago.

5.2 match 2 stars 6.43 score 2 dependents

weiserc

mvQuad:Methods for Multivariate Quadrature

Provides methods to construct multivariate grids, which can be used for multivariate quadrature. This grids can be based on different quadrature rules like Newton-Cotes formulas (trapezoidal-, Simpson's- rule, ...) or Gauss quadrature (Gauss-Hermite, Gauss-Legendre, ...). For the construction of the multidimensional grid the product-rule or the combination- technique can be applied.

Maintained by Constantin Weiser. Last updated 9 years ago.

5.4 match 3 stars 6.15 score 45 scripts 7 dependents

kcf-jackson

sketch:Interactive Sketches

Creates static / animated / interactive visualisations embeddable in R Markdown documents. It implements an R-to-JavaScript transpiler and enables users to write JavaScript applications using the syntax of R.

Maintained by Chun Fung Kwok. Last updated 1 years ago.

javascript js transpiler visualisation

5.8 match 125 stars 5.70 score 27 scripts

microsoft

wpa:Tools for Analysing and Visualising Viva Insights Data

Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (i) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with the beginner-to-intermediate R users in mind, and is optimised for simplicity.

Maintained by Martin Chan. Last updated 4 months ago.

workplace-analytics

4.9 match 30 stars 6.68 score 39 scripts 1 dependents

easystats

performance:Assessment of Regression Models Performance

Utilities for computing measures to assess model quality, which are not directly provided by R's 'base' or 'stats' packages. These include e.g. measures like r-squared, intraclass correlation coefficient (Nakagawa, Johnson & Schielzeth (2017) <doi:10.1098/rsif.2017.0213>), root mean squared error or functions to check models for overdispersion, singularity or zero-inflation and more. Functions apply to a large variety of regression models, including generalized linear models, mixed effects models and Bayesian models. References: Lüdecke et al. (2021) <doi:10.21105/joss.03139>.

Maintained by Daniel Lüdecke. Last updated 2 days ago.

aic easystats hacktoberfest loo machine-learning mixed-models models performance r2 statistics

2.0 match 1.1k stars 16.18 score 4.3k scripts 47 dependents

bioc

S4Vectors:Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantic of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Maintained by Hervé Pagès. Last updated 1 months ago.

infrastructure datarepresentation bioconductor-package core-package

2.0 match 18 stars 16.05 score 1.0k scripts 1.9k dependents

awblocker

fastGHQuad:Fast 'Rcpp' Implementation of Gauss-Hermite Quadrature

Fast, numerically-stable Gauss-Hermite quadrature rules and utility functions for adaptive GH quadrature. See Liu, Q. and Pierce, D. A. (1994) <doi:10.2307/2337136> for a reference on these methods.

Maintained by Alexander W Blocker. Last updated 3 years ago.

openblas cpp

4.0 match 9 stars 8.03 score 57 scripts 69 dependents

jokergoo

sfcurve:2x2, 3x3 and Nxn Space-Filling Curves

Implementation of all possible forms of 2x2 and 3x3 space-filling curves, i.e., the generalized forms of the Hilbert curve <https://en.wikipedia.org/wiki/Hilbert_curve>, the Peano curve <https://en.wikipedia.org/wiki/Peano_curve> and the Peano curve in the meander type (Figure 5 in <https://eudml.org/doc/141086>). It can generates nxn curves expanded from any specific level-1 units. It also implements the H-curve and the three-dimensional Hilbert curve.

Maintained by Zuguang Gu. Last updated 6 months ago.

cpp

6.8 match 1 stars 4.66 score 13 scripts

cumulocity-iot

pmml:Generate PMML for Various Models

The Predictive Model Markup Language (PMML) is an XML-based language which provides a way for applications to define machine learning, statistical and data mining models and to share models between PMML compliant applications. More information about the PMML industry standard and the Data Mining Group can be found at <http://dmg.org/>. The generated PMML can be imported into any PMML consuming application, such as Zementis Predictive Analytics products. The package isofor (used for anomaly detection) can be installed with devtools::install_github("gravesee/isofor").

Maintained by Dmitriy Bolotov. Last updated 3 years ago.

machine-learning pmml zementis

3.9 match 20 stars 7.98 score 560 scripts 1 dependents

ropensci

pixelclasser:Classifies Image Pixels by Colour

Contains functions to classify the pixels of an image file (jpeg or tiff) by its colour. It implements a simple form of the techniques known as Support Vector Machine adapted to this particular problem.

Maintained by Carlos Real. Last updated 4 years ago.

10.3 match 2 stars 3.00 score 8 scripts

amices

mice:Multivariate Imputation by Chained Equations

Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.

Maintained by Stef van Buuren. Last updated 9 days ago.

chained-equations fcs imputation mice missing-data missing-values multiple-imputation multivariate-data cpp

1.9 match 462 stars 16.50 score 10k scripts 154 dependents

pi-kappa-devel

markets:Estimation Methods for Markets in Equilibrium and Disequilibrium

Provides estimation methods for markets in equilibrium and disequilibrium. Supports the estimation of an equilibrium and four disequilibrium models with both correlated and independent shocks. Also provides post-estimation analysis tools, such as aggregation, marginal effect, and shortage calculations. See Karapanagiotis (2024) <doi:10.18637/jss.v108.i02> for an overview of the functionality and examples. The estimation methods are based on full information maximum likelihood techniques given in Maddala and Nelson (1974) <doi:10.2307/1914215>. They are implemented using the analytic derivative expressions calculated in Karapanagiotis (2020) <doi:10.2139/ssrn.3525622>. Standard errors can be estimated by adjusting for heteroscedasticity or clustering. The equilibrium estimation constitutes a case of a system of linear, simultaneous equations. Instead, the disequilibrium models replace the market-clearing condition with a non-linear, short-side rule and allow for different specifications of price dynamics.

Maintained by Pantelis Karapanagiotis. Last updated 1 years ago.

disequilibrium economics finance full-information-maximum-likelihood market-clearing market-models short-side-rule cpp

7.2 match 1 stars 4.30 score 9 scripts

davidgohel

flextable:Functions for Tabular Reporting

Use a grammar for creating and customizing pretty tables. The following formats are supported: 'HTML', 'PDF', 'RTF', 'Microsoft Word', 'Microsoft PowerPoint' and R 'Grid Graphics'. 'R Markdown', 'Quarto' and the package 'officer' can be used to produce the result files. The syntax is the same for the user regardless of the type of output to be produced. A set of functions allows the creation, definition of cell arrangement, addition of headers or footers, formatting and definition of cell content with text and or images. The package also offers a set of high-level functions that allow tabular reporting of statistical models and the creation of complex cross tabulations.

Maintained by David Gohel. Last updated 1 months ago.

docx html5 ms-office-documents rmarkdown table

1.8 match 583 stars 17.04 score 7.3k scripts 119 dependents

r-forge

surveillance:Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena

Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hoehle and Paul (2008) <doi:10.1016/j.csda.2008.02.015>. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) <doi:10.18637/jss.v070.i10>. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) <doi:10.1002/sim.4177> and Meyer and Held (2014) <doi:10.1214/14-AOAS743>. twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g, epidemics across farms or networks, as a multivariate point process as proposed by Hoehle (2009) <doi:10.1002/bimj.200900050>. twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) <doi:10.1111/j.1541-0420.2011.01684.x>. A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) <doi:10.18637/jss.v077.i11>.

Maintained by Sebastian Meyer. Last updated 23 hours ago.

cpp

3.5 match 2 stars 8.83 score 446 scripts 3 dependents

cran

mipplot:An Open-Source Tool for Visualization of Climate Mitigation Scenarios

Generic functions to produce area/bar/box/line plots of data following IAMC (Integrated Assessment Modeling Consortium) submission format.

Maintained by Akimitsu Inoue. Last updated 4 years ago.

11.3 match 2.70 score

gjwgit

rattle:Graphical User Interface for Data Science in R

The R Analytic Tool To Learn Easily (Rattle) provides a collection of utilities functions for the data scientist. A Gnome (RGtk2) based graphical interface is included with the aim to provide a simple and intuitive introduction to R for data science, allowing a user to quickly load data from a CSV file (or via ODBC), transform and explore the data, build and evaluate models, and export models as PMML (predictive modelling markup language) or as scores. A key aspect of the GUI is that all R commands are logged and commented through the log tab. This can be saved as a standalone R script file and as an aid for the user to learn R or to copy-and-paste directly into R itself. Note that RGtk2 and cairoDevice have been archived on CRAN. See <https://rattle.togaware.com> for installation instructions.

Maintained by Graham Williams. Last updated 3 years ago.

3.5 match 16 stars 8.48 score 3.0k scripts 3 dependents

tanaylab

misha:Toolkit for Analysis of Genomic Data

A toolkit for analysis of genomic data. The 'misha' package implements an efficient data structure for storing genomic data, and provides a set of functions for data extraction, manipulation and analysis. Some of the 2D genome algorithms were described in Yaffe and Tanay (2011) <doi:10.1038/ng.947>.

Maintained by Aviezer Lifshitz. Last updated 8 days ago.

genomic-data-analysis cpp

5.1 match 4 stars 5.86 score

lgnbhl

xlcharts:Create Native 'Excel' Charts and Work with Microsoft 'Excel' Files

An R interface to the 'OpenPyXL' 'Python' library to create native 'Excel' charts and work with Microsoft 'Excel' files.

Maintained by Felix Luginbuhl. Last updated 14 days ago.

excel

6.6 match 12 stars 4.48 score 4 scripts

rdatatable

data.table:Extension of `data.frame`

Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.

Maintained by Tyson Barrett. Last updated 2 days ago.

1.3 match 3.7k stars 23.53 score 230k scripts 4.6k dependents

mllg

checkmate:Fast and Versatile Argument Checks

Tests and assertions to perform frequent argument checks. A substantial part of the package was written in C to minimize any worries about execution time overhead.

Maintained by Michel Lang. Last updated 8 months ago.

assertions testthat

1.8 match 276 stars 16.28 score 1.5k scripts 1.9k dependents

fishfollower

stockassessment:State-Space Assessment Model

Fitting SAM...

Maintained by Anders Nielsen. Last updated 16 days ago.

stockassessment cpp

3.8 match 49 stars 7.76 score 324 scripts 2 dependents

nsaph-software

CRE:Interpretable Discovery and Inference of Heterogeneous Treatment Effects

Provides a new method for interpretable heterogeneous treatment effects characterization in terms of decision rules via an extensive exploration of heterogeneity patterns by an ensemble-of-trees approach, enforcing high stability in the discovery. It relies on a two-stage pseudo-outcome regression, and it is supported by theoretical convergence guarantees. Bargagli-Stoffi, F. J., Cadei, R., Lee, K., & Dominici, F. (2023) Causal rule ensemble: Interpretable Discovery and Inference of Heterogeneous Treatment Effects. arXiv preprint <doi:10.48550/arXiv.2009.09036>.

Maintained by Falco Joannes Bargagli Stoffi. Last updated 5 months ago.

4.5 match 13 stars 6.41 score 11 scripts

tidyverse

tibble:Simple Data Frames

Provides a 'tbl_df' class (the 'tibble') with stricter checking and better formatting than the traditional data frame.

Maintained by Kirill Müller. Last updated 1 hours ago.

tidy-data

1.3 match 693 stars 22.85 score 47k scripts 11k dependents

ropensci

targets:Dynamic Function-Oriented 'Make'-Like Declarative Pipelines

Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).

Maintained by William Michael Landau. Last updated 2 days ago.

data-science high-performance-computing make peer-reviewed pipeline r-targetopia reproducibility reproducible-research targets workflow

1.9 match 975 stars 15.18 score 4.6k scripts 22 dependents

edsandorf

obfuscatoR:Obfuscation Game Designs

When people make decisions, they may do so using a wide variety of decision rules. The package allows users to easily create obfuscation games to test the obfuscation hypothesis. It provides an easy to use interface and multiple options designed to vary the difficulty of the game and tailor it to the user's needs. For more detail: Chorus et al., 2021, Obfuscation maximization-based decision-making: Theory, methodology and first empirical evidence, Mathematical Social Sciences, 109, 28-44, <doi:10.1016/j.mathsocsci.2020.10.002>.

Maintained by Erlend Dancke Sandorf. Last updated 2 years ago.

obfuscator

7.0 match 2 stars 4.00 score 5 scripts

flr

FLRef:Reference point computation for advice rules

Blah

Maintained by Henning Winker. Last updated 11 days ago.

8.1 match 3 stars 3.45 score 11 scripts

data-cleaning

editrules:Parsing, Applying, and Manipulating Data Cleaning Rules

Please note: active development has moved to packages 'validate' and 'errorlocate'. Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like format). Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules and erroneous fields can be found based on Fellegi and Holt's generalized principle. Rules dependencies can be visualized with using the 'igraph' package.

Maintained by Edwin de Jonge. Last updated 9 months ago.

4.0 match 22 stars 6.97 score 140 scripts 1 dependents

cran

DevTreatRules:Develop Treatment Rules with Observational Data

Develop and evaluate treatment rules based on: (1) the standard indirect approach of split-regression, which fits regressions separately in both treatment groups and assigns an individual to the treatment option under which predicted outcome is more desirable; (2) the direct approach of outcome-weighted-learning proposed by Yingqi Zhao, Donglin Zeng, A. John Rush, and Michael Kosorok (2012) <doi:10.1080/01621459.2012.695674>; (3) the direct approach, which we refer to as direct-interactions, proposed by Shuai Chen, Lu Tian, Tianxi Cai, and Menggang Yu (2017) <doi:10.1111/biom.12676>. Please see the vignette for a walk-through of how to start with an observational dataset whose design is understood scientifically and end up with a treatment rule that is trustworthy statistically, along with an estimation of rule benefit in an independent sample.

Maintained by Jeremy Roth. Last updated 5 years ago.

13.7 match 2.00 score

csafe-isu

cmcR:An Implementation of the 'Congruent Matching Cells' Method

An open-source implementation of the 'Congruent Matching Cells' method for cartridge case identification as proposed by Song (2013) <https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=911193> as well as an extension of the method proposed by Tong et al. (2015) <doi:10.6028/jres.120.008>. Provides a wide range of pre, inter, and post-processing options when working with cartridge case scan data and their associated comparisons. See the cmcR package website for more details and examples.

Maintained by Joe Zemmels. Last updated 2 years ago.

4.7 match 4 stars 5.68 score 60 scripts

missvalteam

Iscores:Proper Scoring Rules for Missing Value Imputation

Implementation of a KL-based scoring rule to assess the quality of different missing value imputations in the broad sense as introduced in Michel et al. (2021) <arXiv:2106.03742>.

Maintained by Loris Michel. Last updated 2 years ago.

imputation-methods machine-learning missing-values random-forest

6.9 match 7 stars 3.91 score 23 scripts

philipppro

measures:Performance Measures for Statistical Learning

Provides the biggest amount of statistical measures in the whole R world. Includes measures of regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programed by several 'mlr' developers.

Maintained by Philipp Probst. Last updated 4 years ago.

6.0 match 1 stars 4.47 score 88 scripts 2 dependents

dashaub

supervisedPRIM:Supervised Classification Learning and Prediction using Patient Rule Induction Method (PRIM)

The Patient Rule Induction Method (PRIM) is typically used for "bump hunting" data mining to identify regions with abnormally high concentrations of data with large or small values. This package extends this methodology so that it can be applied to binary classification problems and used for prediction.

Maintained by David Shaub. Last updated 8 years ago.

patient-rules-induction supervised-learning

9.9 match 1 stars 2.70 score 4 scripts

corels

corels:R Binding for the 'Certifiably Optimal RulE ListS (Corels)' Learner

The 'Certifiably Optimal RulE ListS (Corels)' learner by Angelino et al described in <doi:10.48550/arXiv.1704.01701> provides interpretable decision rules with an optimality guarantee, and is made available to R with this package. See the file 'AUTHORS' for a list of copyright holders and contributors.

Maintained by Dirk Eddelbuettel. Last updated 4 months ago.

gmp cpp

5.2 match 48 stars 5.10 score 13 scripts

rstudio

sass:Syntactically Awesome Style Sheets ('Sass')

An 'SCSS' compiler, powered by the 'LibSass' library. With this, R developers can use variables, inheritance, and functions to generate dynamic style sheets. The package uses the 'Sass CSS' extension language, which is stable, powerful, and CSS compatible.

Maintained by Carson Sievert. Last updated 11 months ago.

cpp

1.7 match 101 stars 15.56 score 252 scripts 4.3k dependents

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 9 days ago.

autograd deep-learning torch cpp

1.6 match 520 stars 16.52 score 1.4k scripts 38 dependents

tbep-tech

tbeptools:Data and Indicators for the Tampa Bay Estuary Program

Several functions are provided for working with Tampa Bay Estuary Program data and indicators, including the water quality report card, tidal creek assessments, Tampa Bay Nekton Index, Tampa Bay Benthic Index, seagrass transect data, habitat report card, and fecal indicator bacteria. Additional functions are provided for miscellaneous tasks, such as reference library curation.

Maintained by Marcus Beck. Last updated 12 days ago.

data-analysis tampa-bay tbep water-quality

3.3 match 10 stars 7.86 score 133 scripts

bastistician

polyCub:Cubature over Polygonal Domains

Numerical integration of continuously differentiable functions f(x,y) over simple closed polygonal domains. The following cubature methods are implemented: product Gauss cubature (Sommariva and Vianello, 2007, <doi:10.1007/s10543-007-0131-2>), the simple two-dimensional midpoint rule (wrapping 'spatstat.geom' functions), and adaptive cubature for radially symmetric functions via line integrate() along the polygon boundary (Meyer and Held, 2014, <doi:10.1214/14-AOAS743>, Supplement B). For simple integration along the axes, the 'cubature' package is more appropriate.

Maintained by Sebastian Meyer. Last updated 1 months ago.

numerical-integration polygons

3.5 match 10 stars 7.19 score 26 scripts 5 dependents

tidymodels

dials:Tools for Creating Tuning Parameter Values

Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.

Maintained by Hannah Frick. Last updated 1 months ago.

1.7 match 114 stars 14.31 score 426 scripts 52 dependents

profyliu

bsnsing:Bsnsing: A Decision Tree Induction Method Based on Recursive Optimal Boolean Rule Composition

The bsnsing package provides functions for training a decision tree classifier, making predictions and generating latex code for plotting. It solves the two-class and multi-class classification problems under the supervised learning paradigm. While building a decision tree, bsnsing uses a Boolean rule involving multiple variables to split a node. Each split rule is identified by solving an optimization problem. Use the bsnsing function to build a tree, the predict function to make predictions and the show function to plot the tree. The paper is at <arXiv:2205.15263>. Source code and more data sets are at <https://github.com/profyliu/bsnsing>.

Maintained by Yanchao Liu. Last updated 3 years ago.

6.9 match 7 stars 3.54 score 1 scripts

hneth

ds4psy:Data Science for Psychologists

All datasets and functions required for the examples and exercises of the book "Data Science for Psychologists" (by Hansjoerg Neth, Konstanz University, 2023), freely available at <https://bookdown.org/hneth/ds4psy/>. The book and course introduce principles and methods of data science to students of psychology and other biological or social sciences. The 'ds4psy' package primarily provides datasets, but also functions for data generation and manipulation (e.g., of text and time data) and graphics that are used in the book and its exercises. All functions included in 'ds4psy' are designed to be explicit and instructive, rather than efficient or elegant.

Maintained by Hansjoerg Neth. Last updated 1 months ago.

data-literacy data-science education exploratory-data-analysis psychology social-sciences visualisation

3.5 match 22 stars 6.79 score 70 scripts

drizopoulos

GLMMadaptive:Generalized Linear Mixed Models using Adaptive Gaussian Quadrature

Fits generalized linear mixed models for a single grouping factor under maximum likelihood approximating the integrals over the random effects with an adaptive Gaussian quadrature rule; Jose C. Pinheiro and Douglas M. Bates (1995) <doi:10.1080/10618600.1995.10474663>.

Maintained by Dimitris Rizopoulos. Last updated 9 days ago.

generalized-linear-mixed-models mixed-effects-models mixed-models

2.3 match 61 stars 10.37 score 212 scripts 5 dependents

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non exhaustive list of methods include variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 7 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

1.8 match 182 stars 13.71 score 1.3k scripts 22 dependents

hdarjus

sparvaride:Variance Identification in Sparse Factor Analysis

This is an implementation of the algorithm described in Section 3 of Hosszejni and Frühwirth-Schnatter (2022) <doi:10.48550/arXiv.2211.00671>. The algorithm is used to verify that the counting rule CR(r,1) holds for the sparsity pattern of the transpose of a factor loading matrix. As detailed in Section 2 of the same paper, if CR(r,1) holds, then the idiosyncratic variances are generically identified. If CR(r,1) does not hold, then we do not know whether the idiosyncratic variances are identified or not.

Maintained by Darjus Hosszejni. Last updated 2 years ago.

econometrics factor-analysis latent-factors parameter-identification cpp

6.5 match 1 stars 3.70 score 4 scripts

homerhanumat

tigerstats:R Functions for Elementary Statistics

A collection of data sets and functions that are useful in the teaching of statistics at an elementary level to students who may have little or no previous experience with the command line. The functions for elementary inferential procedures follow a uniform interface for user input. Some of the functions are instructional applets that can only be run on the R Studio integrated development environment with package 'manipulate' installed. Other instructional applets are Shiny apps that may be run locally. In teaching the package is used alongside of package 'mosaic', 'mosaicData' and 'abd', which are therefore listed as dependencies.

Maintained by Homer White. Last updated 4 years ago.

4.1 match 16 stars 5.77 score 327 scripts

r-lib

vctrs:Vector Helpers

Defines new notions of prototype and size that are used to provide tools for consistent and well-founded type-coercion and size-recycling, and are in turn connected to ideas of type- and size-stability useful for analysing function interfaces.

Maintained by Davis Vaughan. Last updated 5 months ago.

s3-vectors

1.3 match 290 stars 18.97 score 1.1k scripts 13k dependents

dnzmarcio

ewoc:Escalation with Overdose Control

An implementation of a variety of escalation with overdose control designs introduced by Babb, Rogatko and Zacks (1998) <doi:10.1002/(SICI)1097-0258(19980530)17:10%3C1103::AID-SIM793%3E3.0.CO;2-9>. It calculates the next dose as a clinical trial proceeds and performs simulations to obtain operating characteristics.

Maintained by Marcio A. Diniz. Last updated 3 years ago.

jags cpp

7.1 match 2 stars 3.30 score 20 scripts

bioc

MiPP:Misclassification Penalized Posterior Classification

This package finds optimal sets of genes that seperate samples into two or more classes.

Maintained by Sukwoo Kim. Last updated 5 months ago.

microarray classification

6.5 match 3.60 score 1 scripts

tazinho

snakecase:Convert Strings into any Case

A consistent, flexible and easy to use tool to parse and convert strings into cases like snake or camel among others.

Maintained by Malte Grosser. Last updated 2 years ago.

camelcase case conversion pascalcase snake-case

1.7 match 150 stars 13.99 score 744 scripts 290 dependents

j-ll

FisPro:Fuzzy Inference System Design and Optimization

Fuzzy inference systems are based on fuzzy rules, which have a good capability for managing progressive phenomenons. This package is a basic implementation of the main functions to use a Fuzzy Inference System (FIS) provided by the open source software 'FisPro' <https://www.fispro.org>. 'FisPro' allows to create fuzzy inference systems and to use them for reasoning purposes, especially for simulating a physical or biological system.

Maintained by Jean-Luc Lablée. Last updated 2 years ago.

cpp

9.3 match 2.48 score 4 scripts 1 dependents

uribo

textlintr:Natural Language Linter Tools for 'R Markdown' and R Code

What the package does (one paragraph).

Maintained by Shinya Uryu. Last updated 2 years ago.

lint natural-language-processing

7.8 match 9 stars 2.95 score 4 scripts

luca-scr

qcc:Quality Control Charts

Shewhart quality control charts for continuous, attribute and count data. Cusum and EWMA charts. Operating characteristic curves. Process capability analysis. Pareto chart and cause-and-effect chart. Multivariate control charts.

Maintained by Luca Scrucca. Last updated 2 years ago.

2.0 match 46 stars 11.29 score 730 scripts 6 dependents

cran

mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation

Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.

Maintained by Simon Wood. Last updated 1 years ago.

openblas openmp

1.8 match 32 stars 12.71 score 17k scripts 7.8k dependents

bcjaeger

table.glue:Make and Apply Customized Rounding Specifications for Tables

Translate double and integer valued data into character values formatted for tabulation in manuscripts or other types of academic reports.

Maintained by Byron Jaeger. Last updated 4 months ago.

3.8 match 7 stars 5.92 score 60 scripts

vladtarko

cellularautomata:Cellular Automata

Create cellular automata from Wolfram rules. Allows the creation of Wolfram style plots, as well as of animations. Easy to create multiple plots, for example the output of a rule with different initial states, or the output of many different rules from the same state. The output of a cellular automaton is given as a matrix, making it easy to try to explore the possibility of predicting its time evolution using various statistical tools available in R.

Maintained by Vlad Tarko. Last updated 4 months ago.

5.9 match 3.70 score 2 scripts

matthias-da

robCompositions:Compositional Data Analysis

Methods for analysis of compositional data including robust methods (<doi:10.1007/978-3-319-96422-5>), imputation of missing values (<doi:10.1016/j.csda.2009.11.023>), methods to replace rounded zeros (<doi:10.1080/02664763.2017.1410524>, <doi:10.1016/j.chemolab.2016.04.011>, <doi:10.1016/j.csda.2012.02.012>), count zeros (<doi:10.1177/1471082X14535524>), methods to deal with essential zeros (<doi:10.1080/02664763.2016.1182135>), (robust) outlier detection for compositional data, (robust) principal component analysis for compositional data, (robust) factor analysis for compositional data, (robust) discriminant analysis for compositional data (Fisher rule), robust regression with compositional predictors, functional data analysis (<doi:10.1016/j.csda.2015.07.007>) and p-splines (<doi:10.1016/j.csda.2015.07.007>), contingency (<doi:10.1080/03610926.2013.824980>) and compositional tables (<doi:10.1111/sjos.12326>, <doi:10.1111/sjos.12223>, <doi:10.1080/02664763.2013.856871>) and (robust) Anderson-Darling normality tests for compositional data as well as popular log-ratio transformations (addLR, cenLR, isomLR, and their inverse transformations). In addition, visualisation and diagnostic tools are implemented as well as high and low-level plot functions for the ternary diagram.

Maintained by Matthias Templ. Last updated 29 days ago.

cpp

2.4 match 11 stars 9.19 score 226 scripts 2 dependents

adaemmerp

lpirfs:Local Projections Impulse Response Functions

Provides functions to estimate and visualize linear as well as nonlinear impulse responses based on local projections by Jordà (2005) <doi:10.1257/0002828053828518>. The methods and the package are explained in detail in Adämmer (2019) <doi:10.32614/RJ-2019-052>.

Maintained by Philipp Adämmer. Last updated 3 days ago.

openblas cpp

3.4 match 44 stars 6.38 score 108 scripts

r-lib

tidyselect:Select from a Set of Strings

A backend for the selecting functions of the 'tidyverse'. It makes it easy to implement select-like functions in your own packages in a way that is consistent with other 'tidyverse' interfaces for selection.

Maintained by Lionel Henry. Last updated 4 months ago.

1.2 match 130 stars 18.31 score 1.9k scripts 8.2k dependents

cran

perm:Exact or Asymptotic Permutation Tests

Perform Exact or Asymptotic permutation tests [see Fay and Shaw <doi:10.18637/jss.v036.i02>].

Maintained by Michael P. Fay. Last updated 2 years ago.

5.6 match 3.79 score 9 dependents

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

1.7 match 83 stars 12.50 score 186 scripts 9 dependents

teunbrand

ggh4x:Hacks for 'ggplot2'

A 'ggplot2' extension that does a variety of little helpful things. The package extends 'ggplot2' facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. Also allows multiple colour and fill scales per plot. Also hosts a smaller collection of stats, geoms and axis guides.

Maintained by Teun van den Brand. Last updated 2 hours ago.

ggplot-extension ggplot2

1.5 match 617 stars 14.06 score 4.4k scripts 21 dependents

cliffordlai

bestglm:Best Subset GLM and Regression Utilities

Best subset glm using information criteria or cross-validation, carried by using 'leaps' algorithm (Furnival and Wilson, 1974) <doi:10.2307/1267601> or complete enumeration (Morgan and Tatar, 1972) <doi:10.1080/00401706.1972.10488918>. Implements PCR and PLS using AIC/BIC. Implements one-standard deviation rule for use with the 'caret' package.

Maintained by Yuanhao Lai. Last updated 5 years ago.

4.0 match 5.29 score 418 scripts 5 dependents

pkrog

sched:Request Scheduler

Offers classes and functions to contact web servers while enforcing scheduling rules required by the sites. The URL class makes it easy to construct a URL by providing parameters as a vector. The Request class allows to describes SOAP (Simple Object Access Protocol) or standard requests: URL, method (POST or GET), header, body. The Scheduler class controls the request frequency for each server address by mean of rules (Rule class). The RequestResult class permits to get the request status to handle error cases and the content.

Maintained by Pierrick Roger. Last updated 6 months ago.

7.6 match 2.78 score 2 scripts

coffeemuggler

caTools:Tools: Moving Window Statistics, GIF, Base64, ROC AUC, etc

Contains several basic utility functions including: moving (rolling, running) window statistic functions, read/write for GIF and ENVI binary files, fast calculation of AUC, LogitBoost classifier, base64 encoder/decoder, round-off-error-free sum and cumsum, etc.

Maintained by Michael Dietze. Last updated 7 months ago.

cpp

1.9 match 8 stars 11.17 score 9.1k scripts 566 dependents

mlr-org

mlr3tuning:Hyperparameter Optimization for 'mlr3'

Hyperparameter optimization package of the 'mlr3' ecosystem. It features highly configurable search spaces via the 'paradox' package and finds optimal hyperparameter configurations for any 'mlr3' learner. 'mlr3tuning' works with several optimization algorithms e.g. Random Search, Iterated Racing, Bayesian Optimization (in 'mlr3mbo') and Hyperband (in 'mlr3hyperband'). Moreover, it can automatically optimize learners and estimate the performance of optimized models with nested resampling.

Maintained by Marc Becker. Last updated 3 months ago.

bbotk hyperparameter-optimization hyperparameter-tuning machine-learning mlr3 optimization tune tuning

1.8 match 55 stars 11.53 score 384 scripts 11 dependents

chris31415926535

tardis:Text Analysis with Rules and Dictionaries for Inferring Sentiment

Measure text's sentiment with dictionaries and simple rules covering negations and modifiers. User-supplied dictionaries are supported, including Unicode emojis and multi-word tokens, so this package can also be used to study constructs beyond sentiment.

Maintained by Christopher Belanger. Last updated 2 years ago.

nlp sentiment-analysis cpp

5.1 match 2 stars 4.00 score 10 scripts

maialba3

LipidMS:Lipid Annotation for LC-MS/MS DDA or DIA Data

Lipid annotation in untargeted LC-MS lipidomics based on fragmentation rules. Alcoriza-Balaguer MI, Garcia-Canaveras JC, Lopez A, Conde I, Juan O, Carretero J, Lahoz A (2019) <doi:10.1021/acs.analchem.8b03409>.

Maintained by M Isabel Alcoriza-Balaguer. Last updated 7 months ago.

cpp

3.9 match 2 stars 5.33 score 12 scripts 1 dependents

andrisignorell

ModTools:Building Regression and Classification Models

Consistent user interface to the most common regression and classification algorithms, such as random forest, neural networks, C5 trees and support vector machines, complemented with a handful of auxiliary functions, such as variable importance and a tuning function for the parameters.

Maintained by Andri Signorell. Last updated 2 months ago.

4.9 match 2 stars 4.20 score 3 scripts

timobechger

dexterMST:CML and Bayesian Calibration of Multistage Tests

Conditional Maximum Likelihood Calibration and data management of multistage tests. Supports polytomous items and incomplete designs with linear as well as multistage tests. Extended Nominal Response and Interaction models, DIF and profile analysis. See Robert J. Zwitser and Gunter Maris (2015)<doi:10.1007/s11336-013-9369-6>.

Maintained by Timo Bechger. Last updated 1 years ago.

cpp openmp

6.9 match 2.93 score 17 scripts

carloshellin

LearningRlab:Statistical Learning Functions

Aids in learning statistical functions incorporating the result of calculus done with each function and how they are obtained, that is, which equation and variables are used. Also for all these equations and their related variables detailed explanations and interactive exercises are also included. All these characteristics allow to the package user to improve the learning of statistics basics by means of their use.

Maintained by Carlos Javier Hellin Asensio. Last updated 2 years ago.

5.5 match 3.64 score 44 scripts

azure

AzureVM:Virtual Machines in 'Azure'

Functionality for working with virtual machines (VMs) in Microsoft's 'Azure' cloud: <https://azure.microsoft.com/en-us/services/virtual-machines/>. Includes facilities to deploy, startup, shutdown, and cleanly delete VMs and VM clusters. Deployment configurations can be highly customised, and can make use of existing resources as well as creating new ones. A selection of predefined configurations is provided to allow easy deployment of commonly used Linux and Windows images, including Data Science Virtual Machines. With a running VM, execute scripts and install optional extensions. Part of the 'AzureR' family of packages.

Maintained by Hong Ooi. Last updated 2 years ago.

azure azure-sdk-r azure-virtual-machine data-science-virtual-machine

4.0 match 14 stars 5.05 score 16 scripts

cran

GameTheory:Cooperative Game Theory

Implementation of a common set of punctual solutions for Cooperative Game Theory.

Maintained by Sebastian Cano-Berlanga. Last updated 1 years ago.

20.1 match 1.00 score

ropensci

drake:A Pipeline Toolkit for Reproducible Computation at Scale

A general-purpose computational engine for data analysis, drake rebuilds intermediate data objects when their dependencies change, and it skips work when the results are already up to date. Not every execution starts from scratch, there is native support for parallel and distributed computing, and completed projects have tangible evidence that they are reproducible. Extensive documentation, from beginner-friendly tutorials to practical examples and more, is available at the reference website <https://docs.ropensci.org/drake/> and the online manual <https://books.ropensci.org/drake/>.

Maintained by William Michael Landau. Last updated 4 months ago.

data-science drake high-performance-computing makefile peer-reviewed pipeline reproducibility reproducible-research ropensci workflow

1.8 match 1.3k stars 11.49 score 1.7k scripts 1 dependents

maelstrom-research

Rmonize:Support Retrospective Harmonization of Data

Functions to support rigorous retrospective data harmonization processing, evaluation, and documentation across datasets from different studies based on Maelstrom Research guidelines. The package includes the core functions to evaluate and format the main inputs that define the harmonization process, apply specified processing rules to generate harmonized data, diagnose processing errors, and summarize and evaluate harmonized outputs. The main inputs that define the processing are a DataSchema (list and definitions of harmonized variables to be generated) and Data Processing Elements (processing rules to be applied to generate harmonized variables from study-specific variables). The main outputs of processing are harmonized datasets, associated metadata, and tabular and visual summary reports. As described in Maelstrom Research guidelines for rigorous retrospective data harmonization (Fortier I and al. (2017) <doi:10.1093/ije/dyw075>).

Maintained by Guillaume Fabre. Last updated 12 months ago.

3.6 match 5 stars 5.58 score 51 scripts

schochastics

networkdata:Repository of Network Datasets

The package contains a large collection of network dataset with different context. This includes social networks, animal networks and movie networks. All datasets are in 'igraph' format.

Maintained by David Schoch. Last updated 12 months ago.

dataset network-analysis

4.0 match 143 stars 5.01 score 143 scripts

spatstat

spatstat.utils:Utility Functions for 'spatstat'

Contains utility functions for the 'spatstat' family of packages which may also be useful for other purposes.

Maintained by Adrian Baddeley. Last updated 5 days ago.

spatial-analysis spatial-data spatstat

1.7 match 5 stars 11.66 score 134 scripts 248 dependents

kurthornik

RWeka:R/Weka Interface

An R interface to Weka (Version 3.9.3). Weka is a collection of machine learning algorithms for data mining tasks written in Java, containing tools for data pre-processing, classification, regression, clustering, association rules, and visualization. Package 'RWeka' contains the interface code, the Weka jar is in a separate package 'RWekajars'. For more information on Weka see <https://www.cs.waikato.ac.nz/ml/weka/>.

Maintained by Kurt Hornik. Last updated 2 years ago.

openjdk

2.4 match 4 stars 8.24 score 1.8k scripts 14 dependents

rrprf

tablesgg:Presentation-Quality Tables, Displayed Using 'ggplot2'

Presentation-quality tables are displayed as plots on an R graphics device. Although there are other packages that format tables for display, this package is unique in combining two features: (a) It is aware of the logical structure of the table being presented, and makes use of that for automatic layout and styling of the table. This avoids the need for most manual adjustments to achieve an attractive result. (b) It displays tables using 'ggplot2' graphics. Therefore a table can be presented anywhere a graph could be, with no more effort. External software such as LaTeX or HTML or their viewers is not required. The package provides a full set of tools to control the style and appearance of tables, including titles, footnotes and reference marks, horizontal and vertical rules, and spacing of rows and columns. Methods are included to display matrices; data frames; tables created by R's ftable(), table(), and xtabs() functions; and tables created by the 'tables' and 'xtable' packages. Methods can be added to display other table-like objects. A vignette is included that illustrates usage and options available in the package.

Maintained by Richard Raubertas. Last updated 4 years ago.

5.3 match 3.70 score 5 scripts

robinhankin

clifford:Arbitrary Dimensional Clifford Algebras

A suite of routines for Clifford algebras, using the 'Map' class of the Standard Template Library. Canonical reference: Hestenes (1987, ISBN 90-277-1673-0, "Clifford algebra to geometric calculus"). Special cases including Lorentz transforms, quaternion multiplication, and Grassmann algebra, are discussed. Vignettes presenting conformal geometric algebra, quaternions and split quaternions, dual numbers, and Lorentz transforms are included. The package follows 'disordR' discipline.

Maintained by Robin K. S. Hankin. Last updated 1 months ago.

cpp

2.9 match 5 stars 6.71 score 4 scripts

cotterell

TDCM:The Transition Diagnostic Classification Model Framework

Estimate the transition diagnostic classification model (TDCM) described in Madison & Bradshaw (2018) <doi:10.1007/s11336-018-9638-5>, a longitudinal extension of the log-linear cognitive diagnosis model (LCDM) in Henson, Templin & Willse (2009) <doi:10.1007/s11336-008-9089-5>. As the LCDM subsumes many other diagnostic classification models (DCMs), many other DCMs can be estimated longitudinally via the TDCM. The 'TDCM' package includes functions to estimate the single-group and multigroup TDCM, summarize results of interest including item parameters, growth proportions, transition probabilities, transitional reliability, attribute correlations, model fit, and growth plots.

Maintained by Michael E. Cotterell. Last updated 2 days ago.

statistics

4.3 match 4.60 score 5 scripts