Showing 65 of 65 results
pcbrendel
multibias: Simultaneous Multi-Bias Adjustment
Quantify the causal effect of a binary exposure on a binary outcome with adjustment for multiple biases. The functions can simultaneously adjust for any combination of uncontrolled confounding, exposure/outcome misclassification, and selection bias. The underlying method generalizes the concept of combining inverse probability of selection weighting with predictive value weighting. Simultaneous multi-bias analysis can be used to enhance the validity and transparency of real-world evidence obtained from observational, longitudinal studies. Based on the work from Paul Brendel, Aracelis Torres, and Onyebuchi Arah (2023) <doi:10.1093/ije/dyad001>.
Maintained by Paul Brendel. Last updated 21 days ago.
causal-inference causal-models epidemiology
54.8 match 5.34 score 7 scripts
kimberlywebb
COMBO: Correcting Misclassified Binary Outcomes in Association Studies
Use frequentist and Bayesian methods to estimate parameters from a binary outcome misclassification model. These methods correct for the problem of "label switching" by assuming that the sum of outcome sensitivity and specificity is at least 1. A description of the analysis methods is available in Hochstedler and Wells (2023) <doi:10.48550/arXiv.2303.10215>.
Maintained by Kimberly Hochstedler Webb. Last updated 20 days ago.
33.5 match 1 star 5.08 score 4 scripts
kimberlywebb
COMMA: Correcting Misclassified Mediation Analysis
Use three methods to estimate parameters from a mediation analysis with a binary misclassified mediator. These methods correct for the problem of "label switching" using Youden's J criteria. A detailed description of the analysis methods is available in Webb and Wells (2024), "Effect estimation in the presence of a misclassified binary mediator" <doi:10.48550/arXiv.2407.06970>.
Maintained by Kimberly Webb. Last updated 3 months ago.
25.7 match 5.18 score 7 scripts
dhaine
episensr: Basic Sensitivity Analysis of Epidemiological Results
Basic sensitivity analysis of the observed relative risks adjusting for unmeasured confounding and misclassification of the exposure/outcome, or both. It follows the bias analysis methods and examples from the book by Lash T.L, Fox M.P, and Fink A.K. "Applying Quantitative Bias Analysis to Epidemiologic Data", ('Springer', 2021).
Maintained by Denis Haine. Last updated 1 year ago.
bias epidemiology sensitivity-analysis statistics
17.3 match 13 stars 6.48 score 39 scripts 1 dependent
dcgerard
updog: Flexible Genotyping for Polyploids
Implements empirical Bayes approaches to genotype polyploids from next generation sequencing data while accounting for allele bias, overdispersion, and sequencing error. The main functions are flexdog() and multidog(), which allow the specification of many different genotype distributions. Also provided are functions to simulate genotypes, rgeno(), and read-counts, rflexdog(), as well as functions to calculate oracle genotyping error rates, oracle_mis(), and correlation with the true genotypes, oracle_cor(). These latter two functions are useful for read depth calculations. Run browseVignettes(package = "updog") in R for example usage. See Gerard et al. (2018) <doi:10.1534/genetics.118.301468> and Gerard and Ferrao (2020) <doi:10.1093/bioinformatics/btz852> for details on the implemented methods.
Maintained by David Gerard. Last updated 1 year ago.
8.2 match 28 stars 8.45 score 83 scripts 2 dependents
mayamathur
EValue: Sensitivity Analyses for Unmeasured Confounding and Other Biases in Observational Studies and Meta-Analyses
Conducts sensitivity analyses for unmeasured confounding, selection bias, and measurement error (individually or in combination; VanderWeele & Ding (2017) <doi:10.7326/M16-2607>; Smith & VanderWeele (2019) <doi:10.1097/EDE.0000000000001032>; VanderWeele & Li (2019) <doi:10.1093/aje/kwz133>; Smith & VanderWeele (2021) <arXiv:2005.02908>). Also conducts sensitivity analyses for unmeasured confounding in meta-analyses (Mathur & VanderWeele (2020a) <doi:10.1080/01621459.2018.1529598>; Mathur & VanderWeele (2020b) <doi:10.1097/EDE.0000000000001180>) and for additive measures of effect modification (Mathur et al., under review).
Maintained by Maya B. Mathur. Last updated 3 years ago.
6.0 match 3 stars 6.35 score 99 scripts 1 dependent
wolfganglederer
simex: SIMEX- And MCSIMEX-Algorithm for Measurement Error Models
Implementation of the SIMEX-Algorithm by Cook & Stefanski (1994) <doi:10.1080/01621459.1994.10476871> and MCSIMEX by Küchenhoff, Mwalili & Lesaffre (2006) <doi:10.1111/j.1541-0420.2005.00396.x>.
Maintained by Wolfgang Lederer. Last updated 6 years ago.
5.3 match 11 stars 6.68 score 75 scripts 7 dependents
aryanrzn
ATE.ERROR: Estimating ATE with Misclassified Outcomes and Mismeasured Covariates
Addressing measurement error in covariates and misclassification in binary outcome variables within causal inference, the 'ATE.ERROR' package implements inverse probability weighted estimation methods proposed by Shu and Yi (2017, <doi:10.1177/0962280217743777>; 2019, <doi:10.1002/sim.8073>). These methods correct errors to accurately estimate average treatment effects (ATE). The package includes two main functions: ATE.ERROR.Y() for handling misclassification in the outcome variable and ATE.ERROR.XY() for correcting both outcome misclassification and covariate measurement error. It employs logistic regression for treatment assignment and uses bootstrap sampling to calculate standard errors and confidence intervals, with simulated datasets provided for practical demonstration.
Maintained by Aryan Rezanezhad. Last updated 6 months ago.
9.6 match 3.71 score 16 scripts
chrhennig
fpc: Flexible Procedures for Clustering
Various methods for clustering and cluster validation. Fixed point clustering. Linear regression clustering. Clustering by merging Gaussian mixture components. Symmetric and asymmetric discriminant projections for visualisation of the separation of groupings. Cluster validation statistics for distance based clustering including corrected Rand index. Standardisation of cluster validation statistics by random clusterings and comparison between many clustering methods and numbers of clusters based on this. Cluster-wise cluster stability assessment. Methods for estimation of the number of clusters: Calinski-Harabasz, Tibshirani and Walther's prediction strength, Fang and Wang's bootstrap stability. Gaussian/multinomial mixture fitting for mixed continuous/categorical variables. Variable-wise statistics for cluster interpretation. DBSCAN clustering. Interface functions for many clustering methods implemented in R, including estimating the number of clusters with kmeans, pam and clara. Modality diagnosis for Gaussian mixtures. For an overview see package?fpc.
Maintained by Christian Hennig. Last updated 6 months ago.
3.8 match 11 stars 9.25 score 2.6k scripts 70 dependents
ejikeugba
gofcat: Goodness-of-Fit Measures for Categorical Response Models
A post-estimation method for categorical response models (CRM). Inputs from objects of class serp(), clm(), polr(), multinom(), mlogit(), vglm() and glm() are currently supported. Available tests include the Hosmer-Lemeshow tests for the binary, multinomial and ordinal logistic regression; the Lipsitz and the Pulkstenis-Robinson tests for the ordinal models. The proportional odds, adjacent-category, and constrained continuation-ratio models are particularly supported at ordinal level. Tests for the proportional odds assumptions in ordinal models are also possible with the Brant and the Likelihood-Ratio tests. Moreover, several summary measures of predictive strength (pseudo R-squared) and some useful error metrics, including the Brier score, misclassification rate, and log loss, are also available for the binary, multinomial and ordinal models. Ugba, E. R. and Gertheiss, J. (2018) <http://www.statmod.org/workshops_archive_proceedings_2018.html>.
Maintained by Ejike R. Ugba. Last updated 2 years ago.
brant-test brier-scores hosmer-lemeshow-test likelihood-ratio-test lipsitz-test log-loss-score-metric logistic-regression misclassification ordinal-regression proportional-odds-test pseudo-r2 pulkstenis-robinson-test
10.5 match 2 stars 3.18 score 15 scripts
formidify
BayesSenMC: Different Models of Posterior Distributions of Adjusted Odds Ratio
Generates different posterior distributions of adjusted odds ratio under different priors of sensitivity and specificity, and plots the models for comparison. It also provides estimations for the specifications of the models using diagnostics of exposure status with a non-linear mixed effects model. It implements the methods that are first proposed in <doi:10.1016/j.annepidem.2006.04.001> and <doi:10.1177/0272989X09353452>.
Maintained by Jinhui Yang. Last updated 4 years ago.
11.3 match 2.70 score
mblumuga
abc: Tools for Approximate Bayesian Computation (ABC)
Implements several ABC algorithms for performing parameter estimation, model selection, and goodness-of-fit. Cross-validation tools are also available for measuring the accuracy of ABC estimates, and to calculate the misclassification probabilities of different models.
Maintained by Blum Michael. Last updated 3 months ago.
4.3 match 1 star 6.93 score 410 scripts 9 dependents
valentint
robust: Port of the S+ "Robust Library"
Methods for robust statistics, a state of the art in the early 2000s, notably for robust regression and robust multivariate analysis.
Maintained by Valentin Todorov. Last updated 7 months ago.
3.8 match 7.52 score 572 scripts 8 dependents
psobczyk
varclust: Variables Clustering
Performs clustering of quantitative variables, assuming that clusters lie in low-dimensional subspaces. Segmentation of variables, number of clusters and their dimensions are selected based on BIC. Candidate models are identified based on many runs of K-means algorithm with different random initializations of cluster centers.
Maintained by Piotr Sobczyk. Last updated 4 years ago.
5.0 match 3 stars 4.32 score 14 scripts
thie1e
cutpointr: Determine and Evaluate Optimal Cutpoints in Binary Classification Tasks
Estimate cutpoints that optimize a specified metric in binary classification tasks and validate performance using bootstrapping. Some methods for more robust cutpoint estimation are supported, e.g. a parametric method assuming normal distributions, bootstrapped cutpoints, and smoothing of the metric values per cutpoint using Generalized Additive Models. Various plotting functions are included. For an overview of the package see Thiele and Hirschfeld (2021) <doi:10.18637/jss.v098.i11>.
Maintained by Christian Thiele. Last updated 3 months ago.
bootstrapping cutpoint-optimization roc-curve cpp
2.0 match 88 stars 10.44 score 322 scripts 1 dependent
jingxuanh
xtune: Regularized Regression with Feature-Specific Penalties Integrating External Information
Extends standard penalized regression (Lasso, Ridge, and Elastic-net) to allow feature-specific shrinkage based on external information with the goal of achieving a better prediction accuracy and variable selection. Examples of external information include the grouping of predictors, prior knowledge of biological importance, external p-values, function annotations, etc. The choice of multiple tuning parameters is done using an Empirical Bayes approach. A majorization-minimization algorithm is employed for implementation.
Maintained by Jingxuan He. Last updated 2 years ago.
5.0 match 3.90 score 16 scripts
santagos
dad: Three-Way / Multigroup Data Analysis Through Densities
The data consist of a set of variables measured on several groups of individuals. To each group is associated an estimated probability density function. The package provides tools to create or manage such data and functional methods (principal component analysis, multidimensional scaling, cluster analysis, discriminant analysis...) for such probability densities.
Maintained by Pierre Santagostini. Last updated 4 months ago.
3.4 match 5.33 score 92 scripts
johnnyzhz
logistic4p: Logistic Regression with Misclassification in Dependent Variables
Error in a binary dependent variable, also known as misclassification, has not drawn much attention in psychology. Ignoring misclassification in logistic regression can result in misleading parameter estimates and statistical inference. This package conducts logistic regression analysis with misclassification in outcome variables.
Maintained by Zhiyong Zhang. Last updated 1 year ago.
16.3 match 1.00 score 8 scripts
chjackson
msmbayes: Bayesian Multi-State Models for Intermittently-Observed Data
Bayesian multi-state models for intermittently-observed data. Markov and phase-type semi-Markov models, and misclassification hidden Markov models.
Maintained by Christopher Jackson. Last updated 4 months ago.
3.7 match 4 stars 4.26 score 3 scripts
svkucheryavski
mdatools: Multivariate Data Analysis for Chemometrics
Projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.
Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.
2.0 match 35 stars 7.37 score 220 scripts 1 dependent
cran
SAMBA: Selection and Misclassification Bias Adjustment for Logistic Regression Models
Health research using data from electronic health records (EHR) has gained popularity, but misclassification of EHR-derived disease status and lack of representativeness of the study sample can result in substantial bias in effect estimates and can impact power and type I error for association tests. Here, the assumed target of inference is the relationship between binary disease status and predictors modeled using a logistic regression model. 'SAMBA' implements several methods for obtaining bias-corrected point estimates along with valid standard errors as proposed in Beesley and Mukherjee (2020) <doi:10.1101/2019.12.26.19015859>, currently under review.
Maintained by Alexander Rix. Last updated 5 years ago.
3.4 match 4.18 score 1 dependent
bioc
MiPP: Misclassification Penalized Posterior Classification
This package finds optimal sets of genes that separate samples into two or more classes.
Maintained by Sukwoo Kim. Last updated 5 months ago.
3.1 match 3.60 score 1 script
andrewtitman
nhm: Non-Homogeneous Markov and Hidden Markov Multistate Models
Fits non-homogeneous Markov multistate models and misclassification-type hidden Markov models in continuous time to intermittently observed data. Implements the methods in Titman (2011) <doi:10.1111/j.1541-0420.2010.01550.x>. Uses direct numerical solution of the Kolmogorov forward equations to calculate the transition probabilities.
Maintained by Andrew Titman. Last updated 1 year ago.
5.6 match 1 star 2.00 score 4 scripts
modal-inria
RMixtCompUtilities: Utility Functions for 'MixtComp' Outputs
Mixture Composer <https://github.com/modal-inria/MixtComp> is a project to build mixture models with heterogeneous data sets and partially missing data management. This package contains graphical, getter and some utility functions to facilitate the analysis of 'MixtComp' output.
Maintained by Quentin Grimonprez. Last updated 10 months ago.
clustering cpp heterogeneous-data missing-data mixed-data mixture-model statistics
2.0 match 13 stars 5.19 score 2 scripts 1 dependent
cran
tree: Classification and Regression Trees
Classification and regression trees.
Maintained by Brian Ripley. Last updated 3 months ago.
2.0 match 1 star 4.76 score 13 dependents
moran79
folda: Forward Stepwise Discriminant Analysis with Pillai's Trace
A novel forward stepwise discriminant analysis framework that integrates Pillai's trace with Uncorrelated Linear Discriminant Analysis (ULDA), providing an improvement over traditional stepwise LDA methods that rely on Wilks' Lambda. A stand-alone ULDA implementation is also provided, offering a more general solution than the one available in the 'MASS' package. It automatically handles missing values and provides visualization tools. For more details, see Wang (2024) <doi:10.48550/arXiv.2409.03136>.
Maintained by Siyu Wang. Last updated 5 months ago.
1.8 match 2 stars 5.18 score 6 scripts 1 dependent
jaredhuling
oem: Orthogonalizing EM: Penalized Regression for Big Tall Data
Solves penalized least squares problems for big tall data using the orthogonalizing EM algorithm of Xiong et al. (2016) <doi:10.1080/00401706.2015.1054436>. The main fitting function is oem() and the functions cv.oem() and xval.oem() are for cross validation, the latter being an accelerated cross validation function for linear models. The big.oem() function allows for out of memory fitting. The underlying methods and code interface are described in Huling and Chien (2022) <doi:10.18637/jss.v104.i06>.
Maintained by Jared Huling. Last updated 8 months ago.
group-lasso lasso machine-learning mcp oem oem-algorithm penalized-regression scad variable-selection openblas cpp openmp
1.5 match 27 stars 6.02 score 26 scripts 1 dependent
philipppro
measures: Performance Measures for Statistical Learning
Provides the largest number of statistical measures in the whole R world. Includes measures of regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programmed by several 'mlr' developers.
Maintained by Philipp Probst. Last updated 4 years ago.
2.0 match 1 star 4.47 score 88 scripts 2 dependents
lindanab
mecor: Measurement Error Correction in Linear Models with a Continuous Outcome
Covariate measurement error correction is implemented by means of regression calibration by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331), efficient regression calibration by Spiegelman D, Carroll RJ & Kipnis V (2001) <doi:10.1002/1097-0258(20010115)20:1%3C139::AID-SIM644%3E3.0.CO;2-K> and maximum likelihood estimation by Bartlett JW, Stavola DBL & Frost C (2009) <doi:10.1002/sim.3713>. Outcome measurement error correction is implemented by means of the method of moments by Buonaccorsi JP (2010, ISBN:1420066560) and efficient method of moments by Keogh RH, Carroll RJ, Tooze JA, Kirkpatrick SI & Freedman LS (2014) <doi:10.1002/sim.7011>. Standard error estimation of the corrected estimators is implemented by means of the Delta method by Rosner B, Spiegelman D & Willett WC (1990) <doi:10.1093/oxfordjournals.aje.a115715> and Rosner B, Spiegelman D & Willett WC (1992) <doi:10.1093/oxfordjournals.aje.a116453>, the Fieller method described by Buonaccorsi JP (2010, ISBN:1420066560), and the Bootstrap by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331).
Maintained by Linda Nab. Last updated 3 years ago.
linear-models measurement-error statistics
1.8 match 6 stars 5.07 score 13 scripts
christinaheinze
CondIndTests: Nonlinear Conditional Independence Tests
Code for a variety of nonlinear conditional independence tests: Kernel conditional independence test (Zhang et al., UAI 2011, <arXiv:1202.3775>), Residual Prediction test (based on Shah and Buehlmann, <arXiv:1511.03334>), Invariant environment prediction, Invariant target prediction, Invariant residual distribution test, Invariant conditional quantile prediction (all from Heinze-Deml et al., <arXiv:1706.08576>).
Maintained by Christina Heinze-Deml. Last updated 5 years ago.
1.8 match 17 stars 4.91 score 32 scripts 1 dependent
cran
hmeasure: The H-Measure and Other Scalar Classification Performance Metrics
Classification performance metrics that are derived from the ROC curve of a classifier. The package includes the H-measure performance metric as described in <http://link.springer.com/article/10.1007/s10994-009-5119-5>, which computes the minimum total misclassification cost, integrating over any uncertainty about the relative misclassification costs, as per a user-defined prior. It also offers a one-stop-shop for other scalar metrics of performance, including sensitivity, specificity and many others, and also offers plotting tools for ROC curves and related statistics.
Maintained by Christoforos Anagnostopoulos. Last updated 6 years ago.
2.4 match 3.48 score 1 dependent
o1iv3r
FeatureImpCluster: Feature Importance for Partitional Clustering
Implements a novel approach for measuring feature importance in k-means clustering. Importance of a feature is measured by the misclassification rate relative to the baseline cluster assignment due to a random permutation of feature values. An explanation of permutation feature importance in general can be found here: <https://christophm.github.io/interpretable-ml-book/feature-importance.html>.
Maintained by Oliver Pfaffel. Last updated 3 years ago.
2.3 match 4 stars 3.58 score 19 scripts
metabocomp
MUVR2: Multivariate Methods with Unbiased Variable Selection
Predictive multivariate modelling for metabolomics. Types: classification and regression. Methods: Partial Least Squares, Random Forest, and Elastic Net. Data structures: paired and unpaired. Validation: repeated double cross-validation (Westerhuis et al. (2008) <doi:10.1007/s11306-007-0099-6>, Filzmoser et al. (2009) <doi:10.1002/cem.1225>). Variable selection: performed internally, through tuning in the inner cross-validation loop.
Maintained by Yingxiao Yan. Last updated 6 months ago.
2.0 match 2 stars 4.04 score 1 script
mrmarjan
mri: Modified Rand and Wallace Indices
It provides functions to compute the values of different modifications of the Rand and Wallace indices. The indices are used to measure the stability or similarity of two partitions obtained on two different sets of units with a non-empty intercept. Splitting and merging of clusters can (depends on the selected index) have a different effect on the value of the indices. The indices are proposed in Cugmas and Ferligoj (2018) <http://ibmi.mf.uni-lj.si/mz/2018/no-1/Cugmas2018.pdf>.
Maintained by Marjan Cugmas. Last updated 6 years ago.
4.0 match 2.00 score 4 scripts
cran
rpartScore: Classification Trees for Ordinal Responses
Recursive partitioning methods to build classification trees for ordinal responses within the CART framework. Trees are grown using the Generalized Gini impurity function, where the misclassification costs are given by the absolute or squared differences in scores assigned to the categories of the response. Pruning is based on the total misclassification rate or on the total misclassification cost.
Maintained by Giuliano Galimberti. Last updated 3 years ago.
4.4 match 1.76 score 19 scripts 1 dependent
vsousa
poolABC: Approximate Bayesian Computation with Pooled Sequencing Data
Provides functions to simulate Pool-seq data under models of demographic formation and to import Pool-seq data from real populations. Implements two ABC algorithms for performing parameter estimation and model selection using Pool-seq data. Cross-validation can also be performed to assess the accuracy of ABC estimates and model choice. Carvalho et al., (2022) <doi:10.1111/1755-0998.13834>.
Maintained by João Carvalho. Last updated 2 years ago.
2.0 match 1 star 3.70 score 3 scripts
nutriverse
sleacr: Simplified Lot Quality Assurance Sampling Evaluation of Access and Coverage (SLEAC) Tools
In the recent past, measurement of coverage has been mainly through two-stage cluster sampled surveys either as part of a nutrition assessment or through a specific coverage survey known as Centric Systematic Area Sampling (CSAS). However, such methods are resource intensive and often only used for final programme evaluation meaning results arrive too late for programme adaptation. SLEAC, which stands for Simplified Lot Quality Assurance Sampling Evaluation of Access and Coverage, is a low resource method designed specifically to address this limitation and is used regularly for monitoring, planning and importantly, timely improvement to programme quality, both for agency and Ministry of Health (MoH) led programmes. SLEAC is designed to complement the Semi-quantitative Evaluation of Access and Coverage (SQUEAC) method. This package provides functions for use in conducting a SLEAC assessment.
Maintained by Ernest Guevarra. Last updated 1 month ago.
acute-malnutrition cmam coverage nutrition sleac wasting
2.0 match 1 star 3.48 score 5 scripts
cran
augSIMEX: Analysis of Data with Mixed Measurement Error and Misclassification in Covariates
Implementation of the augmented Simulation-Extrapolation (SIMEX) algorithm proposed by Yi et al. (2015) <doi:10.1080/01621459.2014.922777> for analyzing the data with mixed measurement error and misclassification. The main function provides a similar summary output as that of glm() function. Both parametric and empirical SIMEX are considered in the package.
Maintained by Qihuang Zhang. Last updated 5 years ago.
6.8 match 1.00 score
koendebock
CustomerScoringMetrics: Evaluation Metrics for Customer Scoring Models Depending on Binary Classifiers
Functions for evaluating and visualizing predictive model performance (specifically: binary classifiers) in the field of customer scoring. These metrics include lift, lift index, gain percentage, top-decile lift, F1-score, expected misclassification cost and absolute misclassification cost. See Berry & Linoff (2004, ISBN:0-471-47064-3), Witten and Frank (2005, ISBN:0-12-088407-0) and Blattberg, Kim & Neslin (2008, ISBN:978-0-387-72578-9) for details. Visualization functions are included for lift charts and gain percentage charts. All metrics that require class predictions offer the possibility to dynamically determine cutoff values for transforming real-valued probability predictions into class predictions.
Maintained by Koen W. De Bock. Last updated 7 years ago.
4.6 match 1.40 score 25 scripts
larskotthoff
llama: Leveraging Learning to Automatically Manage Algorithms
Provides functionality to train and evaluate algorithm selection models for portfolios.
Maintained by Lars Kotthoff. Last updated 4 years ago.
2.3 match 4 stars 2.80 score 53 scripts 1 dependent
skranz
RoundingMatters: Tools for adjusting for rounding problems in metastudies about p-hacking and publication bias
Tools for adjusting for rounding problems in metastudies about p-hacking and publication bias
Maintained by Sebastian Kranz. Last updated 4 years ago.
3.2 match 1.70 score 8 scripts
cran
MLDS: Maximum Likelihood Difference Scaling
Difference scaling is a method for scaling perceived supra-threshold differences. The package contains functions that allow the user to design and run a difference scaling experiment, to fit the resulting data by maximum likelihood and test the internal validity of the estimated scale.
Maintained by Kenneth Knoblauch. Last updated 2 years ago.
1.8 match 2.70 score
michlau
logicDT: Identifying Interactions Between Binary Predictors
A statistical learning method that tries to find the best set of predictors and interactions between predictors for modeling binary or quantitative response data in a decision tree. Several search algorithms and ensembling techniques are implemented allowing for finetuning the method to the specific problem. Interactions with quantitative covariables can be properly taken into account by fitting local regression models. Moreover, a variable importance measure for assessing marginal and interaction effects is provided. Implements the procedures proposed by Lau et al. (2024, <doi:10.1007/s10994-023-06488-6>).
Maintained by Michael Lau. Last updated 6 months ago.
2.0 match 2 stars 2.00 score 2 scripts
cran
noisemodel: Noise Models for Classification Datasets
Implementation of models for the controlled introduction of errors in classification datasets. This package contains the noise models described in Saez (2022) <doi:10.3390/math10203736> that allow corrupting class labels, attributes and both simultaneously.
Maintained by José A. Sáez. Last updated 2 years ago.
1.9 match 2.00 score
mkhondoker
optBiomarker: Estimation of Optimal Number of Biomarkers for Two-Group Microarray Based Classifications at a Given Error Tolerance Level for Various Classification Rules
Estimates optimal number of biomarkers for two-group classification based on microarray data.
Maintained by Mizanur Khondoker. Last updated 4 years ago.
1.6 match 2.00 score 1 script
xzhu20
ManlyMix: Manly Mixture Modeling and Model-Based Clustering
The utility of this package includes finite mixture modeling and model-based clustering through Manly mixture models by Zhu and Melnykov (2016) <DOI:10.1016/j.csda.2016.01.015>. It also provides capabilities for forward and backward model selection procedures.
Maintained by Xuwen Zhu. Last updated 6 months ago.
1.8 match 1.65 score 15 scripts 1 dependent
benjilu
forestError: A Unified Framework for Random Forest Prediction Error Estimation
Estimates the conditional error distributions of random forest predictions and common parameters of those distributions, including conditional misclassification rates, conditional mean squared prediction errors, conditional biases, and conditional quantiles, by out-of-bag weighting of out-of-bag prediction errors as proposed by Lu and Hardin (2021). This package is compatible with several existing packages that implement random forests in R.
Maintained by Benjamin Lu. Last updated 4 years ago.
inference intervals machine-learning machinelearning prediction random-forest randomforest statistics
0.5 match 26 stars 4.62 score 16 scripts
snoweye
MixSim: Simulating Data to Study Performance of Clustering Algorithms
The utility of this package is in simulating mixtures of Gaussian distributions with different levels of overlap between mixture components. Pairwise overlap, defined as a sum of two misclassification probabilities, measures the degree of interaction between components and can be readily employed to control the clustering complexity of datasets simulated from mixtures. These datasets can then be used for systematic performance investigation of clustering and finite mixture modeling algorithms. Among other capabilities of 'MixSim', there are computing the exact overlap for Gaussian mixtures, simulating Gaussian and non-Gaussian data, simulating outliers and noise variables, calculating various measures of agreement between two partitionings, and constructing parallel distribution plots for the graphical display of finite mixture models.
Maintained by Wei-Chen Chen. Last updated 8 months ago.
0.5 match 1 star 4.48 score 84 scripts 3 dependents
viroli
quantileDA: Quantile Classifier
Code for centroid, median and quantile classifiers.
Maintained by Cinzia Viroli. Last updated 12 months ago.
2.3 match 1.00 score 10 scripts
dhaine
apisensr: Interface to 'episensr' for Sensitivity Analysis of Epidemiological Results
API for using 'episensr': basic sensitivity analysis of the observed relative risks adjusting for unmeasured confounding and misclassification of the exposure/outcome, or both. See <https://cran.r-project.org/package=episensr>.
Maintained by Denis Haine. Last updated 2 years ago.
0.5 match 3 stars 4.18 score 5 scripts
gjjvdburg
gensvm: A Generalized Multiclass Support Vector Machine
The GenSVM classifier is a generalized multiclass support vector machine (SVM). This classifier aims to find decision boundaries that separate the classes with as wide a margin as possible. In GenSVM, the loss function is very flexible in the way that misclassifications are penalized. This allows the user to tune the classifier to the dataset at hand and potentially obtain higher classification accuracy than alternative multiclass SVMs. Moreover, this flexibility means that GenSVM has a number of other multiclass SVMs as special cases. One of the other advantages of GenSVM is that it is trained in the primal space, allowing the use of warm starts during optimization. This means that for common tasks such as cross validation or repeated model fitting, GenSVM can be trained very quickly. Based on: G.J.J. van den Burg and P.J.F. Groenen (2018) <https://www.jmlr.org/papers/v17/14-526.html>.
Maintained by Gertjan van den Burg. Last updated 2 years ago.
classification machine-learning machine-learning-algorithms multiclass-classification support-vector-machine
0.5 match 7 stars 3.96 score 26 scripts
ashipunov
shipunov: Miscellaneous Functions from Alexey Shipunov
A collection of functions for data manipulation, plotting and statistical computing, to use separately or with the book "Visual Statistics. Use R!": Shipunov (2020) <http://ashipunov.info/shipunov/software/r/r-en.htm>. Dr Alexey Shipunov died in December 2022. Most useful functions: Bclust(), Jclust() and BootA() which bootstrap hierarchical clustering; Recode() which does multiple recoding in a fast, simple and flexible way; Misclass() which outputs confusion matrix even if classes are not concerted; Overlap() which measures group separation on any projection; Biarrows() which converts any scatterplot into biplot; and Pleiad() which is fast and flexible correlogram.
Maintained by ORPHANED. Last updated 2 years ago.
2.0 match 1.00 score 9 scripts
sdlugosz
misclassGLM:Computation of Generalized Linear Models with Misclassified Covariates Using Side Information
Estimates models that extend the standard GLM to take misclassification into account. The models require side information from a secondary data set on the misclassification process, i.e. some sort of misclassification probabilities conditional on some common covariates. A detailed description of the algorithm can be found in Dlugosz, Mammen and Wilke (2015) <https://www.zew.de/publikationen/generalised-partially-linear-regression-with-misclassified-data-and-an-application-to-labour-market-transitions>.
Maintained by Stephan Dlugosz. Last updated 1 year ago.
0.9 match 1.81 score 13 scripts
nicolas-schmidt
BayesMFSurv:Bayesian Misclassified-Failure Survival Model
Contains a split population survival estimator that models the misclassification probability of failure versus right-censored events. The split population survival estimator is described in Bagozzi et al. (2019) <doi:10.1017/pan.2019.6>.
Maintained by Nicolas Schmidt. Last updated 5 years ago.
misclassified-failure-estimatessurvivalcpp
0.5 match 1 stars 3.00 score
dkahle
poisDoubleSamp:Confidence Intervals with Poisson Double Sampling
Functions to create confidence intervals for ratios of Poisson rates under misclassification using double sampling.
Maintained by David Kahle. Last updated 10 years ago.
0.5 match 1 stars 2.00 score 7 scripts
andrewtitman
miscIC:Misclassified Interval Censored Time-to-Event Data
Estimation of the survivor function for interval censored time-to-event data subject to misclassification using nonparametric maximum likelihood estimation, implementing the methods of Titman (2017) <doi:10.1007/s11222-016-9705-7>. Misclassification probabilities can either be specified as fixed or estimated. Models with time-dependent misclassification may also be fitted.
Maintained by Andrew Titman. Last updated 5 years ago.
0.9 match 1.00 score
yuliangxu
mgee2:Marginal Analysis of Misclassified Longitudinal Ordinal Data
This package provides three estimating equation methods for marginal analysis of longitudinal ordinal data with misclassified responses and covariates. The naive analysis, which is based solely on the observed data without adjustment, may lead to bias. The corrected generalized estimating equations (GEE2) method, which is unbiased, requires the misclassification parameters to be known beforehand. The corrected generalized estimating equations (GEE2) with validation subsample method estimates the misclassification parameters from a given validation set. This package is an implementation of Chen (2013) <doi:10.1002/bimj.201200195>.
Maintained by Yuliang Xu. Last updated 4 months ago.
0.8 match 1.00 score 3 scripts
meintraumus
AFFECT:Accelerated Functional Failure Time Model with Error-Contaminated Survival Times
This package handles data with measurement error in the response and misclassified censoring status under an accelerated failure time (AFT) model. It primarily contains three functions, which generate artificial data, correct error-prone data, and estimate the functional covariates for an AFT model.
Maintained by Hsiao-Ting Huang. Last updated 2 years ago.
0.5 match 1.00 score
shu-d
ipwErrorY:Inverse Probability Weighted Estimation of Average Treatment Effect with Misclassified Binary Outcome
An implementation of the correction methods proposed by Shu and Yi (2017) <doi:10.1177/0962280217743777> for the inverse probability weighted (IPW) estimation of average treatment effect (ATE) with misclassified binary outcomes. A logistic regression model is assumed for the treatment model in all implemented correction methods, and for the outcome model in the implemented doubly robust correction method. The misclassification probability given the true value of the outcome is assumed to be the same for all individuals.
Maintained by Di Shu. Last updated 6 years ago.
0.5 match 1.00 score 4 scripts
cran
abcrlda:Asymptotically Bias-Corrected Regularized Linear Discriminant Analysis
Offers methods to perform asymptotically bias-corrected regularized linear discriminant analysis (ABC_RLDA) for cost-sensitive binary classification. The bias-correction is an estimate of the bias term added to regularized discriminant analysis (RLDA) that minimizes the overall risk. By default, the misclassification costs are equal and set to 0.5; however, the package also offers the option to set them to predetermined values or, alternatively, to treat them as hyperparameters to tune. A. Zollanvari, M. Abdirash, A. Dadlani and B. Abibullaev (2019) <doi:10.1109/LSP.2019.2918485>.
Maintained by Dmitriy Fedorov. Last updated 5 years ago.
0.5 match 1.00 score