Showing 11 of 11 results
modeloriented
DALEX: moDel Agnostic Language for Exploration and eXplanation
Any unverified black-box model is the path to failure. Opaqueness leads to distrust; distrust leads to the model being ignored, and ultimately rejected. The DALEX package X-rays any model and helps to explore and explain its behaviour. Machine Learning (ML) models are widely used in classification and regression tasks. Models created with boosting, bagging, stacking or similar techniques are often chosen for their high performance, but such black-box models usually lack direct interpretability. The DALEX package contains various methods that help to understand the link between input variables and model output. The implemented methods support exploration of the model on the level of a single instance as well as on the level of the whole dataset. All model explainers are model agnostic and can be compared across different models. The DALEX package is the cornerstone of the 'DrWhy.AI' universe of packages for visual model exploration. Find more details in Biecek (2018) <https://jmlr.org/papers/v19/18-416.html>.
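A minimal usage sketch, using the titanic_imputed data shipped with DALEX and a plain logistic regression as the model to be explained:
  library(DALEX)
  # predictors only; titanic_imputed and its 'survived' column come with DALEX
  X <- titanic_imputed[, colnames(titanic_imputed) != "survived"]
  model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
  explainer <- explain(model, data = X, y = titanic_imputed$survived, label = "logistic regression")
  model_parts(explainer)                              # dataset level: permutation variable importance
  predict_parts(explainer, new_observation = X[1, ])  # instance level: break-down of one prediction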
Maintained by Przemyslaw Biecek. Last updated 2 months ago.
black-box, dalex, data-science, explainable-ai, explainable-artificial-intelligence, explainable-ml, explanations, explanatory-model-analysis, fairness, iml, interpretability, interpretable-machine-learning, machine-learning, model-visualization, predictive-modeling, responsible-ai, responsible-ml, xai
1.4k stars 13.40 score 876 scripts 21 dependents
hannameyer
CAST: 'caret' Applications for Spatial-Temporal Models
Supporting functionality to run 'caret' with spatial or spatial-temporal data. 'caret' is a frequently used package for model training and prediction using machine learning. CAST includes functions to improve spatial or spatial-temporal modelling tasks using 'caret'. It includes the newly suggested 'Nearest neighbor distance matching' cross-validation to estimate the performance of spatial prediction models and allows for spatial variable selection to select suitable predictor variables in view of their contribution to the spatial model performance. CAST further includes functionality to estimate the (spatial) area of applicability of prediction models. Methods are described in Meyer et al. (2018) <doi:10.1016/j.envsoft.2017.12.001>; Meyer et al. (2019) <doi:10.1016/j.ecolmodel.2019.108815>; Meyer and Pebesma (2021) <doi:10.1111/2041-210X.13650>; Milà et al. (2022) <doi:10.1111/2041-210X.13851>; Meyer and Pebesma (2022) <doi:10.1038/s41467-022-29838-9>; Linnenbrink et al. (2023) <doi:10.5194/egusphere-2023-1308>; Schumacher et al. (2024) <doi:10.5194/egusphere-2024-2730>. The package is described in detail in Meyer et al. (2024) <doi:10.48550/arXiv.2404.06978>.
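A hedged sketch of the intended workflow with 'caret'; train_df, location_id, predictor_names, response and new_data are placeholder names, not part of CAST:
  library(caret)
  library(CAST)
  # leave-location-out folds so that cross-validation respects the spatial structure
  folds <- CreateSpacetimeFolds(train_df, spacevar = "location_id", k = 5)
  ctrl  <- trainControl(method = "cv", index = folds$index)
  model <- train(train_df[, predictor_names], train_df$response,
                 method = "rf", trControl = ctrl)
  # estimate the (spatial) area of applicability for new prediction locations
  aoa_result <- aoa(newdata = new_data, model = model)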
Maintained by Hanna Meyer. Last updated 2 months ago.
autocorrelation, caret, feature-selection, machine-learning, overfitting, predictive-modeling, spatial, spatio-temporal, variable-selection
114 stars 11.85 score 298 scripts 1 dependent
merliseclyde
BAS: Bayesian Variable Selection and Model Averaging using Bayesian Adaptive Sampling
Package for Bayesian Variable Selection and Model Averaging in linear models and generalized linear models using stochastic or deterministic sampling without replacement from posterior distributions. Prior distributions on coefficients are from Zellner's g-prior or mixtures of g-priors corresponding to the Zellner-Siow Cauchy priors or the mixture of g-priors from Liang et al. (2008) <DOI:10.1198/016214507000001337> for linear models, or mixtures of g-priors from Li and Clyde (2019) <DOI:10.1080/01621459.2018.1469992> for generalized linear models. Other model selection criteria include AIC, BIC and Empirical Bayes estimates of g. Sampling probabilities may be updated based on the sampled models, using either sampling without replacement or an MCMC algorithm that samples models using a tree structure of the model space as an efficient hash table. See Clyde, Ghosh and Littman (2010) <DOI:10.1198/jcgs.2010.09049> for details on the sampling algorithms. Uniform priors over all models or beta-binomial prior distributions on model size are allowed, and for large p truncated priors on the model space may be used to enforce sampling models that are full rank. The user may force variables to always be included, in addition to imposing constraints that higher-order interactions are included only if their parents are included in the model. This material is based upon work supported by the National Science Foundation under Division of Mathematical Sciences grant 1106891. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
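A minimal sketch on a built-in dataset (mtcars is used only for illustration):
  library(BAS)
  fit <- bas.lm(mpg ~ ., data = mtcars,
                prior = "ZS-null",       # Zellner-Siow Cauchy prior on coefficients
                modelprior = uniform())  # uniform prior over the model space
  summary(fit)   # top models and marginal posterior inclusion probabilities
  coef(fit)      # model-averaged coefficient estimates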
Maintained by Merlise Clyde. Last updated 4 months ago.
bayesian, bayesian-inference, generalized-linear-models, linear-regression, logistic-regression, mcmc, model-selection, poisson-regression, predictive-modeling, regression, variable-selection, fortran, openblas
44 stars 10.63 score 420 scripts 3 dependents
laresbernardo
lares: Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of function families, such as Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scraper, it helps the analyst or data scientist get quick and robust results without the need for repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 1 month ago.
analytics, api, automation, automl, data-science, descriptive-statistics, h2o, machine-learning, marketing, mmm, predictive-modeling, puzzle, rlanguage, robyn, visualization
233 stars 9.92 score 185 scripts 1 dependent
brian-j-smith
MachineShop: Machine Learning Models and Tools
Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
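A hedged sketch of the unified interface; mtcars is used only for illustration, and GBMModel assumes the 'gbm' backend package is installed:
  library(MachineShop)
  model_fit <- fit(mpg ~ ., data = mtcars, model = GBMModel)   # unified model fitting
  res <- resample(mpg ~ ., data = mtcars, model = GBMModel,
                  control = CVControl(folds = 5))              # cross-validated resampling
  summary(performance(res))                                    # resampled performance metrics
  varimp(model_fit)                                            # variable importance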
Maintained by Brian J Smith. Last updated 7 months ago.
classification-models, machine-learning, predictive-modeling, regression-models, survival-models
62 stars 7.95 score 121 scripts
smaakage85
modelgrid: A Framework for Creating, Managing and Training Multiple Caret Models
A minimalistic but flexible framework that facilitates the creation, management and training of multiple 'caret' models. A model grid consists of two components: (1) a set of settings that is shared by all models by default, and (2) specifications that apply only to the individual models. When the model grid is trained, model and training specifications are first consolidated from the shared and the model specific settings into complete 'caret' model configurations. These models are then trained with the 'train' function from the 'caret' package.
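A hedged sketch of that workflow; train_df and its 'target' column are placeholder names, and the exact argument names follow the package README as I understand it:
  library(caret)
  library(modelgrid)
  mg <- model_grid()
  mg <- share_settings(mg,
                       y = train_df$target,
                       x = train_df[, setdiff(names(train_df), "target")],
                       trControl = trainControl(method = "cv", number = 5))
  mg <- add_model(mg, model_name = "random forest",     method = "rf")
  mg <- add_model(mg, model_name = "linear regression", method = "lm")
  mg <- train(mg)  # consolidates shared and model-specific settings, then trains all models with 'caret'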
Maintained by Lars Kjeldgaard. Last updated 6 years ago.
caret, machine-learning, predictive-analytics, predictive-modeling
23 stars 5.34 score 19 scripts
smaakage85
recorder: Toolkit to Validate New Data for a Predictive Model
A lightweight toolkit to validate new observations when computing their predictions with a predictive model. The validation process consists of two steps: (1) record relevant statistics and metadata about the variables in the original training data for the predictive model, and (2) use these data to run a set of basic validation tests on the new set of observations.
Maintained by Lars Kjeldgaard. Last updated 6 years ago.
data-analysis, machine-learning, predictive-analytics, predictive-modeling
4 stars 4.78 score 6 scripts
techtonique
ahead: Time Series Forecasting with Uncertainty Quantification
Univariate and multivariate time series forecasting with uncertainty quantification.
Maintained by T. Moudiki. Last updated 1 month ago.
forecasting, machine-learning, predictive-modeling, statistical-learning, time-series, time-series-forecasting, uncertainty-quantification, cpp
21 stars 4.63 score 51 scripts
modeloriented
drifter: Concept Drift and Concept Shift Detection for Predictive Models
Concept drift refers to the change in the data distribution or in the relationships between variables over time. 'drifter' calculates distances between variable distributions or variable relations and identifies both types of drift. Key functions are: calculate_covariate_drift() checks the distance between corresponding variables in two datasets, calculate_residuals_drift() checks the distance between residual distributions for two models, calculate_model_drift() checks the distance between partial dependency profiles for two models, and check_drift() executes all of these drift checks at once. 'drifter' is a part of the 'DrWhy.AI' universe (Biecek 2018) <arXiv:1806.08915>.
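A hedged sketch using the functions listed above; data_old/data_new, model_old/model_new and y_old/y_new are placeholder objects, and the argument order is an assumption to verify against the package documentation:
  library(drifter)
  # distance between distributions of corresponding variables in two datasets
  calculate_covariate_drift(data_old, data_new)
  # run all drift checks for two models trained on the old and new data
  check_drift(model_old, model_new,
              data_old, data_new,
              y_old, y_new)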
Maintained by Przemyslaw Biecek. Last updated 6 years ago.
concept-drift, concept-shift, drwhy, predictive-modeling
19 stars 4.45 score 6 scripts 1 dependent
ajayarunachalam
metaEnsembleR: Intuitive Package for Meta-Ensemble Learning (Classification, Regression) that is Fully-Automated
This package significantly lowers the barrier for practitioners to apply heterogeneous ensemble learning techniques to their everyday predictive problems, even without expert knowledge of ensembling.
Maintained by Ajay Arunachalam. Last updated 4 years ago.
automation, classification, ensemble-machine-learning, ensemble-methods, generalization-error, heterogenous, meta-learning, predictive-modeling, regression, stacked-ensembles
3.70 score
frahik
IBCF.MTME: Item Based Collaborative Filtering for Multi-Trait and Multi-Environment Data
Implements the item based collaborative filtering (IBCF) method for continuous phenotypes in the context of plant breeding, where data are collected for various traits studied in various environments, as proposed by Montesinos-López et al. (2017) <doi:10.1534/g3.117.300309>.
Maintained by Francisco Javier Luna-Vazquez. Last updated 6 years ago.
bayesian-methods, predictive-modeling
2 stars 3.28 score 19 scripts