Showing 150 of 150 results
modeloriented
DALEX:moDel Agnostic Language for Exploration and eXplanation
Any unverified black box model is the path to failure. Opaqueness leads to distrust. Distrust leads to ignoration. Ignoration leads to rejection. The DALEX package x-rays any model and helps to explore and explain its behaviour. Machine Learning (ML) models are widely used and have various applications in classification or regression. Models created with boosting, bagging, stacking or similar techniques are often used due to their high performance. But such black-box models usually lack direct interpretability. The DALEX package contains various methods that help to understand the link between input variables and model output. Implemented methods help to explore the model at the level of a single instance as well as at the level of the whole dataset. All model explainers are model agnostic and can be compared across different models. The DALEX package is the cornerstone of the 'DrWhy.AI' universe of packages for visual model exploration. Find more details in Biecek (2018) <https://jmlr.org/papers/v19/18-416.html>.
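The usual DALEX workflow wraps a fitted model in an explainer and then calls generic explanation functions on it. A minimal sketch, assuming a simple glm fitted on the titanic_imputed data that ships with DALEX (model choice and columns are illustrative, not part of the entry above):

library(DALEX)

# Fit any model; a logistic regression is used here only for illustration
model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")

# Wrap model, data, and target in an explainer object
explainer <- explain(model,
                     data  = titanic_imputed[, -8],
                     y     = titanic_imputed$survived,
                     label = "glm")

# Dataset-level explanation: permutation-based variable importance
plot(model_parts(explainer))

# Instance-level explanation: break-down attribution for one observation
plot(predict_parts(explainer, new_observation = titanic_imputed[1, -8]))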
Maintained by Przemyslaw Biecek. Last updated 1 month ago.
black-box, dalex, data-science, explainable-ai, explainable-artificial-intelligence, explainable-ml, explanations, explanatory-model-analysis, fairness, iml, interpretability, interpretable-machine-learning, machine-learning, model-visualization, predictive-modeling, responsible-ai, responsible-ml, xai
25.8 match 1.4k stars 13.40 score 876 scripts 21 dependents
dandls
counterfactuals:Counterfactual Explanations
Modular and unified R6-based interface for counterfactual explanation methods. The following methods are currently implemented: Brughmans et al. (2022) <doi:10.48550/arXiv.2104.07411>, Dandl et al. (2020) <doi:10.1007/978-3-030-58112-1_31> and Wexler et al. (2019) <doi:10.1109/TVCG.2019.2934619>. Optional extensions allow these methods to be applied to a variety of models and use cases. Once generated, the counterfactuals can be analyzed and visualized with the provided functionality.
Maintained by Susanne Dandl. Last updated 5 months ago.
interpretable-machine-learning, local-explanations, model-agnostic-explanations
30.5 match 21 stars 7.14 score 22 scripts
thomasp85
lime:Local Interpretable Model-Agnostic Explanations
When building complex models, it is often difficult to explain why the model should be trusted. While global measures such as accuracy are useful, they cannot be used for explaining why a model made a specific prediction. 'lime' (a port of the 'lime' 'Python' package) is a method for explaining the outcome of black box models by fitting a local model around the point in question and perturbations of this point. The approach is described in more detail in the article by Ribeiro et al. (2016) <arXiv:1602.04938>.
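In the R port, the workflow has two steps: build an explainer from the training data and model, then explain individual predictions with a local surrogate. A minimal sketch, assuming a caret random forest on iris (model and data are illustrative assumptions):

library(caret)
library(lime)

# Train any supported classifier (caret models work out of the box)
model <- train(Species ~ ., data = iris, method = "rf")

# Step 1: build the explainer from the training data and the model
explainer <- lime(iris[, -5], model)

# Step 2: explain individual predictions with local surrogate models
explanation <- explain(iris[1:2, -5], explainer,
                       n_labels = 1, n_features = 3)
plot_features(explanation)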
Maintained by Emil Hvitfeldt. Last updated 3 years ago.
caret, model-checking, model-evaluation, modeling, cpp
17.6 match 485 stars 11.07 score 732 scripts 1 dependents
modeloriented
survex:Explainable Machine Learning in Survival Analysis
Survival analysis models are commonly used in medicine and other areas. Many of them are too complex to be interpreted by humans. Exploration and explanation are needed, but standard methods do not give a broad enough picture. 'survex' provides easy-to-apply methods for explaining survival models, both complex black-boxes and simpler statistical models. They include methods specific to survival analysis such as SurvSHAP(t) introduced in Krzyzinski et al., (2023) <doi:10.1016/j.knosys.2022.110234>, SurvLIME described in Kovalev et al., (2020) <doi:10.1016/j.knosys.2020.106164> as well as extensions of existing ones described in Biecek et al., (2021) <doi:10.1201/9780429027192>.
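survex mirrors the DALEX explainer workflow for censored outcomes. A minimal sketch, assuming a Cox model on the veteran data from the 'survival' package (dataset and formula are illustrative assumptions):

library(survival)
library(survex)

# Cox model; model = TRUE and x = TRUE keep what the explainer needs
cox <- coxph(Surv(time, status) ~ ., data = veteran,
             model = TRUE, x = TRUE)

# Wrap the survival model in an explainer
exp_cox <- explain(cox)

# Dataset-level: time-dependent variable importance
plot(model_parts(exp_cox))

# Instance-level: SurvSHAP(t) for a single observation
plot(predict_parts(exp_cox, new_observation = veteran[1, ],
                   type = "survshap"))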
Maintained by Mikołaj Spytek. Last updated 9 months ago.
biostatistics, brier-scores, censored-data, cox-model, cox-regression, explainable-ai, explainable-machine-learning, explainable-ml, explanatory-model-analysis, interpretable-machine-learning, interpretable-ml, machine-learning, probabilistic-machine-learning, shap, survival-analysis, time-to-event, variable-importance, xai
16.9 match 110 stars 8.40 score 114 scripts
norskregnesentral
shapr:Prediction Explanation with Dependence-Aware Shapley Values
Complex machine learning models are often hard to interpret. However, in many situations it is crucial to understand and explain why a model made a specific prediction. Shapley values are the only prediction explanation framework with a solid theoretical foundation. Previously known methods for estimating Shapley values do, however, assume feature independence. This package implements methods that account for any feature dependence and thereby produce more accurate estimates of the true Shapley values. An accompanying 'Python' wrapper ('shaprpy') is available through the GitHub repository.
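The core of the package is a single explain() call that takes the model, the observations to explain, the training data, a dependence assumption, and a reference prediction. Argument names have changed between shapr releases, so treat the following as a hedged sketch to check against the current documentation (the xgboost model and the phi0 argument are assumptions on my part):

library(shapr)
library(xgboost)

x_train   <- as.matrix(mtcars[-(1:6), -1])
y_train   <- mtcars$mpg[-(1:6)]
x_explain <- as.matrix(mtcars[1:6, -1])

model <- xgboost(data = x_train, label = y_train, nrounds = 20, verbose = 0)

# Dependence-aware Shapley values with a Gaussian assumption on the features;
# phi0 is the reference prediction (here the mean training response)
explanation <- explain(model,
                       x_explain = x_explain,
                       x_train   = x_train,
                       approach  = "gaussian",
                       phi0      = mean(y_train))
print(explanation)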
Maintained by Martin Jullum. Last updated 1 month ago.
explainable-ai, explainable-ml, rcpp, rcpparmadillo, shapley, openblas, cpp, openmp
12.9 match 153 stars 10.62 score 175 scripts 1 dependents
cran
datarobot:'DataRobot' Predictive Modeling API
For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>.
Maintained by AJ Alon. Last updated 1 year ago.
31.0 match 2 stars 3.48 score
modeloriented
iBreakDown:Model Agnostic Instance Level Variable Attributions
Model agnostic tool for decomposition of predictions from black boxes. Supports additive attributions and attributions with interactions. The Break Down Table shows contributions of every variable to a final prediction. The Break Down Plot presents variable contributions in a concise graphical way. This package works for classification and regression models. It is an extension of the 'breakDown' package (Staniak and Biecek 2018) <doi:10.32614/RJ-2018-072>, with new and faster strategies for orderings. It supports interactions in explanations and has interactive visuals (implemented with the 'D3.js' library). The methodology behind it is described in the 'iBreakDown' article (Gosiewska and Biecek 2019) <arXiv:1903.11420>. This package is a part of the 'DrWhy.AI' universe (Biecek 2018) <arXiv:1806.08915>.
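iBreakDown operates on a DALEX explainer: break_down() computes the attribution table and plot() draws the Break Down plot. A minimal sketch, assuming a glm on DALEX's titanic_imputed data (an illustrative choice):

library(DALEX)
library(iBreakDown)

model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
explainer <- explain(model,
                     data = titanic_imputed[, -8],
                     y = titanic_imputed$survived)

# Additive attributions for a single prediction
bd <- break_down(explainer, new_observation = titanic_imputed[1, -8])
plot(bd)

# Attributions including pairwise interactions
bd_i <- break_down(explainer, titanic_imputed[1, -8], interactions = TRUE)
plot(bd_i)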
Maintained by Przemyslaw Biecek. Last updated 1 year ago.
breakdown, iml, interpretability, shapley, xai
10.2 match 84 stars 10.07 score 56 scripts 22 dependents
modeloriented
ingredients:Effects and Importances of Model Ingredients
Collection of tools for assessment of feature importance and feature effects. Key functions are: feature_importance() for assessment of global level feature importance, ceteris_paribus() for calculation of the what-if plots, partial_dependence() for partial dependence plots, conditional_dependence() for conditional dependence plots, accumulated_dependence() for accumulated local effects plots, aggregate_profiles() and cluster_profiles() for aggregation of ceteris paribus profiles, generic print() and plot() for better usability of selected explainers, generic plotD3() for interactive, D3 based explanations, and generic describe() for explanations in natural language. The package 'ingredients' is a part of the 'DrWhy.AI' universe (Biecek 2018) <arXiv:1806.08915>.
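The functions listed above all operate on a DALEX explainer. A minimal sketch of the dataset-level and instance-level profiles (the glm and titanic_imputed data are illustrative assumptions):

library(DALEX)
library(ingredients)

model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
explainer <- explain(model,
                     data = titanic_imputed[, -8],
                     y = titanic_imputed$survived)

# Global feature importance and partial dependence
plot(feature_importance(explainer))
plot(partial_dependence(explainer, variables = "age"))

# What-if (ceteris paribus) profile for a single passenger
cp <- ceteris_paribus(explainer, new_observation = titanic_imputed[1, -8])
plot(cp, variables = "age")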
Maintained by Przemyslaw Biecek. Last updated 2 years ago.
7.2 match 37 stars 10.38 score 83 scripts 22 dependents
rvlenth
emmeans:Estimated Marginal Means, aka Least-Squares Means
Obtain estimated marginal means (EMMs) for many linear, generalized linear, and mixed models. Compute contrasts or linear functions of EMMs, trends, and comparisons of slopes. Plots and other displays. Least-squares means are discussed, and the term "estimated marginal means" is suggested, in Searle, Speed, and Milliken (1980) Population marginal means in the linear model: An alternative to least squares means, The American Statistician 34(4), 216-221 <doi:10.1080/00031305.1980.10483031>.
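A minimal sketch of the usual workflow: fit a model, ask for EMMs over a factor, then form contrasts. It uses the oranges data set shipped with emmeans; the particular model is an illustrative assumption:

library(emmeans)

# Model with a covariate and a factor
fit <- lm(sales1 ~ price1 + store, data = oranges)

# Estimated marginal means for each store, adjusted for the covariate
emm <- emmeans(fit, ~ store)
emm

# Pairwise comparisons of the EMMs (Tukey-adjusted)
pairs(emm)

# Or other linear functions of the EMMs, e.g. polynomial contrasts
contrast(emm, method = "poly")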
Maintained by Russell V. Lenth. Last updated 2 days ago.
3.8 match 377 stars 19.19 score 13k scripts 187 dependents
modeloriented
auditor:Model Audit - Verification, Validation, and Error Analysis
Provides an easy-to-use unified interface for creating validation plots for any model. The 'auditor' helps to avoid repetitive work consisting of writing code needed to create residual plots. These visualizations allow assessing and comparing the goodness of fit, performance, and similarity of models.
Maintained by Alicja Gosiewska. Last updated 1 year ago.
classification, error-analysis, explainable-artificial-intelligence, machine-learning, model-validation, regression-models, residuals, xai
7.6 match 58 stars 8.76 score 94 scripts 2 dependents
pbiecek
breakDown:Model Agnostic Explainers for Individual Predictions
Model agnostic tool for decomposition of predictions from black boxes. Break Down Table shows contributions of every variable to a final prediction. Break Down Plot presents variable contributions in a concise graphical way. This package works for binary classifiers and general regression models.
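The main entry point is broken(), which decomposes a single prediction into per-variable contributions. A minimal sketch with a linear model on mtcars (an illustrative choice of data and model):

library(breakDown)

# Any regression model works; a linear model keeps the example small
model <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)

# Decompose the prediction for one car into variable contributions
explanation <- broken(model, new_observation = mtcars[1, ])
explanation
plot(explanation)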
Maintained by Przemyslaw Biecek. Last updated 1 year ago.
data-science, iml, interpretability, machine-learning, visual-explanations, xai
7.5 match 103 stars 8.90 score 91 scripts 2 dependents
pbiecek
ceterisParibus:Ceteris Paribus Profiles
Ceteris Paribus Profiles (What-If Plots) are designed to present model responses around selected points in a feature space, for example around a single prediction for an interesting observation. Plots are designed to work in a model-agnostic fashion; they work for any predictive machine learning model and allow for model comparisons. Ceteris Paribus Plots supplement the Break Down Plots from the 'breakDown' package.
Maintained by Przemyslaw Biecek. Last updated 5 years ago.
11.8 match 42 stars 5.48 score 36 scripts
modeloriented
live:Local Interpretable (Model-Agnostic) Visual Explanations
Interpretability of complex machine learning models is a growing concern. This package helps to understand key factors that drive the decision made by a complicated predictive model (a so-called black-box model). This is achieved through local approximations that are based on either an additive regression-like model or a CART-like model that allows for higher-order interactions. The methodology is based on Tulio Ribeiro, Singh, Guestrin (2016) <doi:10.1145/2939672.2939778>. More details can be found in Staniak, Biecek (2018) <doi:10.32614/RJ-2018-072>.
Maintained by Mateusz Staniak. Last updated 6 years ago.
iml, interpretability, lime, machine-learning, model-visualization, visual-explanations, xai
10.4 match 35 stars 5.59 score 55 scripts
modeloriented
localModel:LIME-Based Explanations with Interpretable Inputs Based on Ceteris Paribus Profiles
Local explanations of machine learning models describe how features contributed to a single prediction. This package implements an explanation method based on LIME (Local Interpretable Model-agnostic Explanations, see Tulio Ribeiro, Singh, Guestrin (2016) <doi:10.1145/2939672.2939778>) in which interpretable inputs are created based on local rather than global behaviour of each original feature.
Maintained by Przemyslaw Biecek. Last updated 3 years ago.
8.4 match 14 stars 6.16 score 23 scripts
bips-hb
innsight:Get the Insights of Your Neural Network
Interpretation methods for analyzing the behavior and individual predictions of modern neural networks in a three-step procedure: Converting the model, running the interpretation method, and visualizing the results. Implemented methods are, e.g., 'Connection Weights' described by Olden et al. (2004) <doi:10.1016/j.ecolmodel.2004.03.013>, layer-wise relevance propagation ('LRP') described by Bach et al. (2015) <doi:10.1371/journal.pone.0130140>, deep learning important features ('DeepLIFT') described by Shrikumar et al. (2017) <doi:10.48550/arXiv.1704.02685> and gradient-based methods like 'SmoothGrad' described by Smilkov et al. (2017) <doi:10.48550/arXiv.1706.03825>, 'Gradient x Input' or 'Vanilla Gradient'. Details can be found in the accompanying scientific paper: Koenen & Wright (2024, Journal of Statistical Software, <doi:10.18637/jss.v111.i08>).
Maintained by Niklas Koenen. Last updated 4 months ago.
7.3 match 30 stars 7.01 score 57 scripts
bertcarnell
tornado:Plots for Model Sensitivity and Variable Importance
Draws tornado plots for model sensitivity to univariate changes. Implements methods for many modeling methods including linear models, generalized linear models, survival regression models, and arbitrary machine learning models in the caret package. Also draws variable importance plots.
Maintained by Rob Carnell. Last updated 7 months ago.
explanability, regression, sensitivity-analysis
10.0 match 7 stars 4.85 score 4 scripts
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
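A minimal sketch of starting a local cluster and running AutoML; the data set, split, and column name are illustrative assumptions:

library(h2o)
h2o.init()  # start or connect to a local H2O cluster

# Move an R data frame into H2O and split it
hf <- as.h2o(iris)
splits <- h2o.splitFrame(hf, ratios = 0.8, seed = 1)

# Fully automatic machine learning on the training split
aml <- h2o.automl(y = "Species",
                  training_frame = splits[[1]],
                  max_models = 10, seed = 1)

# Leaderboard and predictions from the best model
aml@leaderboard
h2o.predict(aml@leader, splits[[2]])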
Maintained by Tomas Fryda. Last updated 1 year ago.
5.8 match 3 stars 8.20 score 7.8k scripts 11 dependents
dietrichson
ProPublicaR:Access Functions for ProPublica's APIs
Provides wrapper functions to access the ProPublica's Congress and Campaign Finance APIs. The Congress API provides near real-time access to legislative data from the House of Representatives, the Senate and the Library of Congress. The Campaign Finance API provides data from United States Federal Election Commission filings and other sources. The API covers summary information for candidates and committees, as well as certain types of itemized data. For more information about these APIs go to: <https://www.propublica.org/datastore/apis>.
Maintained by Aleksander Dietrichson. Last updated 2 years ago.
10.6 match 12 stars 4.38 score 1 scripts
modeloriented
triplot:Explaining Correlated Features in Machine Learning Models
Tools for exploring effects of correlated features in predictive models. The predict_triplot() function delivers instance-level explanations that calculate the importance of groups of explanatory variables. The model_triplot() function delivers data-level explanations. The generic plot function visualises the importance of hierarchical groups of predictors in a concise way. All of the tools are model agnostic and therefore work for any predictive machine learning model. Find more details in Biecek (2018) <arXiv:1806.08915>.
Maintained by Katarzyna Pekala. Last updated 4 years ago.
explanations, explanatory-model-analysis, machine-learning, model-visualization, xai
10.8 match 9 stars 3.65 score 7 scripts
haghish
shapley:Weighted Mean SHAP and CI for Robust Feature Selection in ML Grid
This R package introduces Weighted Mean SHapley Additive exPlanations (WMSHAP), a method for calculating SHAP values for a grid of fine-tuned base-learner machine learning models as well as stacked ensembles, something not previously available due to the common reliance on a single best-performing model. By integrating the weighted mean SHAP values from the base-learners in an ensemble or tuning grid, the package weights SHAP contributions according to each model's performance, assessed by R squared (for both regression and classification models); alternatively, it can weight SHAP values by the area under the precision-recall curve (AUCPR), the area under the curve (AUC), or the F2 measure for binary classifiers. It further provides weighted confidence intervals for the weighted mean SHAP values, offering a more comprehensive and robust feature importance evaluation over a grid of machine learning models instead of computing SHAP values for the best model only. This is particularly beneficial under severe class imbalance (class rarity), where there is no universal criterion for identifying a single best model: it provides a transparent, generalized measure of feature importance and mitigates the risk of reporting SHAP values from an overfitted or biased model. The package also implements hypothesis testing for the statistical significance of SHAP values of individual features, as well as comparative significance testing of SHAP contributions between features. In addition, it addresses a gap in the feature selection literature by providing criteria for automatic selection of the most important features across a grid of models or stacked ensembles, eliminating the need to arbitrarily choose the number of top features. Finally, the package implements novel visualizations of SHAP values at both the subject and the feature level, as well as a plot for feature selection based on weighted mean SHAP ratios.
Maintained by E. F. Haghish. Last updated 2 days ago.
class-imbalance, class-imbalance-problem, feature-extraction, feature-importance, feature-selection, machine-learning, machine-learning-algorithms, shap, shap-analysis, shap-values, shapely, shapley-additive-explanations, shapley-decomposition, shapley-value, shapley-values, shapleyvalue, weighted-shap, weighted-shap-confidence-interval, weighted-shapley, weighted-shapley-ci
7.2 match 14 stars 5.19 score 17 scripts
yihui
knitr:A General-Purpose Package for Dynamic Report Generation in R
Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.
Maintained by Yihui Xie. Last updated 1 day ago.
dynamic-documents, knitr, literate-programming, rmarkdown, sweave
1.3 match 2.4k stars 23.62 score 116k scripts 4.2k dependents
modeloriented
arenar:Arena for the Exploration and Comparison of any ML Models
Generates data for challenging machine learning models in 'Arena' <https://arena.drwhy.ai> - an interactive web application. You can start the server with XAI (Explainable Artificial Intelligence) plots generated on demand, or precalculate and auto-upload a data file alongside a shareable 'Arena' URL.
Maintained by Piotr Piątyszek. Last updated 4 years ago.
axplainable-artificial-intelligence, ema, explainability, explanatory-model-analysis, iml, interactive-xai, interpretability, xai
4.7 match 31 stars 5.94 score 14 scripts
stscl
gdverse:Analysis of Spatial Stratified Heterogeneity
Analyzing spatial factors and exploring spatial associations based on the concept of spatial stratified heterogeneity, while also taking into account local spatial dependencies, spatial interpretability, complex spatial interactions, and robust spatial stratification. Additionally, it supports the spatial stratified heterogeneity family established in academic literature.
Maintained by Wenbo Lv. Last updated 1 day ago.
geographical-detector, geoinformatics, geospatial-analysis, spatial-statistics, spatial-stratified-heterogeneity, cpp
3.0 match 32 stars 9.07 score 41 scripts 2 dependents
nacnudus
unpivotr:Unpivot Complex and Irregular Data Layouts
Tools for converting data from complex or irregular layouts to a columnar structure. For example, tables with multilevel column or row headers, or spreadsheets. Header and data cells are selected by their contents and position, as well as formatting and comments where available, and are associated with one another by their proximity in given directions. Functions for data frames and HTML tables are provided.
Maintained by Duncan Garmonsway. Last updated 1 month ago.
2.5 match 186 stars 10.35 score 368 scripts 3 dependents
bgreenwell
fastshap:Fast Approximate Shapley Values
Computes fast (relative to other implementations) approximate Shapley values for any supervised learning model. Shapley values help to explain the predictions from any black box model using ideas from game theory; see Strumbel and Kononenko (2014) <doi:10.1007/s10115-013-0679-x> for details.
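fastshap only needs a prediction wrapper that returns a numeric vector, so it works with any supervised model. A minimal sketch, assuming a ranger model on mtcars (model, wrapper name, and data are illustrative):

library(fastshap)
library(ranger)

X <- subset(mtcars, select = -mpg)
fit <- ranger(mpg ~ ., data = mtcars)

# Prediction wrapper: must return a numeric vector of predictions
pfun <- function(object, newdata) predict(object, data = newdata)$predictions

# Approximate Shapley values via Monte Carlo sampling (nsim replicates)
shap <- explain(fit, X = X, pred_wrapper = pfun, nsim = 50)
head(shap)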
Maintained by Brandon Greenwell. Last updated 1 year ago.
explainable-ai, explainable-ml, interpretable-machine-learning, shapley, shapley-values, variable-importance, xai, cpp
3.0 match 118 stars 8.56 score 155 scripts 2 dependents
plantedml
glex:Global Explanations for Tree-Based Models
Global explanations for tree-based models by decomposing regression or classification functions into the sum of main components and interaction components of arbitrary order. Calculates SHAP values and q-interaction SHAP for all values of q for tree-based models such as xgboost.
Maintained by Marvin N. Wright. Last updated 3 days ago.
5.3 match 5 stars 4.75 score 15 scripts
nspyrison
cheem:Interactively Explore Local Explanations with the Radial Tour
Given a non-linear model, calculate the local explanation. We propose viewing the data space, explanation space, and model residuals as an ensemble of interactive graphics in a shiny application. After an observation of interest is identified, the normalized variable importance of the local explanation is used as a 1D projection basis. The support of the local explanation is then explored by changing the basis with the use of the radial tour <doi:10.32614/RJ-2020-027>; <doi:10.1080/10618600.1997.10474754>.
Maintained by Nicholas Spyrison. Last updated 1 year ago.
5.3 match 2 stars 4.73 score 54 scripts
giuseppec
iml:Interpretable Machine Learning
Interpretability methods to analyze the behavior and predictions of any machine learning model. Implemented methods are: Feature importance described by Fisher et al. (2018) <doi:10.48550/arxiv.1801.01489>, accumulated local effects plots described by Apley (2018) <doi:10.48550/arxiv.1612.08468>, partial dependence plots described by Friedman (2001) <www.jstor.org/stable/2699986>, individual conditional expectation ('ice') plots described by Goldstein et al. (2013) <doi:10.1080/10618600.2014.907095>, local models (variant of 'lime') described by Ribeiro et. al (2016) <doi:10.48550/arXiv.1602.04938>, the Shapley Value described by Strumbelj et. al (2014) <doi:10.1007/s10115-013-0679-x>, feature interactions described by Friedman et. al <doi:10.1214/07-AOAS148> and tree surrogate models.
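iml wraps a fitted model in an R6 Predictor object and exposes each method as its own class. A minimal sketch, assuming a random forest on the Boston data from 'MASS' (an illustrative choice):

library(iml)
library(randomForest)

data("Boston", package = "MASS")
rf <- randomForest(medv ~ ., data = Boston, ntree = 100)

# Wrap model and data in a Predictor object
predictor <- Predictor$new(rf, data = Boston[, -14], y = Boston$medv)

# Permutation feature importance
imp <- FeatureImp$new(predictor, loss = "mae")
plot(imp)

# Accumulated local effects for one feature
ale <- FeatureEffect$new(predictor, feature = "lstat", method = "ale")
plot(ale)

# Shapley values for a single observation
shap <- Shapley$new(predictor, x.interest = Boston[1, -14])
plot(shap)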
Maintained by Giuseppe Casalicchio. Last updated 20 days ago.
1.9 match 494 stars 12.86 score 642 scripts 4 dependents
msberends
AMR:Antimicrobial Resistance Data Analysis
Functions to simplify and standardise antimicrobial resistance (AMR) data analysis and to work with microbial and antimicrobial properties by using evidence-based methods, as described in <doi:10.18637/jss.v104.i03>.
Maintained by Matthijs S. Berends. Last updated 6 hours ago.
amr, antimicrobial-data, epidemiology, microbiology, software
2.0 match 92 stars 11.87 score 182 scripts 6 dependents
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information; freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offer a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 28 days ago.
digital-soil-mapping, ncss-tech, nrcs, pedology, pedometrics, soil, soil-survey, usda
2.0 match 55 stars 11.77 score 1.2k scripts 2 dependents
g-rho
xgrove:Explanation Groves
Compute surrogate explanation groves for predictive machine learning models and analyze complexity vs. explanatory power of an explanation according to Szepannek, G. and von Holt, B. (2023) <doi:10.1007/s41237-023-00205-2>.
Maintained by Gero Szepannek. Last updated 2 months ago.
6.8 match 3.40 score 1 scripts
ropensci
allodb:Tree Biomass Estimation at Extra-Tropical Forest Plots
Standardize and simplify the tree biomass estimation process across globally distributed extratropical forests.
Maintained by Erika Gonzalez-Akre. Last updated 9 days ago.
3.8 match 38 stars 5.94 score 38 scripts
bioc
SingleCellExperiment:S4 Classes for Single Cell Data
Defines an S4 class for storing data from single-cell experiments. This includes specialized methods to store and retrieve spike-in information, dimensionality reduction coordinates and size factors for each cell, along with the usual metadata for genes and libraries.
Maintained by Davide Risso. Last updated 8 days ago.
immunooncology, datarepresentation, dataimport, infrastructure, singlecell
1.5 match 13.53 score 15k scripts 285 dependents
bimsbbioinfo
mergen:AI-Driven Code Generation, Explanation and Execution for Data Analysis
Employing artificial intelligence to convert data analysis questions into executable code, explanations, and algorithms. The self-correction feature ensures the generated code is optimized for performance and accuracy. 'mergen' features a user-friendly chat interface, enabling users to interact with the AI agent and extract valuable insights from their data effortlessly.
Maintained by Altuna Akalin. Last updated 6 months ago.
3.3 match 17 stars 6.01 score 3 scripts 1 dependents
techtonique
learningmachine:Machine Learning with Explanations and Uncertainty Quantification
Regression-based Machine Learning with explanations and uncertainty quantification.
Maintained by T. Moudiki. Last updated 4 months ago.
conformal-prediction, machine-learning, machine-learning-algorithms, machinelearning, statistical-learning, uncertainty-quantification, cpp
3.6 match 5 stars 5.57 score 21 scripts
bioc
scFeatures:scFeatures: Multi-view representations of single-cell and spatial data for disease outcome prediction
scFeatures constructs multi-view representations of single-cell and spatial data through the construction of a total of 17 feature types. These features can then be used for a variety of analyses using other software in Bioconductor.
Maintained by Yue Cao. Last updated 5 months ago.
cellbasedassays, singlecell, spatial, software, transcriptomics
3.1 match 10 stars 5.95 score 15 scripts
modeloriented
modelStudio:Interactive Studio for Explanatory Model Analysis
Automate the explanatory analysis of machine learning predictive models. Generate advanced interactive model explanations in the form of a serverless HTML site with only one line of code. This tool is model-agnostic, therefore compatible with most of the black-box predictive models and frameworks. The main function computes various (instance and model-level) explanations and produces a customisable dashboard, which consists of multiple panels for plots with their short descriptions. It is possible to easily save the dashboard and share it with others. 'modelStudio' facilitates the process of Interactive Explanatory Model Analysis introduced in Baniecki et al. (2023) <doi:10.1007/s10618-023-00924-w>.
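As the description says, the dashboard is produced from a DALEX explainer with a single call. A minimal sketch (the glm and titanic_imputed data are illustrative assumptions):

library(DALEX)
library(modelStudio)

model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
explainer <- explain(model,
                     data = titanic_imputed[, -8],
                     y = titanic_imputed$survived,
                     label = "glm")

# One line: compute the explanations and open the interactive dashboard
modelStudio(explainer)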
Maintained by Hubert Baniecki. Last updated 2 years ago.
ai, explainable, explainable-ai, explainable-machine-learning, explanatory-model-analysis, human, iml, interactive, interactivity, interpretability, interpretable, interpretable-machine-learning, learning, machine, model, model-visualization, visualization, xai
2.3 match 330 stars 7.92 score 56 scripts
eriqande
rubias:Bayesian Inference from the Conditional Genetic Stock Identification Model
Implements Bayesian inference for the conditional genetic stock identification model. It allows inference of mixed fisheries and also simulation of mixtures to predict accuracy. A full description of the underlying methods is available in a recently published article in the Canadian Journal of Fisheries and Aquatic Sciences: <doi:10.1139/cjfas-2018-0016>.
Maintained by Eric C. Anderson. Last updated 1 year ago.
3.0 match 3 stars 5.90 score 89 scripts
bioc
TDbasedUFEadv:Advanced package of tensor decomposition based unsupervised feature extraction
This is an advanced version of TDbasedUFE, which is a comprehensive package to perform tensor decomposition based unsupervised feature extraction. In contrast to TDbasedUFE, which performs simple feature selection and multiomics analyses, this package provides more complicated and advanced features that are less commonly required. Only users who require these more specific features need its functionality.
Maintained by Y-h. Taguchi. Last updated 5 months ago.
geneexpression, featureextraction, methylationarray, singlecell, software, bioconductor-package, bioinformatics, tensor-decomposition
3.8 match 4.48 score 4 scripts
lleisong
itsdm:Isolation Forest-Based Presence-Only Species Distribution Modeling
Collection of R functions to do purely presence-only species distribution modeling with isolation forest (iForest) and its variations such as Extended isolation forest and SCiForest. See the details of these methods in references: Liu, F.T., Ting, K.M. and Zhou, Z.H. (2008) <doi:10.1109/ICDM.2008.17>, Hariri, S., Kind, M.C. and Brunner, R.J. (2019) <doi:10.1109/TKDE.2019.2947676>, Liu, F.T., Ting, K.M. and Zhou, Z.H. (2010) <doi:10.1007/978-3-642-15883-4_18>, Guha, S., Mishra, N., Roy, G. and Schrijvers, O. (2016) <https://proceedings.mlr.press/v48/guha16.html>, Cortes, D. (2021) <arXiv:2110.13402>. Additionally, Shapley values are used to explain model inputs and outputs. See details in references: Shapley, L.S. (1953) <doi:10.1515/9781400881970-018>, Lundberg, S.M. and Lee, S.I. (2017) <https://dl.acm.org/doi/abs/10.5555/3295222.3295230>, Molnar, C. (2020) <ISBN:978-0-244-76852-2>, Štrumbelj, E. and Kononenko, I. (2014) <doi:10.1007/s10115-013-0679-x>. itsdm also provides functions to diagnose variable response, analyze variable importance, draw spatial dependence of variables and examine variable contribution. As utilities, the package includes a few functions to download bioclimatic variables including 'WorldClim' version 2.0 (see Fick, S.E. and Hijmans, R.J. (2017) <doi:10.1002/joc.5086>) and 'CMCC-BioClimInd' (see Noce, S., Caporaso, L. and Santini, M. (2020) <doi:10.1038/s41597-020-00726-5>.
Maintained by Lei Song. Last updated 2 years ago.
isolation-forest, outlier-detection, presence-only, model, shapley-value, species-distribution-modelling
3.0 match 4 stars 5.59 score 65 scripts
modeloriented
EIX:Explain Interactions in 'XGBoost'
Structure mining from 'XGBoost' and 'LightGBM' models. Key functionalities of this package cover: visualisation of tree-based ensembles models, identification of interactions, measuring of variable importance, measuring of interaction importance, explanation of single prediction with break down plots (based on 'xgboostExplainer' and 'iBreakDown' packages). To download the 'LightGBM' use the following link: <https://github.com/Microsoft/LightGBM>. 'EIX' is a part of the 'DrWhy.AI' universe.
Maintained by Ewelina Karbowiak. Last updated 4 years ago.
2.9 match 26 stars 5.72 score 6 scripts
jsugarelli
xplain:Providing Interactive Interpretations and Explanations of Statistical Results
Allows providing live interpretations and explanations of statistical functions in R. These interpretations and explanations are shown when the explained function is called by the user. They can interact with the values of the explained function's actual results to offer relevant, meaningful insights. The 'xplain' interpretations and explanations are based on an easy-to-use XML format that allows including R code to interact with the returns of the explained function.
Maintained by Joachim Zuckarelli. Last updated 5 years ago.
3.9 match 25 stars 4.10 score 4 scripts
brry
rdwd:Select and Download Climate Data from 'DWD' (German Weather Service)
Handle climate data from the 'DWD' ('Deutscher Wetterdienst', see <https://www.dwd.de/EN/climate_environment/cdc/cdc_node_en.html> for more information). Choose observational time series from meteorological stations with 'selectDWD()'. Find raster data from radar and interpolation according to <https://bookdown.org/brry/rdwd/raster-data.html>. Download (multiple) data sets with progress bars and no re-downloads through 'dataDWD()'. Read both tabular observational data and binary gridded datasets with 'readDWD()'.
Maintained by Berry Boessenkool. Last updated 5 days ago.
2.0 match 73 stars 7.77 score 79 scripts
bart1
move2:Processing and Analysing Animal Trajectories
Tools to handle, manipulate and explore trajectory data, with an emphasis on data from tracked animals. The package is designed to support large studies with several million location records and keep track of units where possible. Data import directly from 'movebank' <https://www.movebank.org/cms/movebank-main> and files is facilitated.
Maintained by Bart Kranstauber. Last updated 1 month ago.
2.0 match 7.51 score 169 scripts 1 dependents
cran
pBrackets:Plot Brackets
Adds different kinds of brackets to a plot, including braces, chevrons, parentheses or square brackets.
Maintained by Andreas Schulz. Last updated 4 years ago.
3.3 match 4.02 score 348 scripts 1 dependents
doi-usgs
EGRET:Exploration and Graphics for RivEr Trends
Statistics and graphics for streamflow history, water quality trends, and the statistical modeling algorithm: Weighted Regressions on Time, Discharge, and Season (WRTDS).
Maintained by Laura DeCicco. Last updated 4 months ago.
usgs, water-quality, water-quality-data
1.3 match 90 stars 10.72 score 362 scripts 1 dependents
matt-dray
tamRgo:Digital Pets for R
Store a persistent digital pet on your computer and interact with it in your R console.
Maintained by Matt Dray. Last updated 2 years ago.
3.8 match 7 stars 3.54 score 4 scripts
cloudyr
googleCloudStorageR:Interface with Google Cloud Storage API
Interact with Google Cloud Storage <https://cloud.google.com/storage/> API in R. Part of the 'cloudyr' <https://cloudyr.github.io/> project.
Maintained by Mark Edmondson. Last updated 3 days ago.
api, api-client, google-cloud-storage, googleauthr
1.3 match 104 stars 10.28 score 548 scripts 1 dependents
merck
metalite.ae:Adverse Events Analysis Using 'metalite'
Analyzes adverse events in clinical trials using the 'metalite' data structure. The package simplifies the workflow to create production-ready tables, listings, and figures discussed in the adverse events analysis chapters of "R for Clinical Study Reports and Submission" by Zhang et al. (2022) <https://r4csr.org/>.
Maintained by Yujie Zhao. Last updated 1 month ago.
1.3 match 18 stars 9.45 score 31 scripts 2 dependents
cran
mpwR:Standardized Comparison of Workflows in Mass Spectrometry-Based Bottom-Up Proteomics
Useful functions to analyze proteomic workflows including number of identifications, data completeness, missed cleavages, quantitative and retention time precision etc. Various software outputs are supported such as 'ProteomeDiscoverer', 'Spectronaut', 'DIA-NN' and 'MaxQuant'.
Maintained by Oliver Kardell. Last updated 1 year ago.
3.8 match 3.30 score
bioc
amplican:Automated analysis of CRISPR experiments
`amplican` performs alignment of the amplicon reads, normalizes gathered data, calculates multiple statistics (e.g. cut rates, frameshifts) and presents results in the form of aggregated reports. Data and statistics can be broken down by experiments, barcodes, user defined groups, guides and amplicons, allowing for quick identification of potential problems.
Maintained by Eivind Valen. Last updated 5 months ago.
immunooncology, technology, alignment, qpcr, crispr, cpp
1.5 match 10 stars 7.54 score 41 scripts
epiverse-trace
cfr:Estimate Disease Severity and Case Ascertainment
Estimate the severity of a disease and ascertainment of cases, as discussed in Nishiura et al. (2009) <doi:10.1371/journal.pone.0006852>.
Maintained by Adam Kucharski. Last updated 16 days ago.
case-fatality-rate, epidemic-modelling, epidemiology, epiverse, health-outcomes, outbreak-analysis, sdg-3
1.3 match 13 stars 8.15 score 35 scripts
modeloriented
fairmodels:Flexible Tool for Bias Detection, Visualization, and Mitigation
Measure fairness metrics in one place for many models. Check how large a model's bias is towards different races, sexes, nationalities, etc. Use measures such as Statistical Parity and Equal Odds to detect discrimination against unprivileged groups. Visualize the bias using heatmaps, radar plots, biplots, bar charts (and more!). Various pre-processing and post-processing bias mitigation algorithms are implemented. The package also supports calculating fairness metrics for regression models. Find more details in Wiśniewski, Biecek (2021) <arXiv:2104.00507>.
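fairmodels also builds on DALEX explainers: fairness_check() computes the metrics for a protected attribute and the result can be plotted. A minimal sketch, assuming the german credit data bundled with the package and "male" as the privileged level (these choices are illustrative assumptions):

library(DALEX)
library(fairmodels)

data("german", package = "fairmodels")
y <- as.numeric(german$Risk) - 1

model <- glm(Risk ~ ., data = german, family = "binomial")
explainer <- explain(model,
                     data = german[, setdiff(names(german), "Risk")],
                     y = y)

# Fairness metrics with respect to the protected attribute
fobject <- fairness_check(explainer,
                          protected  = german$Sex,
                          privileged = "male")
print(fobject)
plot(fobject)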
Maintained by Jakub Wiśniewski. Last updated 1 month ago.
explain-classifiers, explainable-ml, fairness, fairness-comparison, fairness-ml, model-evaluation
1.3 match 86 stars 7.72 score 51 scripts 1 dependents
statnet
tsna:Tools for Temporal Social Network Analysis
Temporal SNA tools for continuous- and discrete-time longitudinal networks having vertex, edge, and attribute dynamics stored in the 'networkDynamic' format. This work was supported by grant R01HD68395 from the National Institute of Health.
Maintained by Skye Bender-deMoll. Last updated 1 year ago.
1.3 match 7 stars 7.65 score 93 scripts 2 dependents
bioc
StructuralVariantAnnotation:Variant annotations for structural variants
StructuralVariantAnnotation provides a framework for analysis of structural variants within the Bioconductor ecosystem. This package contains useful helper functions for dealing with structural variants in VCF format, including functions for parsing VCFs from a number of popular callers as well as functions for dealing with breakpoints involving two separate genomic loci encoded as GRanges objects.
Maintained by Daniel Cameron. Last updated 5 months ago.
dataimport, sequencing, annotation, genetics, variantannotation
1.6 match 6.26 score 102 scripts 2 dependents
occupationmeasurement
occupationMeasurement:Interactively Measure Occupations in Interviews and Beyond
Perform interactive occupation coding during interviews as described in Peycheva, D., Sakshaug, J., Calderwood, L. (2021) <doi:10.2478/jos-2021-0042> and Schierholz, M., Gensicke, M., Tschersich, N., Kreuter, F. (2018) <doi:10.1111/rssa.12297>. Generate suggestions for occupational categories based on free text input, with pre-trained machine learning models in German and a ready-to-use shiny application provided for quick and easy data collection.
Maintained by Jan Simson. Last updated 7 months ago.
1.9 match 3 stars 5.18 score 17 scripts
agroscope-ch
srppp:Read the Swiss Register of Plant Protection Products
Generate data objects from XML versions of the Swiss Register of Plant Protection Products. An online version of the register can be accessed at <https://www.psm.admin.ch/de/produkte>. There is no guarantee of correspondence of the data read in using this package with that online version, or with the original registration documents. Also, the Federal Food Safety and Veterinary Office, coordinating the authorisation of plant protection products in Switzerland, does not answer requests regarding this package.
Maintained by Johannes Ranke. Last updated 24 days ago.
1.5 match 6.31 score 20 scripts 1 dependents
rmarko
ExplainPrediction:Explanation of Predictions for Classification and Regression Models
Generates explanations for classification and regression models and visualizes them. Explanations are generated for individual predictions as well as for models as a whole. Two explanation methods are included, EXPLAIN and IME. The EXPLAIN method is fast but might miss explanations expressed redundantly in the model. The IME method is slower as it samples from all feature subsets. For the EXPLAIN method see Robnik-Sikonja and Kononenko (2008) <doi:10.1109/TKDE.2007.190734>, and the IME method is described in Strumbelj and Kononenko (2010, JMLR, vol. 11:1-18). All models in package 'CORElearn' are natively supported, for other prediction models a wrapper function is provided and illustrated for models from packages 'randomForest', 'nnet', and 'e1071'.
Maintained by Marko Robnik-Sikonja. Last updated 7 years ago.
9.5 match 1.00 score 3 scripts
jaredhuling
personalized:Estimation and Validation Methods for Subgroup Identification and Personalized Medicine
Provides functions for fitting and validation of models for subgroup identification and personalized medicine / precision medicine under the general subgroup identification framework of Chen et al. (2017) <doi:10.1111/biom.12676>. This package is intended for use for both randomized controlled trials and observational studies and is described in detail in Huling and Yu (2021) <doi:10.18637/jss.v098.i05>.
Maintained by Jared Huling. Last updated 3 years ago.
causal-inference, heterogeneity-of-treatment-effect, individualized-treatment-rules, personalized-medicine, precision-medicine, subgroup-identification, treatment-effects, treatment-scoring
1.3 match 32 stars 7.38 score 125 scripts 1 dependents
johnmackintosh
runcharter:Automatically Plot, Analyse and Revises Limits of Multiple Run Charts
Plots multiple run charts, finds successive signals of improvement, and revises medians when each signal occurs. Finds runs above, below, or on both sides of the median, and returns a plot and a data.table summarising original medians and any revisions, for all groups within the supplied data.
Maintained by John MacKintosh. Last updated 3 years ago.
chart-analysis, healthcare, nhs, nhs-r-community, quality-control, quality-improvement, quality-improvement-efforts, rdatatable
1.5 match 38 stars 6.11 score 17 scripts
bioc
ramwas:Fast Methylome-Wide Association Study Pipeline for Enrichment Platforms
A complete toolset for methylome-wide association studies (MWAS). It is specifically designed for data from enrichment based methylation assays, but can be applied to other data as well. The analysis pipeline includes seven steps: (1) scanning aligned reads from BAM files, (2) calculation of quality control measures, (3) creation of methylation score (coverage) matrix, (4) principal component analysis for capturing batch effects and detection of outliers, (5) association analysis with respect to phenotypes of interest while correcting for top PCs and known covariates, (6) annotation of significant findings, and (7) multi-marker analysis (methylation risk score) using elastic net. Additionally, RaMWAS includes tools for joint analysis of methylation and genotype data. This work is published in Bioinformatics, Shabalin et al. (2018) <doi:10.1093/bioinformatics/bty069>.
Maintained by Andrey A Shabalin. Last updated 5 months ago.
dnamethylation, sequencing, qualitycontrol, coverage, preprocessing, normalization, batcheffect, principalcomponent, differentialmethylation, visualization
1.5 match 10 stars 6.08 score 85 scripts
stevenmmortimer
rdfp:An Implementation of the 'DoubleClick for Publishers' API
Functions to interact with the 'Google DoubleClick for Publishers (DFP)' API <https://developers.google.com/ad-manager/api/start> (recently renamed to 'Google Ad Manager'). This package is automatically compiled from the API WSDL (Web Service Description Language) files to dictate how the API is structured. Theoretically, all API actions are possible using this package; however, care must be taken to format the inputs correctly and parse the outputs correctly. Please see the 'Google Ad Manager' API reference <https://developers.google.com/ad-manager/api/rel_notes> and this package's website <https://stevenmmortimer.github.io/rdfp/> for more information, documentation, and examples.
Maintained by Steven M. Mortimer. Last updated 6 years ago.
api-client, api-wrapper, dfp, dfp-api, doubleclick, doubleclick-for-publishers, google-dfp
1.3 match 16 stars 6.93 score 214 scripts
bioc
demuxSNP:scRNAseq demultiplexing using cell hashing and SNPs
This package assists in demultiplexing scRNAseq data using both cell hashing and SNP data. The SNP profile of each group is learned using high confidence assignments from the cell hashing data. Cells which cannot be assigned with high confidence from the cell hashing data are assigned to their most similar group based on their SNPs. We also provide some helper functions to optimise SNP selection, create training data and merge SNP data into the SingleCellExperiment framework.
Maintained by Michael Lynch. Last updated 5 months ago.
1.5 match 6 stars 5.60 score 22 scripts
robindenz1
CareDensity:Calculate the Care Density or Fragmented Care Density Given a Patient-Sharing Network
Given a patient-sharing network, calculate either the classic care density as proposed by Pollack et al. (2013) <doi:10.1007/s11606-012-2104-7> or the fragmented care density as proposed by Engels et al. (2024) <doi:10.1186/s12874-023-02106-0>. By utilizing the 'igraph' and 'data.table' packages, the provided functions scale well for very large graphs.
Maintained by Robin Denz. Last updated 4 months ago.
care-coordination, network-analysis, patient-care
2.0 match 1 stars 4.18 score 6 scripts
mlindsk
jti:Junction Tree Inference
Minimal and memory efficient implementation of the junction tree algorithm using the Lauritzen-Spiegelhalter scheme; S. L. Lauritzen and D. J. Spiegelhalter (1988) <https://www.jstor.org/stable/2345762?seq=1>.
Maintained by Mads Lindskou. Last updated 3 years ago.
2.3 match 1 stars 3.70 score 8 scripts
dppalomar
sparseEigen:Computation of Sparse Eigenvectors of a Matrix
Computation of sparse eigenvectors of a matrix (aka sparse PCA) with running time 2-3 orders of magnitude lower than existing methods and better final performance in terms of recovery of sparsity pattern and estimation of numerical values. Can handle covariance matrices as well as data matrices with real or complex-valued entries. Different levels of sparsity can be specified for each individual ordered eigenvector and the method is robust in parameter selection. See vignette for a detailed documentation and comparison, with several illustrative examples. The package is based on the paper: K. Benidis, Y. Sun, P. Babu, and D. P. Palomar, "Orthogonal Sparse PCA and Covariance Estimation via Procrustes Reformulation," IEEE Transactions on Signal Processing, IEEE Trans. on Signal Processing, vol. 64, no. 23, pp. 6211-6226, Dec. 2016. <doi:10.1109/TSP.2016.2605073>.
Maintained by Daniel P. Palomar. Last updated 6 years ago.
covariance-matrix, eigenvectors, pca, sparse
1.5 match 12 stars 5.42 score 22 scripts
aryanrzn
ATE.ERROR:Estimating ATE with Misclassified Outcomes and Mismeasured Covariates
Addressing measurement error in covariates and misclassification in binary outcome variables within causal inference, the 'ATE.ERROR' package implements inverse probability weighted estimation methods proposed by Shu and Yi (2017, <doi:10.1177/0962280217743777>; 2019, <doi:10.1002/sim.8073>). These methods correct errors to accurately estimate average treatment effects (ATE). The package includes two main functions: ATE.ERROR.Y() for handling misclassification in the outcome variable and ATE.ERROR.XY() for correcting both outcome misclassification and covariate measurement error. It employs logistic regression for treatment assignment and uses bootstrap sampling to calculate standard errors and confidence intervals, with simulated datasets provided for practical demonstration.
Maintained by Aryan Rezanezhad. Last updated 6 months ago.
2.0 match 3.71 score 16 scripts
gesistsa
grafzahl:Supervised Machine Learning for Textual Data Using Transformers and 'Quanteda'
Duct tape the 'quanteda' ecosystem (Benoit et al., 2018) <doi:10.21105/joss.00774> to modern Transformer-based text classification models (Wolf et al., 2020) <doi:10.18653/v1/2020.emnlp-demos.6>, in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of 'quanteda.textmodels' and provides a function to setup the 'Python' environment to use the pretrained models from 'Hugging Face' <https://huggingface.co/>. More information: <doi:10.5117/CCR2023.1.003.CHAN>.
Maintained by Chung-hong Chan. Last updated 25 days ago.
1.3 match 41 stars 5.91 score 3 scripts
emmaskarstein
inlamemi:Missing Data and Measurement Error Modelling in INLA
Facilitates fitting measurement error and missing data imputation models using integrated nested Laplace approximations, according to the method described in Skarstein, Martino and Muff (2023) <doi:10.1002/bimj.202300078>. See Skarstein and Muff (2024) <doi:10.48550/arXiv.2406.08172> for details on using the package.
Maintained by Emma Skarstein. Last updated 4 months ago.
1.2 match 5.97 score 19 scripts
tushiqi
MAnorm2:Tools for Normalizing and Comparing ChIP-seq Samples
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the premier technology for profiling genome-wide localization of chromatin-binding proteins, including transcription factors and histones with various modifications. This package provides a robust method for normalizing ChIP-seq signals across individual samples or groups of samples. It also designs a self-contained system of statistical models for calling differential ChIP-seq signals between two or more biological conditions as well as for calling hypervariable ChIP-seq signals across samples. Refer to Tu et al. (2021) <doi:10.1101/gr.262675.120> and Chen et al. (2022) <doi:10.1186/s13059-022-02627-9> for associated statistical details.
Maintained by Shiqi Tu. Last updated 2 years ago.
chip-seq, differential-analysis, empirical-bayes, winsorize-values
1.3 match 32 stars 5.48 score 19 scripts
jmzobitz
neonSoilFlux:Compute Soil Carbon Fluxes for the National Ecological Observatory Network Sites
Acquires and synthesizes soil carbon fluxes at sites located in the National Ecological Observatory Network (NEON). Provides flux estimates and associated uncertainty as well as key environmental measurements (soil water, temperature, CO2 concentration) that are used to compute soil fluxes.
Maintained by John Zobitz. Last updated 10 months ago.
1.3 match 6 stars 5.43 score 2 scripts
modeloriented
shapper:Wrapper of Python Library 'shap'
Provides SHAP explanations of machine learning models. In applied machine learning, there is a strong belief that we need to strike a balance between interpretability and accuracy. However, in the field of Interpretable Machine Learning, there are more and more new ideas for explaining black-box models. One of the best-known methods for local explanations is SHapley Additive exPlanations (SHAP), introduced by Lundberg, S., et al., (2016) <arXiv:1705.07874>. The SHAP method is used to calculate influences of variables on the particular observation. This method is based on Shapley values, a technique used in game theory. The R package 'shapper' is a port of the Python library 'shap'.
Maintained by Szymon Maksymiuk. Last updated 2 years ago.
0.9 match 58 stars 7.31 score 59 scripts
mickmioduszewski
busdater:Standard Date Calculations for Business
Get the current financial year, the start and end of the current month, and the start and end of the financial year. Allows for an offset from the date.
Maintained by Mick Mioduszewski. Last updated 3 years ago.
1.5 match 4.36 score 23 scripts
edwinkipruto
mfp2:Multivariable Fractional Polynomial Models with Extensions
Multivariable fractional polynomial algorithm simultaneously selects variables and functional forms in both generalized linear models and Cox proportional hazard models. Key references are Royston and Altman (1994) <doi:10.2307/2986270> and Royston and Sauerbrei (2008, ISBN:978-0-470-02842-1). In addition, it can model a sigmoid relationship between variable x and an outcome variable y using the approximate cumulative distribution transformation proposed by Royston (2014) <doi:10.1177/1536867X1401400206>. This feature distinguishes it from a standard fractional polynomial function, which lacks the ability to achieve such modeling.
Maintained by Edwin Kipruto. Last updated 10 months ago.
1.2 match 3 stars 5.26 score 4 scripts 2 dependents
zjwinn
HaploCatcher:A Predictive Haplotyping Package
Used for predicting a genotype’s allelic state at a specific locus/QTL/gene. This is accomplished by using both a genotype matrix and a separate file which has categorizations about loci/QTL/genes of interest for the individuals in the genotypic matrix. A training population can be created from a panel of individuals who have been previously screened for specific loci/QTL/genes, and this previous screening could be summarized into a category. Using the categorization of individuals which have been genotyped using a genome wide marker platform, a model can be trained to predict what category (haplotype) an individual belongs in based on their genetic sequence in the region associated with the locus/QTL/gene. These trained models can then be used to predict the haplotype of a locus/QTL/gene for individuals which have been genotyped with a genome wide platform yet not genotyped for the specific locus/QTL/gene. This package is based off work done by Winn et al 2021. For more specific information on this method, refer to <doi:10.1007/s00122-022-04178-w>.
Maintained by Zachary Winn. Last updated 11 months ago.
genes, genetics, locus, machine-learning, molecular-genetics, pipeline, plantbreeding
1.5 match 4.18 score 3 scripts
seokhoonj
kosis:Korean Statistical Information Service (KOSIS)
API wrapper to download statistical information from the Korean Statistical Information Service (KOSIS) <https://kosis.kr/openapi/index/index.jsp>.
Maintained by Seokhoon Joo. Last updated 6 months ago.
2.0 match 2 stars 3.04 score 11 scripts
daedaluslab
multimolang:'multimolang': Multimodal Language Analysis
Process 'OpenPose' human body keypoints for computer vision, including data structuring and user-defined linear transformations for standardization. Optionally, it includes metadata extraction from filenames in the UCLA 'NewsScape' archive.
Maintained by Brian Herreño Jiménez. Last updated 3 months ago.
1.5 match 1 stars 4.00 score 4 scripts
bioc
Uniquorn:Identification of cancer cell lines based on their weighted mutational/ variational fingerprint
'Uniquorn' enables users to identify cancer cell lines. Cancer cell line misidentification and cross-contamination represent a significant challenge for cancer researchers. Identification is vital and, in the frame of this package, is based on the locations/loci of somatic and germline mutations/variations. The input format is vcf/vcf.gz and the files have to contain a single cancer cell line sample (i.e. a single member/genotype/gt column in the vcf file).
Maintained by Raik Otto. Last updated 5 months ago.
immunooncology, statisticalmethod, wholegenome, exomeseq
1.3 match 4.30 score
bioc
shinyepico:ShinyÉPICo
ShinyÉPICo is a graphical pipeline to analyze Illumina DNA methylation arrays (450k or EPIC). It allows calculating differentially methylated positions and differentially methylated regions in a user-friendly interface. Moreover, it includes several options to export the results and obtain files to perform downstream analysis.
Maintained by Octavio Morante-Palacios. Last updated 5 months ago.
differentialmethylation, dnamethylation, microarray, preprocessing, qualitycontrol
1.1 match 5 stars 5.00 score 1 scripts
bioc
AMOUNTAIN:Active modules for multilayer weighted gene co-expression networks: a continuous optimization approach
A purely data-driven gene network, a weighted gene co-expression network (WGCN), can be constructed from expression profiles alone. Different layers in such networks may represent different time points, multiple conditions or various species. AMOUNTAIN aims to search for active modules in multi-layer WGCNs using a continuous optimization approach.
Maintained by Dong Li. Last updated 5 months ago.
geneexpression, microarray, differentialexpression, network, gsl
1.5 match 3.78 score 1 scripts 1 dependents
cran
ShapleyOutlier:Multivariate Outlier Explanations using Shapley Values and Mahalanobis Distances
Based on Shapley values to explain multivariate outlyingness and to detect and impute cellwise outliers. Includes implementations of methods described in Mayrhofer and Filzmoser (2023) <doi:10.1016/j.ecosta.2023.04.003>.
Maintained by Marcus Mayrhofer. Last updated 5 months ago.
2.8 match 2.00 score 3 scriptsbioc
mitoClone2:Clonal Population Identification in Single-Cell RNA-Seq Data using Mitochondrial and Somatic Mutations
This package primarily identifies variants in mitochondrial genomes from BAM alignment files. It filters these variants to remove RNA editing events, then estimates their evolutionary relationship (i.e. their phylogenetic tree) and groups single cells into clones. It also visualizes the mutations and provides additional genomic context.
Maintained by Benjamin Story. Last updated 5 months ago.
annotationdataimportgeneticssnpsoftwaresinglecellalignmentcurlbzip2xz-utilszlibcpp
1.3 match 1 stars 4.48 score 9 scriptsarchaeothommy
chronochrt:Creating Chronological Charts
An easy way to draw chronological charts from tables, aiming to provide an intuitive environment for anyone new to R. Includes 'ggplot2' geoms and a theme for chronological charts.
Maintained by Thomas Rose. Last updated 6 months ago.
1.3 match 4.00 score 6 scriptsmodeloriented
shapviz:SHAP Visualizations
Visualizations for SHAP (SHapley Additive exPlanations), such as waterfall plots, force plots, various types of importance plots, dependence plots, and interaction plots. These plots act on a 'shapviz' object created from a matrix of SHAP values and a corresponding feature dataset. Wrappers for the R packages 'xgboost', 'lightgbm', 'fastshap', 'shapr', 'h2o', 'treeshap', 'DALEX', and 'kernelshap' are added for convenience. By separating visualization and computation, it is possible to display factor variables in graphs, even if the SHAP values are calculated by a model that requires numerical features. The plots are inspired by those provided by the 'shap' package in Python, but there is no dependency on it.
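A minimal sketch of the typical workflow with an 'xgboost' model, assuming the built-in mtcars data; function and argument names follow the package's documented wrappers as best recalled and may differ slightly:

  library(xgboost)
  library(shapviz)
  X <- data.matrix(mtcars[, -1])                       # predict mpg from the other columns
  fit <- xgb.train(params = list(eta = 0.1),
                   data = xgb.DMatrix(X, label = mtcars$mpg), nrounds = 50)
  sv <- shapviz(fit, X_pred = X)                       # SHAP values + feature data in one object
  sv_waterfall(sv, row_id = 1)                         # local explanation for one observation
  sv_importance(sv, kind = "beeswarm")                 # global importance / summary plot
  sv_dependence(sv, v = "wt")                          # dependence plot for one feature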
Maintained by Michael Mayer. Last updated 2 months ago.
explainable-aimachine-learningshapshapley-valuevisualizationxai
0.5 match 89 stars 9.95 score 250 scriptstim-salabim
remote:Empirical Orthogonal Teleconnections in R
Empirical orthogonal teleconnections in R. 'remote' is short for 'R(-based) EMpirical Orthogonal TEleconnections'. It implements a collection of functions to facilitate empirical orthogonal teleconnection analysis. Empirical Orthogonal Teleconnections (EOTs) denote a regression based approach to decompose spatio-temporal fields into a set of independent orthogonal patterns. They are quite similar to Empirical Orthogonal Functions (EOFs) with EOTs producing less abstract results. In contrast to EOFs, which are orthogonal in both space and time, EOT analysis produces patterns that are orthogonal in either space or time.
Maintained by Tim Appelhans. Last updated 8 years ago.
1.8 match 2.79 score 100 scriptsraphaelhartmann
rtmpt:Fitting (Exponential/Diffusion) RT-MPT Models
Fit (exponential or diffusion) response-time extended multinomial processing tree (RT-MPT) models by Klauer and Kellen (2018) <doi:10.1016/j.jmp.2017.12.003> and Klauer, Hartmann, and Meyer-Grant (submitted). The RT-MPT class incorporates not only frequencies, like traditional multinomial processing tree (MPT) models, but also latencies. This enables it to estimate process completion times and encoding plus motor execution times alongside the process probabilities of traditional MPTs. 'rtmpt' is a hierarchical Bayesian framework, and posterior samples are drawn using a Metropolis-within-Gibbs sampler (for exponential RT-MPTs) or a Hamiltonian-within-Gibbs sampler (for diffusion RT-MPTs).
Maintained by Raphael Hartmann. Last updated 3 months ago.
1.3 match 4.00 score 4 scriptsliuyanguu
SHAPforxgboost:SHAP Plots for 'XGBoost'
Aid in visual data investigations using SHAP (SHapley Additive exPlanation) visualization plots for 'XGBoost' and 'LightGBM'. It provides summary plot, dependence plot, interaction plot, and force plot and relies on the SHAP implementation provided by 'XGBoost' and 'LightGBM'. Please refer to 'slundberg/shap' for the original implementation of SHAP in 'Python'.
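A brief sketch of the intended use with an 'xgboost' model; the dataset is illustrative and the helper names are assumptions based on the package's documentation:

  library(xgboost)
  library(SHAPforxgboost)
  X <- data.matrix(mtcars[, -1])
  mod <- xgb.train(params = list(eta = 0.1),
                   data = xgb.DMatrix(X, label = mtcars$mpg), nrounds = 50)
  shap_long <- shap.prep(xgb_model = mod, X_train = X)   # SHAP values in long format
  shap.plot.summary(shap_long)                           # summary (beeswarm) plot
  shap.plot.dependence(data_long = shap_long, x = "wt")  # dependence plot for one feature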
Maintained by Yang Liu. Last updated 12 months ago.
0.5 match 110 stars 8.86 score 284 scripts 1 dependentscran
robustmatrix:Robust Matrix-Variate Parameter Estimation
Robust covariance estimation for matrix-valued data and data with Kronecker covariance structure using the Matrix Minimum Covariance Determinant (MMCD) estimators, and outlier explanation using Shapley values.
Maintained by Marcus Mayrhofer. Last updated 5 months ago.
2.2 match 2.00 score 2 scriptsbioc
SeqGate:Filtering of Lowly Expressed Features
Filtering of lowly expressed features (e.g. genes) is a common step before performing statistical analysis, but an arbitrary threshold is generally chosen. SeqGate implements a method that rationalizes this step by analyzing the distribution of counts in replicate samples. The gate is the threshold above which sequenced features can be considered confidently quantified.
Maintained by Stéphanie Rialle. Last updated 5 months ago.
differentialexpressiongeneexpressiontranscriptomicssequencingrnaseq
1.3 match 3.30 score 3 scriptsropensci
geotargets:'Targets' Extensions for Geographic Spatial Formats
Provides extensions for various geographic spatial file formats, such as shape files and rasters. Currently provides support for the 'terra' geographic spatial formats. See the vignettes for worked examples, demonstrations, and explanations of how to use the various package extensions.
Maintained by Nicholas Tierney. Last updated 3 days ago.
geospatialpipeliner-targetopiarasterreproducibilityreproducible-researchtargetsvectorworkflow
0.5 match 72 stars 6.78 scoremodeloriented
treeshap:Compute SHAP Values for Your Tree-Based Models Using the 'TreeSHAP' Algorithm
An efficient implementation of the 'TreeSHAP' algorithm introduced by Lundberg et al., (2020) <doi:10.1038/s42256-019-0138-9>. It is capable of calculating SHAP (SHapley Additive exPlanations) values for tree-based models in polynomial time. Currently supported models include 'gbm', 'randomForest', 'ranger', 'xgboost', 'lightgbm'.
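A short sketch assuming a 'ranger' model and the unify-then-explain workflow the package documents; exact function and argument names may differ:

  library(ranger)
  library(treeshap)
  x <- mtcars[, -1]                                  # predictors
  fit <- ranger(x = x, y = mtcars$mpg, num.trees = 100)
  unified <- ranger.unify(fit, x)                    # convert to the unified model representation
  shaps <- treeshap(unified, x[1:5, ])               # SHAP values for the first five observations
  plot_contribution(shaps, obs = 1)                  # break-down style plot for one observation
  plot_feature_importance(shaps)                     # mean |SHAP| feature importance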
Maintained by Mateusz Krzyzinski. Last updated 1 years ago.
explainabilityexplainable-aiexplainable-artificial-intelligenceexplanatory-model-analysisimlinterpretabilityinterpretable-machine-learningmachine-learningresponsible-mlshapshapley-valuexaicpp
0.5 match 82 stars 6.69 score 170 scriptsagnerf
BiasedUrn:Biased Urn Model Distributions
Statistical models of biased sampling in the form of univariate and multivariate noncentral hypergeometric distributions, including Wallenius' noncentral hypergeometric distribution and Fisher's noncentral hypergeometric distribution. See vignette("UrnTheory") for explanation of these distributions. Literature: Fog, A. (2008a). Calculation Methods for Wallenius' Noncentral Hypergeometric Distribution, Communications in Statistics, Simulation and Computation, 37(2) <doi:10.1080/03610910701790269>. Fog, A. (2008b). Sampling methods for Wallenius’ and Fisher’s noncentral hypergeometric distributions, Communications in Statistics—Simulation and Computation, 37(2) <doi:10.1080/03610910701790236>.
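A small example of the density, random-number, and moment functions for the two noncentral hypergeometric distributions; parameter values are arbitrary and the function names are as best recalled from the package:

  library(BiasedUrn)
  # Urn with m1 = 10 red and m2 = 20 white balls, n = 6 balls drawn, odds = 2 in favour of red
  dWNCHypergeo(x = 3, m1 = 10, m2 = 20, n = 6, odds = 2)    # Wallenius' density at x = 3 red
  dFNCHypergeo(x = 3, m1 = 10, m2 = 20, n = 6, odds = 2)    # Fisher's density at x = 3 red
  rWNCHypergeo(nran = 5, m1 = 10, m2 = 20, n = 6, odds = 2) # five random draws (Wallenius)
  meanWNCHypergeo(m1 = 10, m2 = 20, n = 6, odds = 2)        # mean of Wallenius' distribution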
Maintained by Agner Fog. Last updated 9 months ago.
0.5 match 6.58 score 54 scripts 39 dependentskylegrealis
froggeR:Enhance 'Quarto' Project Workflows and Standards
Streamlines 'Quarto' workflows by providing tools for consistent project setup and documentation. Enables portability through reusable metadata, automated project structure creation, and standardized templates. Features include enhanced project initialization, pre-formatted 'Quarto' documents, comprehensive data protection settings, custom styling, and structured documentation generation. Designed to improve efficiency and collaboration in R data science projects by reducing repetitive setup tasks while maintaining consistent formatting across multiple documents. There are many valuable resources providing in-depth explanations of customizing 'Quarto' templates and theme styling by the Posit team: <https://quarto.org/docs/output-formats/html-themes.html#customizing-themes> & <https://quarto.org/docs/output-formats/html-themes-more.html>, and at the Bootstrap community's GitHub at <https://github.com/twbs/bootstrap/blob/main/scss/_variables.scss>.
Maintained by Kyle Grealis. Last updated 2 months ago.
data-scienceproject-managementquarto
0.5 match 25 stars 6.62 score 6 scriptsahgroup
DSAIDE:Dynamical Systems Approach to Infectious Disease Epidemiology (Ecology/Evolution)
Exploration of simulation models (apps) of various infectious disease transmission dynamics scenarios. The purpose of the package is to help individuals learn about infectious disease epidemiology (ecology/evolution) from a dynamical systems perspective. All apps include explanations of the underlying models and instructions on what to do with the models.
Maintained by Andreas Handel. Last updated 1 years ago.
0.5 match 26 stars 6.30 score 22 scriptsahgroup
DSAIRM:Dynamical Systems Approach to Immune Response Modeling
Simulation models (apps) of various within-host immune response scenarios. The purpose of the package is to help individuals learn about within-host infection and immune response modeling from a dynamical systems perspective. All apps include explanations of the underlying models and instructions on what to do with the models.
Maintained by Andreas Handel. Last updated 8 months ago.
0.5 match 34 stars 6.19 score 23 scriptscran
tRnslate:Translate R Code in Source Files
Evaluate inline R code or chunks of R code in template files and replace them with their output, modifying the resulting template.
Maintained by Mario A. Martinez Araya. Last updated 4 years ago.
1.3 match 2.48 score 1 dependentsbioc
ompBAM:C++ Library for OpenMP-based multi-threaded sequential profiling of Binary Alignment Map (BAM) files
This package provides C++ header files for developers wishing to create R packages that process BAM files. ompBAM automates file access, memory management, and handling of multiple threads 'behind the scenes', so developers can focus on creating domain-specific functionality. The included vignette contains detailed documentation of this API, including quick-start instructions to create a new ompBAM-based package and a step-by-step explanation of the functionality behind the example package included within ompBAM.
Maintained by Alex Chit Hei Wong. Last updated 5 months ago.
alignmentdataimportrnaseqsoftwaresequencingtranscriptomicssinglecell
0.5 match 4 stars 5.78 score 3 scripts 1 dependentsholgstr
fmeffects:Model-Agnostic Interpretations with Forward Marginal Effects
Create local, regional, and global explanations for any machine learning model with forward marginal effects. You provide a model and data, and 'fmeffects' computes feature effects. The package is based on the theory in: C. A. Scholbeck, G. Casalicchio, C. Molnar, B. Bischl, and C. Heumann (2022) <doi:10.48550/arXiv.2201.08837>.
Maintained by Holger Löwe. Last updated 4 months ago.
0.5 match 2 stars 5.73 score 6 scriptssteve-the-bayesian
BoomSpikeSlab:MCMC for Spike and Slab Regression
Spike and slab regression with a variety of residual error distributions corresponding to Gaussian, Student T, probit, logit, SVM, and a few others. Spike and slab regression is Bayesian regression with prior distributions containing a point mass at zero. The posterior updates the amount of mass on this point, leading to a posterior distribution that is sparse in the sense that draws from it set many coefficients exactly to zero. Sampling from this posterior distribution is an elegant way to handle Bayesian variable selection and model averaging. See <DOI:10.1504/IJMMNO.2014.059942> for an explanation of the Gaussian case.
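A minimal sketch of spike-and-slab regression via lm.spike() on simulated data, with default priors; the settings are illustrative only:

  library(BoomSpikeSlab)
  set.seed(1)
  n <- 200; p <- 10
  X <- matrix(rnorm(n * p), n, p)
  y <- 2 * X[, 1] - 3 * X[, 5] + rnorm(n)            # only two truly nonzero coefficients
  dat <- data.frame(y = y, X)
  fit <- lm.spike(y ~ ., niter = 1000, data = dat)   # MCMC for the spike-and-slab posterior
  summary(fit)   # posterior inclusion probabilities and coefficient summaries
  plot(fit)      # default plot (marginal inclusion probabilities)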
Maintained by Steven L. Scott. Last updated 1 years ago.
0.5 match 6 stars 5.46 score 95 scripts 5 dependentsgorkang
BayesianReasoning:Plot Positive and Negative Predictive Values for Medical Tests
Functions to plot and help understand positive and negative predictive values (PPV and NPV), and their relationship with sensitivity, specificity, and prevalence. See Akobeng, A.K. (2007) <doi:10.1111/j.1651-2227.2006.00180.x> for a theoretical overview of the technical concepts and Navarrete et al. (2015) for a practical explanation about the importance of their understanding <doi:10.3389/fpsyg.2015.01327>.
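The underlying arithmetic as a plain base-R calculation (not the package's plotting interface): PPV and NPV follow from sensitivity, specificity, and prevalence via Bayes' theorem.

  sens <- 0.90; spec <- 0.95; prev <- 0.01
  ppv <- (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
  npv <- (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)
  round(c(PPV = ppv, NPV = npv), 3)
  # With 1% prevalence, even a 90%-sensitive / 95%-specific test yields a PPV of only about 0.15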
Maintained by Gorka Navarrete. Last updated 11 months ago.
bayesian-inferencenegative-predictive-valuepositive-predictive-value
0.5 match 8 stars 5.38 score 15 scriptsviadee
localICE:Local Individual Conditional Expectation
Local Individual Conditional Expectation ('localICE') is a local explanation approach from the field of eXplainable Artificial Intelligence (XAI). localICE is a model-agnostic XAI approach which provides three-dimensional local explanations for particular data instances. The approach is proposed in the master thesis of Martin Walter as an extension to ICE (see Reference). The three dimensions are the two features at the horizontal and vertical axes as well as the target represented by different colors. The approach is applicable for classification and regression problems to explain interactions of two features towards the target. For classification models, the number of classes can be more than two and each class is added as a different color to the plot. The given instance is added to the plot as two dotted lines according to the feature values. The localICE-package can explain features of type factor and numeric of any machine learning model. Automatically supported machine learning packages are 'mlr', 'randomForest', 'caret' or all other with an S3 predict function. For further model types from other libraries, a predict function has to be provided as an argument in order to get access to the model. Reference to the ICE approach: Alex Goldstein, Adam Kapelner, Justin Bleich, Emil Pitkin (2013) <arXiv:1309.6392>.
Maintained by Martin Walter. Last updated 5 years ago.
aiexplainable-aiggplotmachine-learningvisualization
0.8 match 7 stars 3.54 score 3 scriptsathammad
pbox:Exploring Multivariate Spaces with Probability Boxes
Advanced statistical library offering a method to encapsulate and query the probability space of a dataset effortlessly using Probability Boxes (p-boxes). Its distinctive feature lies in the ease with which users can navigate and analyze marginal, joint, and conditional probabilities while taking into account the underlying correlation structure inherent in the data using copula theory and models. A comprehensive explanation is available in the paper "pbox: Exploring Multivariate Spaces with Probability Boxes" to be published in the Journal of Statistical Software.
Maintained by Ahmed T. Hammad. Last updated 8 months ago.
climate-changecopulaenvironmental-monitoringfinancial-analysisprobabilityrisk-assessmentrisk-managementstatistics
0.5 match 2 stars 5.04 score 4 scriptsmissiegobeats
OutliersLearn:Educational Outlier Package with Common Outlier Detection Algorithms
Provides implementations of some of the most important outlier detection algorithms. Includes a tutorial mode option that shows a description of each algorithm and provides a step-by-step execution explanation of how it identifies outliers from the given data with the specified input parameters. References include the works of Azzedine Boukerche, Lining Zheng, and Omar Alfandi (2020) <doi:10.1145/3381028>, Abir Smiti (2020) <doi:10.1016/j.cosrev.2020.100306>, and Xiaogang Su, Chih-Ling Tsai (2011) <doi:10.1002/widm.19>.
Maintained by Andres Missiego Manjon. Last updated 9 months ago.
0.5 match 1 stars 4.60 score 2 scriptsashipunov
shipunov:Miscellaneous Functions from Alexey Shipunov
A collection of functions for data manipulation, plotting and statistical computing, to use separately or with the book "Visual Statistics. Use R!": Shipunov (2020) <http://ashipunov.info/shipunov/software/r/r-en.htm>. Dr Alexey Shipunov died in December 2022. Most useful functions: Bclust(), Jclust() and BootA() which bootstrap hierarchical clustering; Recode() which does multiple recoding in a fast, simple and flexible way; Misclass() which outputs confusion matrix even if classes are not concerted; Overlap() which measures group separation on any projection; Biarrows() which converts any scatterplot into biplot; and Pleiad() which is fast and flexible correlogram.
Maintained by ORPHANED. Last updated 2 years ago.
2.3 match 1.00 score 9 scriptsyanrong-stacy-song
backtestGraphics:Interactive Graphics for Portfolio Data
Creates an interactive graphics interface to visualize backtest results of different financial instruments, such as equities, futures, and credit default swaps. The package does not run backtests on the given data set but displays a graphical explanation of the backtest results. Users can look at backtest graphics for different instruments, investment strategies, and portfolios. Summary statistics of different portfolio holdings are shown in the left panel, and interactive plots of profit and loss (P&L), net market value (NMV) and gross market value (GMV) are displayed in the right panel.
Maintained by Yanrong Song. Last updated 25 days ago.
0.5 match 4.40 score 4 scriptsface-it-project
FjordLight:Available Light Within the Water Column and on the Seafloor of Arctic Fjords
Satellite data collected between 2003 and 2022, in conjunction with gridded bathymetric data (50-150 m resolution), are used to estimate the irradiance reaching the bottom of a series of representative EU Arctic fjords. An Earth System Science Data (ESSD) manuscript, Schlegel et al. (2024), provides a detailed explanation of the methodology.
Maintained by Robert W. Schlegel. Last updated 7 months ago.
0.5 match 4.30 score 6 scriptsbioc
gmoviz:Seamless visualization of complex genomic variations in GMOs and edited cell lines
Genetically modified organisms (GMOs) and cell lines are widely used models in all kinds of biological research. As part of characterising these models, DNA sequencing technology and bioinformatics analyses are used systematically to study their genomes. Large volumes of data are therefore generated, and various algorithms are applied to analyse them, which introduces the challenge of representing all findings in an informative and concise manner. `gmoviz` provides users with an easy way to visualise complex genomic editing events and to facilitate their explanation on a larger, biologically relevant scale.
Maintained by Kathleen Zeglinski. Last updated 5 months ago.
visualizationsequencinggeneticvariabilitygenomicvariationcoverage
0.5 match 4.30 score 9 scriptspiarasfahey
cryptography:Encrypts and Decrypts Text Ciphers
Encrypts and decrypts text using the Playfair, Four-Square, Scytale, Columnar Transposition and Autokey methods. Further explanation of classical cryptography methods can be found at Wikipedia (<https://en.wikipedia.org/wiki/Classical_cipher>).
Maintained by Piaras Fahey. Last updated 2 years ago.
0.5 match 2 stars 4.00 score 3 scriptslehmasve
hdflex:High-Dimensional Aggregate Density Forecasts
Provides a forecasting method that efficiently maps vast numbers of (scalar-valued) signals into an aggregate density forecast in a time-varying and computationally fast manner. The method proceeds in two steps: First, it transforms a predictive signal into a density forecast and, second, it combines the resulting candidate density forecasts into an ultimate aggregate density forecast. For a detailed explanation of the method, please refer to Adaemmer et al. (2023) <doi:10.2139/ssrn.4342487>.
Maintained by Sven Lehmann. Last updated 4 months ago.
ensemble-learningforecast-combinationforecastinghigh-dimensionalitytime-seriesopenblascppopenmp
0.5 match 3 stars 3.78 score 1 scriptstweedell
motoRneuron:Analyzing Paired Neuron Discharge Times for Time-Domain Synchronization
The temporal relationship between motor neurons can offer explanations for neural strategies. We combined functions to reduce neuron action potential discharge data and analyze it for short-term, time-domain synchronization. Moreover, motoRneuron combines most available methods for determining cross-correlation histogram peaks and most available indices for calculating synchronization into simple functions. See Nordstrom, Fuglevand, and Enoka (1992) <doi:10.1113/jphysiol.1992.sp019244> for a more thorough introduction.
Maintained by Andrew Tweedell. Last updated 6 years ago.
0.5 match 1 stars 3.74 score 11 scriptsjaviermtzrdz
tidydelta:Estimation of Standard Errors using Delta Method
Delta Method implementation to estimate standard errors with known asymptotic properties within the 'tidyverse' workflow. The Delta Method is a statistical tool that approximates an estimator’s behaviour using a Taylor Expansion. For a comprehensive explanation, please refer to Chapter 3 of van der Vaart (1998, ISBN: 9780511802256).
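A plain base-R illustration of the delta method itself (not the package's 'tidyverse' interface): the standard error of g(mean(x)) is approximately |g'(mu)| times the standard error of mean(x), shown here for g = log.

  set.seed(1)
  x <- rexp(500, rate = 2)
  xbar <- mean(x)
  se_xbar <- sd(x) / sqrt(length(x))
  se_log <- abs(1 / xbar) * se_xbar       # g'(mu) = 1/mu for g = log
  c(estimate = log(xbar), se_delta = se_log)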
Maintained by Javier Martinez-Rodriguez. Last updated 8 months ago.
0.5 match 5 stars 3.70 score 3 scriptsundocumeantit
roxyPackage:Utilities to Automate Package Builds
The intention of this package is to make packaging R code as easy as possible. 'roxyPackage' uses tools from the 'roxygen2' package to generate documentation. It also automatically generates and updates files like *-package.R, DESCRIPTION, CITATION, ChangeLog and NEWS.Rd. Building packages supports source format, as well as several binary formats (MS Windows, Mac OS X, Debian GNU/Linux) if the package contains pure R code only. The packages built are stored in a fully functional local R package repository which can be synced to a web server to share them with others. This includes the generation of browsable HTML pages similar to CRAN, with support for RSS feeds from the ChangeLog. Please read the vignette for a more detailed explanation by example.
Maintained by m.eik michalke. Last updated 8 months ago.
0.5 match 11 stars 3.74 score 1 scriptscarloshellin
LearningRlab:Statistical Learning Functions
Aids in learning statistical functions by showing, for each function, the result of the calculation and how it is obtained, that is, which equation and variables are used. Detailed explanations and interactive exercises are included for all these equations and their related variables. Together, these features help the package user improve their grasp of basic statistics.
Maintained by Carlos Javier Hellin Asensio. Last updated 2 years ago.
0.5 match 3.64 score 44 scriptscran
ciu:Contextual Importance and Utility
Implementation of the Contextual Importance and Utility (CIU) concepts for Explainable AI (XAI). A recent description of CIU can be found in e.g. Främling (2020) <arXiv:2009.13996>.
Maintained by Kary Främling. Last updated 2 years ago.
1.9 match 1.00 scoreo1iv3r
FeatureImpCluster:Feature Importance for Partitional Clustering
Implements a novel approach for measuring feature importance in k-means clustering. Importance of a feature is measured by the misclassification rate relative to the baseline cluster assignment due to a random permutation of feature values. An explanation of permutation feature importance in general can be found here: <https://christophm.github.io/interpretable-ml-book/feature-importance.html>.
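A compact base-R sketch of the idea, permutation-based misclassification relative to the baseline k-means assignment; it illustrates the concept only, not the package's own interface.

  set.seed(42)
  X <- scale(iris[, 1:4])
  km <- kmeans(X, centers = 3, nstart = 20)
  # Assign each observation to its nearest cluster centre
  assign_clusters <- function(X, centers) {
    apply(X, 1, function(row) which.min(colSums((t(centers) - row)^2)))
  }
  base <- assign_clusters(X, km$centers)
  # Importance of a feature = share of points whose assignment changes after permuting it
  sapply(colnames(X), function(j) {
    Xp <- X
    Xp[, j] <- sample(Xp[, j])
    mean(assign_clusters(Xp, km$centers) != base)
  })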
Maintained by Oliver Pfaffel. Last updated 3 years ago.
0.5 match 4 stars 3.58 score 19 scriptsandriyprotsak5
UAHDataScienceSF:Interactive Statistical Learning Functions
An educational toolkit for learning statistical concepts through interactive exploration. Provides functions for basic statistics (mean, variance, etc.) and probability distributions with step-by-step explanations and interactive learning modes. Each function can be used for simple calculations, detailed learning with explanations, or interactive practice with feedback.
Maintained by Andriy Protsak Protsak. Last updated 26 days ago.
0.8 match 2.30 scoreandriyprotsak5
UAHDataScienceUC:Learn Clustering Techniques Through Examples and Code
A comprehensive educational package combining clustering algorithms with detailed step-by-step explanations. Provides implementations of both traditional (hierarchical, k-means) and modern (Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Gaussian Mixture Models (GMM), genetic k-means) clustering methods as described in Ezugwu et. al., (2022) <doi:10.1016/j.engappai.2022.104743>. Includes educational datasets highlighting different clustering challenges, based on 'scikit-learn' examples (Pedregosa et al., 2011) <https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html>. Features detailed algorithm explanations, visualizations, and weighted distance calculations for enhanced learning.
Maintained by Andriy Protsak Protsak. Last updated 26 days ago.
0.8 match 2.30 scorecran
UAHDataScienceSF:Interactive Statistical Learning Functions
An educational toolkit for learning statistical concepts through interactive exploration. Provides functions for basic statistics (mean, variance, etc.) and probability distributions with step-by-step explanations and interactive learning modes. Each function can be used for simple calculations, detailed learning with explanations, or interactive practice with feedback.
Maintained by Andriy Protsak Protsak. Last updated 26 days ago.
0.8 match 2.00 scoreyikeshu0611
nomogramFormula:Calculate Total Points and Probabilities for Nomogram
A nomogram, which can be produced with the 'rms' package, provides a graphical explanation of a prediction process. However, it is not very easy to draw straight lines and to read points and probabilities accurately, and it is hard for users to calculate total points and probabilities for all subjects. This package provides the formula_rd() and formula_lp() functions to fit the formula of total points with raw data and linear predictors, respectively, by polynomial regression. The function points_cal() will help you calculate the total points, and prob_cal() can be used to calculate the probabilities after lrm(), cph() or psm() regression. For more complex conditions, such as interactions or restricted cubic splines, TotalPoints.rms() can be used.
Maintained by Jing Zhang. Last updated 5 years ago.
0.5 match 3.13 score 15 scripts 1 dependentsandriyprotsak5
UAHDataScienceSC:Learn Supervised Classification Methods Through Examples and Code
Supervised classification methods, which (if asked) can provide step-by-step explanations of the algorithms used, as described in PK Josephine et. al., (2021) <doi:10.59176/kjcs.v1i1.1259>; and datasets to test them on, which highlight the strengths and weaknesses of each technique.
Maintained by Andriy Protsak Protsak. Last updated 1 months ago.
0.5 match 3.00 scorecran
UAHDataScienceUC:Learn Clustering Techniques Through Examples and Code
A comprehensive educational package combining clustering algorithms with detailed step-by-step explanations. Provides implementations of both traditional (hierarchical, k-means) and modern (Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Gaussian Mixture Models (GMM), genetic k-means) clustering methods as described in Ezugwu et. al., (2022) <doi:10.1016/j.engappai.2022.104743>. Includes educational datasets highlighting different clustering challenges, based on 'scikit-learn' examples (Pedregosa et al., 2011) <https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html>. Features detailed algorithm explanations, visualizations, and weighted distance calculations for enhanced learning.
Maintained by Andriy Protsak Protsak. Last updated 26 days ago.
0.8 match 2.00 scoreandriyprotsak5
UAHDataScienceO:Educational Outlier Detection Algorithms with Step-by-Step Tutorials
Provides implementations of some of the most important outlier detection algorithms. Includes a tutorial mode option that shows a description of each algorithm and provides a step-by-step execution explanation of how it identifies outliers from the given data with the specified input parameters. References include the works of Azzedine Boukerche, Lining Zheng, and Omar Alfandi (2020) <doi:10.1145/3381028>, Abir Smiti (2020) <doi:10.1016/j.cosrev.2020.100306>, and Xiaogang Su, Chih-Ling Tsai (2011) <doi:10.1002/widm.19>.
Maintained by Andriy Protsak Protsak. Last updated 1 months ago.
0.5 match 3.00 scoreugranziol
appRiori:Code and Obtain Customized Planned Comparisons with 'appRiori'
With 'appRiori' <doi:10.1177/25152459241293110>, users upload the research variables and the app guides them to the best set of comparisons fitting the hypotheses, for both main and interaction effects. Through a graphical explanation and empirical examples on reproducible data, it is shown that it is possible to understand both the logic behind the planned comparisons and the way to interpret them when a model is tested.
Maintained by Umberto Granziol. Last updated 17 days ago.
0.5 match 2 stars 2.90 score 7 scriptscomiseng
LearnSL:Learn Supervised Classification Methods Through Examples and Code
Supervised classification methods, which (if asked) can provide step-by-step explanations of the algorithms used, as described in PK Josephine et. al., (2021) <doi:10.59176/kjcs.v1i1.1259>; and datasets to test them on, which highlight the strengths and weaknesses of each technique.
Maintained by Víctor Amador Padilla. Last updated 1 years ago.
0.5 match 2.70 score 1 scriptsediu3095
clustlearn:Learn Clustering Techniques Through Examples and Code
Clustering methods, which (if asked) can provide step-by-step explanations of the algorithms used, as described in Ezugwu et. al., (2022) <doi:10.1016/j.engappai.2022.104743>; and datasets to test them on, which highlight the strengths and weaknesses of each technique, as presented in the clustering section of 'scikit-learn' (Pedregosa et al., 2011) <https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html>.
Maintained by Eduardo Ruiz Sabajanes. Last updated 2 years ago.
0.5 match 1 stars 2.70 score 4 scriptsthijsjanzen
nodeSub:Simulate DNA Alignments Using Node Substitutions
Simulate DNA sequences for the node substitution model. In the node substitution model, substitutions accumulate additionally during a speciation event, providing a potential mechanistic explanation for substitution rate variation. This package provides tools to simulate such a process, simulate a reference process with only substitutions along the branches, and provides tools to infer phylogenies from alignments. More information can be found in Janzen (2021) <doi:10.1093/sysbio/syab085>.
Maintained by Thijs Janzen. Last updated 1 years ago.
0.5 match 1 stars 2.70 score 3 scriptsijun2018
MLEce:Asymptotic Efficient Closed-Form Estimators for Multivariate Distributions
Asymptotically efficient closed-form estimators (MLEces) are provided in this package for three multivariate distributions (gamma, Weibull and Dirichlet) whose maximum likelihood estimators (MLEs) are not in closed form. Closed-form estimators are strongly consistent and have the same asymptotic normal distribution as MLEs, but the calculation of MLEces is much faster than that of the corresponding MLEs. Further details and explanations of MLEces can be found in Jang et al. (2023) <doi:10.1111/stan.12299> and Kim et al. (2023) <doi:10.1080/03610926.2023.2179880>.
Maintained by Jun Zhao. Last updated 1 years ago.
0.5 match 2.70 scoresigbertklinke
exams.forge:Support for Compiling Examination Tasks using the 'exams' Package
The main aim is to further facilitate the creation of exercises based on the package 'exams' by Grün, B., and Zeileis, A. (2009) <doi:10.18637/jss.v029.i10>. Creating effective student exercises involves challenges such as creating appropriate data sets and ensuring access to intermediate values for accurate explanation of solutions. The functionality includes the generation of univariate and bivariate data including simple time series, functions for theoretical distributions and their approximation, statistical and mathematical calculations for tasks in basic statistics courses as well as general tasks such as string manipulation, LaTeX/HTML formatting and the editing of XML task files for 'Moodle'.
Maintained by Sigbert Klinke. Last updated 8 months ago.
0.5 match 2.70 score 1 scriptssangkyustat
EMSS:Some EM-Type Estimation Methods for the Heckman Selection Model
Some EM-type algorithms to estimate parameters of the well-known Heckman selection model are provided in the package. The algorithms are as follows: ECM (Expectation/Conditional Maximization), ECM(NR) (the Newton-Raphson method adapted to the ECM) and ECME (Expectation/Conditional Maximization Either). Since the algorithms are based on the EM algorithm, they also share EM's main advantages, namely stability and ease of implementation. Further details and explanations of the algorithms can be found in Zhao et al. (2020) <doi:10.1016/j.csda.2020.106930>.
Maintained by Sang Kyu Lee. Last updated 3 years ago.
0.5 match 2.48 score 1 scriptscran
UAHDataScienceSC:Learn Supervised Classification Methods Through Examples and Code
Supervised classification methods, which (if asked) can provide step-by-step explanations of the algorithms used, as described in PK Josephine et. al., (2021) <doi:10.59176/kjcs.v1i1.1259>; and datasets to test them on, which highlight the strengths and weaknesses of each technique.
Maintained by Andriy Protsak Protsak. Last updated 27 days ago.
0.5 match 2.00 scoreaflavus
FourgameteP:FourGamete Package
The four-gamete test is based on the infinite-sites model, which assumes that the probability of the same mutation occurring twice (recurrent or parallel mutations) and the probability of a mutation back to the original state (reverse mutations) are close to zero. Without these types of mutations, the only explanation for observing all four dilocus genotypes is recombination (Hudson and Kaplan 1985, Genetics 111:147-164). Thus, the presence of all four gametes is also called phylogenetic incompatibility.
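A tiny base-R sketch of the test's logic for one pair of biallelic loci; this shows the concept only, not this package's interface.

  # All four gametes (00, 01, 10, 11) observed => recombination is inferred
  four_gamete_test <- function(locus1, locus2) {
    length(unique(paste0(locus1, locus2))) == 4
  }
  l1 <- c(0, 0, 1, 1, 0, 1)
  l2 <- c(0, 1, 0, 1, 1, 0)
  four_gamete_test(l1, l2)  # TRUE: phylogenetic incompatibility at this locus pair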
Maintained by Milton T Drott. Last updated 7 years ago.
0.5 match 2.00 score 1 scriptscran
UAHDataScienceO:Educational Outlier Detection Algorithms with Step-by-Step Tutorials
Provides implementations of some of the most important outlier detection algorithms. Includes a tutorial mode option that shows a description of each algorithm and provides a step-by-step execution explanation of how it identifies outliers from the given data with the specified input parameters. References include the works of Azzedine Boukerche, Lining Zheng, and Omar Alfandi (2020) <doi:10.1145/3381028>, Abir Smiti (2020) <doi:10.1016/j.cosrev.2020.100306>, and Xiaogang Su, Chih-Ling Tsai (2011) <doi:10.1002/widm.19>.
Maintained by Andriy Protsak Protsak. Last updated 23 days ago.
0.5 match 2.00 scorejacqueline-98
mergenstudio:'Mergen' Studio: An 'RStudio' Addin Wrapper for the 'Mergen' Package
An 'RStudio' Addin wrapper for the 'mergen' package. This package employs artificial intelligence to convert data analysis questions into executable code, explanations, and algorithms. This package makes it easier to use Large Language Models in your development environment by providing a chat-like interface, while also allowing you to inspect and execute the returned code.
Maintained by Jacqueline Jansen. Last updated 8 months ago.
0.5 match 1.70 score 1 scriptskingqwert
GWLelast:Geographically Weighted Logistic Elastic Net Regression
Fit a geographically weighted logistic elastic net regression. Detailed explanations can be found in Yoneoka et al. (2016): New algorithm for constructing area-based index with geographical heterogeneities and variable selection: An application to gastric cancer screening <doi:10.1038/srep26582>.
Maintained by Daisuke Yoneoka. Last updated 6 years ago.
0.5 match 3 stars 1.48 score 1 scriptsdnychka
RadioSonde:Tools for Plotting Skew-T Diagrams and Wind Profiles
A collection of programs for plotting SKEW-T, log p diagrams and wind profiles for data collected by radiosondes (the typical weather balloon-borne instrument). The format of this plot, with companion lines to assess atmospheric stability, is both standard in meteorology and difficult to create from basic graphics functions; hence this package. One novel feature is being able to add several profiles to the same plot for comparison. Use "help(ExampleSonde)" for an explanation of the variables needed and how they should be named in a data frame. See <https://github.com/dnychka/Radiosonde> for the package home page.
Maintained by Doug Nychka. Last updated 3 years ago.
0.5 match 1.15 score 14 scriptscran
jage:Estimation of Developmental Age
Bayesian methods for estimating developmental age from ordinal dental data. For an explanation of the model used, see Konigsberg (2015) <doi:10.3109/03014460.2015.1045430>. For details on the conditional correlation correction, see Sgheiza (2022) <doi:10.1016/j.forsciint.2021.111135>. Dental scoring is based on Moorrees, Fanning, and Hunt (1963) <doi:10.1177/00220345630420062701>.
Maintained by Valerie Sgheiza. Last updated 1 years ago.
0.5 match 1.00 scoreadamtclark
partitionBEFsp:Methods for Calculating the Loreau & Hector 2001 BEF Partition
A collection of functions that can be used to estimate selection and complementarity effects, sensu Loreau & Hector (2001) <doi:10.1038/35083573>, even in cases where data are only available for a random subset of species (i.e. incomplete sample-level data). A full derivation and explanation of the statistical corrections used here is available in Clark et al. (2019) <doi:10.1111/2041-210X.13285>.
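A worked base-R example of the Loreau & Hector (2001) additive partition itself, for complete data; the package's functions extend this to the incomplete, sample-level case.

  # Net effect = N * mean(dRY) * mean(M) + N * cov_pop(dRY, M)
  M   <- c(100, 80, 60, 40)              # monoculture yields
  Yo  <- c(30, 25, 20, 5)                # observed species yields in mixture
  RYe <- rep(1 / length(M), length(M))   # expected relative yields
  dRY <- Yo / M - RYe
  N   <- length(M)
  complementarity <- N * mean(dRY) * mean(M)
  selection <- sum((dRY - mean(dRY)) * (M - mean(M)))   # N * population covariance
  net_effect <- sum(Yo) - sum(RYe * M)
  c(net = net_effect, complementarity = complementarity, selection = selection)
  # complementarity + selection reproduces the net effect (10 in this example)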
Maintained by Adam Clark. Last updated 6 years ago.
0.5 match 1.00 score 2 scriptscran
logitFD:Functional Principal Components Logistic Regression
Functions for fitting a functional principal components logit regression model in four different situations: ordinary and filtered functional principal components of functional predictors, included in the model either according to their explained-variability power or according to their prediction ability via stepwise methods. The proposed methods were developed in Escabias et al. (2004) <doi:10.1080/10485250310001624738> and Escabias et al. (2005) <doi:10.1016/j.csda.2005.03.011>.
Maintained by Manuel Escabias. Last updated 3 years ago.
0.5 match 1.00 score 2 scriptscran
washeR:Time Series Outlier Detection
Time series outlier detection with a non-parametric test. This is a new outlier detection methodology ('washer'): efficient and time-saving in elaboration and implementation, adaptable to general assumptions and to very short time series, and reliable and effective as it relies on a robust non-parametric test. Two approaches are available: single time series (a vector) and grouped time series (a data frame). For further information: Andrea Venturini (2011), Statistica - Universita di Bologna, Vol. 71, pp. 329-344. For an informal explanation, see R-bloggers on the web.
Maintained by Andrea Venturini. Last updated 2 years ago.
0.5 match 1.00 scoreeliebs
depcoeff:Dependency Coefficients
Functions to compute coefficients measuring the dependence of two or more than two variables. The functions can be deployed to gain information about functional dependencies of the variables with emphasis on monotone functions. The statistics describe how well one response variable can be approximated by a monotone function of other variables. In regression analysis the variable selection is an important issue. In this framework the functions could be useful tools in modeling the regression function. Detailed explanations on the subject can be found in papers Liebscher (2014) <doi:10.2478/demo-2014-0004>; Liebscher (2017) <doi:10.1515/demo-2017-0012>; Liebscher (2019, submitted).
Maintained by Eckhard Liebscher. Last updated 5 years ago.
0.5 match 1.00 score