Showing 14 of 14 results
indrajeetpatil
ggstatsplot: 'ggplot2' Based Plots with Statistical Details
Extension of 'ggplot2', 'ggstatsplot' creates graphics with details from statistical tests included in the plots themselves. It provides an easier syntax to generate information-rich plots for statistical analysis of continuous (violin plots, scatterplots, histograms, dot plots, dot-and-whisker plots) or categorical (pie and bar charts) data. Currently, it supports the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian versions of t-test/ANOVA, correlation analyses, contingency table analysis, meta-analysis, and regression analyses. References: Patil (2021) <doi:10.21105/joss.03236>.
Maintained by Indrajeet Patil. Last updated 1 month ago.
bayes-factors, datascience, dataviz, effect-size, ggplot-extension, hypothesis-testing, non-parametric-statistics, regression-models, statistical-analysis
2.1k stars 14.46 score 3.0k scripts 1 dependent
easystats
easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting
A meta-package that installs and loads a set of packages from the 'easystats' ecosystem in a single step. This collection of packages provides a unifying and consistent framework for statistical modeling, visualization, and reporting. Additionally, it provides articles targeted at instructors for teaching 'easystats', and a dashboard targeted at new R users for easily conducting statistical analysis by accessing summary results, model fit indices, and visualizations with minimal programming.
Maintained by Daniel Lüdecke. Last updated 27 days ago.
dataanalytics, datascience, easystats, hacktoberfest, models, performance-metrics, regression-models, statistics
1.1k stars 13.01 score 1.8k scripts 1 dependent
jthomasmock
gtExtras: Extending 'gt' for Beautiful HTML Tables
Provides additional functions for creating beautiful tables with 'gt'. The functions are generally wrappers around boilerplate code or add opinionated niche capabilities and helper functions.
Maintained by Thomas Mock. Last updated 12 months ago.
data-science, data-visualization, datascience, ggplot2, gt, plots, sparkline, sparkline-graphs, sparklines, tables
201 stars 11.66 score 2.4k scripts 5 dependents
majkamichal
naivebayes: High Performance Implementation of the Naive Bayes Algorithm
In this implementation of the Naive Bayes classifier, the following class-conditional distributions are available: 'Bernoulli', 'Categorical', 'Gaussian', 'Poisson', 'Multinomial', and a non-parametric representation of the class-conditional density estimated via Kernel Density Estimation. The implemented classifiers handle missing data and can take advantage of sparse data.
Maintained by Michal Majka. Last updated 2 months ago.
classification-model, datascience, machine-learning, naive-bayes
37 stars 10.47 score 1.0k scripts 6 dependents
markvanderloo
lumberjack: Track Changes in Data
A framework that allows for easy logging of changes in data. Main features: start tracking changes by adding a single line of code to an existing script. Track changes in multiple datasets, using multiple loggers. Add custom-built loggers or use loggers offered by other packages. <doi:10.18637/jss.v098.i01>.
Maintained by Mark van der Loo. Last updated 10 months ago.
daff, datascience, logging, reproducible-research
66 stars 7.13 score 68 scripts 1 dependent
nalzok
tree.interpreter: Random Forest Prediction Decomposition and Feature Importance Measure
An R re-implementation of the 'treeinterpreter' package on PyPI <https://pypi.org/project/treeinterpreter/>. Each prediction can be decomposed as 'prediction = bias + feature_1_contribution + ... + feature_n_contribution'. This decomposition is then used to calculate the Mean Decrease Impurity (MDI) and Mean Decrease Impurity using out-of-bag samples (MDI-oob) feature importance measures based on the work of Li et al. (2019) <arXiv:1906.10845>.
Maintained by Qingyao Sun. Last updated 5 years ago.
data-science, datascience, interpretability, machine-learning, random-forest, cpp
12 stars 5.78 score 6 scripts
winvector
RcppDynProg: 'Rcpp' Dynamic Programming
Dynamic Programming implemented in 'Rcpp'. Includes example partition and out of sample fitting applications. Also supplies additional custom coders for the 'vtreat' package.
Maintained by John Mount. Last updated 2 years ago.
15 stars 5.61 score 18 scripts
arbuzovv
rusquant: Quantitative Trading Framework
Collection of functions to retrieve financial data from various sources, including brokerage and exchange platforms, financial websites, and data providers. Includes functions to retrieve account information, portfolio information, and place/cancel orders from different brokers. Additionally, allows users to download historical data such as earnings, dividends, stock splits.
Maintained by Vyacheslav Arbuzov. Last updated 10 months ago.
cryptocurrency, data-science, datascience, datascraping, dataset, datasource, dividends, earnings, finam, finance, investing, investing-api, ipo, quant, quantitative-finance, splits, stocks, trading
46 stars 5.51 score 47 scripts
gagolews
genie: Fast, Robust, and Outlier Resistant Hierarchical Clustering
Includes the reference implementation of Genie - a hierarchical clustering algorithm that links two point groups in such a way that an inequity measure (namely, the Gini index) of the cluster sizes does not significantly increase above a given threshold. This method most often outperforms many other data segmentation approaches in terms of clustering quality as tested on a wide range of benchmark datasets. At the same time, Genie retains the high speed of the single linkage approach, therefore it is also suitable for analysing larger data sets. For more details see (Gagolewski et al. 2016 <DOI:10.1016/j.ins.2016.05.003>). For an even faster and more feature-rich implementation, including, amongst others, noise point detection, see the 'genieclust' package (Gagolewski, 2021 <DOI:10.1016/j.softx.2021.100722>).
Maintained by Marek Gagolewski. Last updated 3 years ago.
cluster, cluster-analysis, clustering, data-analysis, data-mining, data-science, datascience, genie, hierarchical-clustering-algorithm, machine-learning, machine-learning-algorithms, outliers, cpp, openmp
22 stars 4.55 score 16 scripts
mightymetrika
npboottprm: Nonparametric Bootstrap Test with Pooled Resampling
Addressing crucial research questions often necessitates a small sample size due to factors such as distinctive target populations, rarity of the event under study, time and cost constraints, ethical concerns, or group-level unit of analysis. Many readily available analytic methods, however, do not accommodate small sample sizes, and the choice of the best method can be unclear. The 'npboottprm' package enables the execution of nonparametric bootstrap tests with pooled resampling to help fill this gap. Grounded in the statistical methods for small sample size studies detailed in Dwivedi, Mallawaarachchi, and Alvarado (2017) <doi:10.1002/sim.7263>, the package facilitates a range of statistical tests, encompassing independent t-tests, paired t-tests, and one-way Analysis of Variance (ANOVA) F-tests. The nonparboot() function undertakes essential computations, yielding detailed outputs which include test statistics, effect sizes, confidence intervals, and bootstrap distributions. Further, 'npboottprm' incorporates an interactive 'shiny' web application, nonparboot_app(), offering intuitive, user-friendly data exploration.
Maintained by Mackson Ncube. Last updated 6 months ago.
datascience, nonparametric, statistics
1 star 4.26 score 5 scripts 2 dependents
bradlindblad
cheatsheet: Download R Cheat Sheets Locally
A simple package to grab cheat sheets and save them to your local computer.
Maintained by Brad Lindblad. Last updated 2 years ago.
11 stars 3.74 score 5 scripts
ropensci
cRegulome: Obtain and Visualize Regulome-Gene Expression Correlations in Cancer
Builds a 'SQLite' database file of pre-calculated transcription factor/microRNA-gene correlations (co-expression) in cancer from the Cistrome Cancer (Liu et al. (2011) <doi:10.1186/gb-2011-12-8-r83>) and 'miRCancerdb' (in press) databases. Provides custom classes and functions to query, tidy, and plot the correlation data.
Maintained by Mahmoud Ahmed. Last updated 5 years ago.
cancer-genomics, database, datascience, microrna, peer-reviewed, tcga-data, transcription-factors
3 stars 3.69 score 54 scripts
chris-prener
testDriveR: Teaching Data for Statistics and Data Science
Provides data sets for teaching statistics and data science courses. It includes a sample of data from John Edmund Kerrich's famous coinflip experiment. These are data that I used for statistics. The package also contains an R Markdown template with the required formatting for assignments in my former courses.
Maintained by Christopher Prener. Last updated 2 months ago.
2 stars 3.30 score 8 scripts
johnmackintosh
metallicaRt: Colour palettes based on Metallica studio album covers
Colour palettes based on Metallica studio album covers.
Maintained by John MacKintosh. Last updated 1 year ago.
colour-palettes, data-visualisation, data-visualization, datascience, ggplot2, ggplot2-themes, metallica
19 stars 2.98 score 3 scripts