Showing 26 of 26 results
paulnorthrop
anscombiser:Create Datasets with Identical Summary Statistics
Anscombe's quartet is a set of four two-variable datasets that have several common summary statistics but which have very different joint distributions. This becomes apparent when the data are plotted, which illustrates the importance of using graphical displays in statistics. This package enables the creation of datasets that have identical marginal sample means and sample variances, sample correlation, least squares regression coefficients and coefficient of determination. The user supplies an initial dataset, which is shifted, scaled and rotated in order to achieve target summary statistics. The general shape of the initial dataset is retained. The target statistics can be supplied directly or calculated based on a user-supplied dataset. The 'datasauRus' package <https://cran.r-project.org/package=datasauRus> provides further examples of datasets that have markedly different scatter plots but share many sample summary statistics.
Maintained by Paul J. Northrop. Last updated 2 years ago.
anscombe anscombes-quartet anscombesquartet
36.8 match 11 stars 4.74 score 9 scripts
r-causal
quartets:Datasets to Help Teach Statistics
In the spirit of Anscombe's quartet, this package includes datasets that demonstrate the importance of visualizing your data, the importance of not relying on statistical summary measures alone, and why additional assumptions about the data generating mechanism are needed when estimating causal effects. The package includes "Anscombe's Quartet" (Anscombe 1973) <doi:10.1080/00031305.1973.10478966>, D'Agostino McGowan & Barrett (2023) "Causal Quartet" <doi:10.1080/26939169.2023.2276446>, "Datasaurus Dozen" (Matejka & Fitzmaurice 2017), "Interaction Triptych" (Rohrer & Arslan 2021) <doi:10.1177/25152459211007368>, "Rashomon Quartet" (Biecek et al. 2023) <doi:10.48550/arXiv.2302.13356>, and Gelman "Variation and Heterogeneity Causal Quartets" (Gelman et al. 2023) <doi:10.48550/arXiv.2302.12878>.
Maintained by Lucy D'Agostino McGowan. Last updated 1 year ago.
19.8 match 42 stars 5.75 score 27 scripts
luqqe
moments:Moments, Cumulants, Skewness, Kurtosis and Related Tests
Functions to calculate: moments, Pearson's kurtosis, Geary's kurtosis and skewness; tests related to them (Anscombe-Glynn, D'Agostino, Bonett-Seier).
Maintained by Lukasz Komsta. Last updated 3 years ago.
4.7 match 2 stars 9.34 score 4.8k scripts 123 dependents
kenaho1
asbio:A Collection of Statistical Tools for Biologists
Contains functions from: Aho, K. (2014) Foundational and Applied Statistics for Biologists using R. CRC/Taylor and Francis, Boca Raton, FL, ISBN: 978-1-4398-7338-0.
Maintained by Ken Aho. Last updated 2 months ago.
4.5 match 5 stars 7.32 score 310 scripts 3 dependents
cran
binhf:Haar-Fisz Functions for Binomial Data
Binomial Haar-Fisz transforms for Gaussianization as in Nunes and Nason (2009).
Maintained by Matt Nunes. Last updated 7 years ago.
8.0 match 3.85 score 3 dependents
dcousin3
ANOPA:Analyses of Proportions using Anscombe Transform
Analyses of proportions can be performed on the Anscombe (arcsine-related) transformed data. The 'ANOPA' package can analyze proportions obtained from up to four factors. The factors can be within-subject, between-subject, or a mix of the two. The main, omnibus analysis can be followed by additive decompositions into interaction effects, main effects, simple effects, contrast effects, etc., mimicking precisely the logic of ANOVA. For that reason, we call this set of tools 'ANOPA' (Analysis of Proportion using Anscombe transform) to highlight its similarities with ANOVA. The 'ANOPA' framework also makes it easy to plot proportions along with confidence intervals. Finally, effect sizes and statistical power planning are easily obtained under this framework. One particularity: 'ANOPA' computes F statistics whose denominator has infinite degrees of freedom. See Laurencelle and Cousineau (2023) <doi:10.3389/fpsyg.2022.1045436>.
Maintained by Denis Cousineau. Last updated 2 months ago.
error-bars proportions statistical-testing statistics summary-statistics
7.2 match 1 star 3.65 score 18 scripts
mmaechler
sfsmisc:Utilities from 'Seminar fuer Statistik' ETH Zurich
Useful utilities ['goodies'] from Seminar fuer Statistik ETH Zurich, some of which were ported from S-plus in the 1990s. For graphics, have pretty (Log-scale) axes eaxis(), an enhanced Tukey-Anscombe plot, combining histogram and boxplot, 2d-residual plots, a 'tachoPlot()', pretty arrows, etc. For robustness, have a robust F test and robust range(). For system support, notably on Linux, provides 'Sys.*()' functions with more access to system and CPU information. Finally, miscellaneous utilities such as simple efficient prime numbers, integer codes, Duplicated(), toLatex.numeric() and is.whole().
Maintained by Martin Maechler. Last updated 5 months ago.
2.2 match 11 stars 10.87 score 566 scripts 119 dependents
svmiller
stevedata:Steve's Toy Data for Teaching About a Variety of Methodological, Social, and Political Topics
This is a collection of various kinds of data with broad uses for teaching. My students, and academics like me who teach the same topics I teach, should find this useful if their teaching workflow is also built around the R programming language. The applications are multiple but mostly cluster on topics of statistical methodology, international relations, and political economy.
Maintained by Steve Miller. Last updated 4 days ago.
4.0 match 8 stars 5.97 score 178 scripts
numbats
cassowaryr:Compute Scagnostics on Pairs of Numeric Variables in a Data Set
Computes a range of scatterplot diagnostics (scagnostics) on pairs of numerical variables in a data set. A range of scagnostics, including graph and association-based scagnostics described by Leland Wilkinson and Graham Wills (2008) <doi:10.1198/106186008X320465> and association-based scagnostics described by Katrin Grimm (2016, ISBN: 978-3-8439-3092-5) can be computed. Summary and plotting functions are provided.
Maintained by Harriet Mason. Last updated 12 days ago.
data-science data-visualization eda high-dimensional-data multivariate
3.5 match 3 stars 6.02 score 26 scripts 1 dependent
stephenturner
Tmisc:Turner Miscellaneous
Miscellaneous utility functions for data manipulation, data tidying, and working with gene expression data and biological sequence data.
Maintained by Stephen Turner. Last updated 11 months ago.
3.8 match 2 stars 5.44 score 174 scripts 1 dependent
ovgu-sh
desk:Didactic Econometrics Starter Kit
Written to help undergraduate as well as graduate students to get started with R for basic econometrics without the need to import specific functions and datasets from many different sources. Primarily, the package is meant to accompany the German textbook Auer, L.v., Hoffmann, S., Kranz, T. (2024, ISBN: 978-3-662-68263-0) from which the exercises cover all the topics from the textbook Auer, L.v. (2023, ISBN: 978-3-658-42699-6).
Maintained by Soenke Hoffmann. Last updated 11 months ago.
4.5 match 4.30 score 10 scripts
bioc
CSSQ:Chip-seq Signal Quantifier Pipeline
This package is designed to perform statistical analysis to identify statistically significant differentially bound regions between multiple groups of ChIP-seq datasets.
Maintained by Fan Lab at Georgia Institute of Technology. Last updated 5 months ago.
chipseq differentialpeakcalling sequencing normalization
3.5 match 4.00 score 1 script
hneth
ds4psy:Data Science for Psychologists
All datasets and functions required for the examples and exercises of the book "Data Science for Psychologists" (by Hansjoerg Neth, Konstanz University, 2023), freely available at <https://bookdown.org/hneth/ds4psy/>. The book and course introduce principles and methods of data science to students of psychology and other biological or social sciences. The 'ds4psy' package primarily provides datasets, but also functions for data generation and manipulation (e.g., of text and time data) and graphics that are used in the book and its exercises. All functions included in 'ds4psy' are designed to be explicit and instructive, rather than efficient or elegant.
Maintained by Hansjoerg Neth. Last updated 1 month ago.
data-literacy data-science education exploratory-data-analysis psychology social-sciences visualisation
1.7 match 22 stars 6.79 score 70 scripts
cseljatib
datana:Datasets and Functions to Accompany Analisis De Datos Con R
Datasets and functions to accompany the book 'Analisis de datos con el programa estadistico R: una introduccion aplicada' by Salas-Eljatib (2021, ISBN: 9789566086109). The package helps carry out data management, exploratory analyses, and model fitting.
Maintained by Christian Salas-Eljatib. Last updated 6 months ago.
6.8 match 1.30 score 1 script
r-forge
ROptEst:Optimally Robust Estimation
R infrastructure for optimally robust estimation in general smoothly parameterized models using S4 classes and methods as described in Kohl, M., Ruckdeschel, P., and Rieder, H. (2010), <doi:10.1007/s10260-010-0133-0>, and in Rieder, H., Kohl, M., and Ruckdeschel, P. (2008), <doi:10.1007/s10260-007-0047-7>.
Maintained by Matthias Kohl. Last updated 2 months ago.
2.0 match 4.26 score 50 scripts 1 dependent
sahirbhatnagar
ggmix:Variable Selection in Linear Mixed Models for SNP Data
Fit penalized multivariable linear mixed models with a single random effect to control for population structure in genetic association studies. The goal is to simultaneously fit many genetic variants at the same time, in order to select markers that are independently associated with the response. Can also handle prior annotation information, for example, rare variants, in the form of variable weights. For more information, see the website below and the accompanying paper: Bhatnagar et al., "Simultaneous SNP selection and adjustment for population structure in high dimensional prediction models", 2020, <DOI:10.1371/journal.pgen.1008766>.
Maintained by Sahir Bhatnagar. Last updated 4 years ago.
1.3 match 10 stars 5.48 score 20 scripts
ericbarba
ExpDes.pt:Experimental Designs Package (Portuguese)
Package for the analysis of experimental designs (CRD, RBD and LSD), experiments in double factorial schemes (in CRD and RBD), split-plot experiments (in CRD and RBD), experiments in double factorial schemes with an additional treatment (in CRD and RBD), experiments in triple factorial schemes (in CRD and RBD) and experiments in triple factorial schemes with an additional treatment (in CRD and RBD), performing analysis of variance and multiple comparison of means (for qualitative treatments), or fitting regression models up to the third power (for quantitative treatments); residual analysis (Ferreira, Cavalcanti and Nogueira, 2014) <doi:10.4236/am.2014.519280>.
Maintained by Eric Batista Ferreira. Last updated 3 years ago.
1.7 match 3.52 score 232 scripts
bb-diesunddas
multivariance:Measuring Multivariate Dependence Using Distance Multivariance
Distance multivariance is a measure of dependence which can be used to detect and quantify dependence of arbitrarily many random vectors. The necessary functions are implemented in this package and examples are given. It includes: distance multivariance, distance multicorrelation, dependence structure detection, tests of independence and copula versions of distance multivariance based on the Monte Carlo empirical transform. Detailed references are given in the package description; as a starting point for the theoretical background we refer to: B. Böttcher, Dependence and Dependence Structures: Estimation and Visualization Using the Unifying Concept of Distance Multivariance. Open Statistics, Vol. 1, No. 1 (2020), <doi:10.1515/stat-2020-0001>.
Maintained by Björn Böttcher. Last updated 3 years ago.
4.0 match 1 star 1.36 score 23 scripts
ericbarba
ExpDes:Experimental Designs Package
Package for analysis of simple experimental designs (CRD, RBD and LSD), experiments in double factorial schemes (in CRD and RBD), experiments in a split plot in time schemes (in CRD and RBD), experiments in double factorial schemes with an additional treatment (in CRD and RBD), experiments in triple factorial scheme (in CRD and RBD) and experiments in triple factorial schemes with an additional treatment (in CRD and RBD), performing the analysis of variance and means comparison by fitting regression models until the third power (quantitative treatments) or by a multiple comparison test, Tukey test, test of Student-Newman-Keuls (SNK), Scott-Knott, Duncan test, t test (LSD) and Bonferroni t test (protected LSD) - for qualitative treatments; residual analysis (Ferreira, Cavalcanti and Nogueira, 2014) <doi:10.4236/am.2014.519280>.
Maintained by Eric Batista Ferreira. Last updated 3 years ago.
1.8 match 1 star 2.86 score 73 scripts
daphnaharel
sur:Companion to "Statistics Using R: An Integrative Approach"
Access to the datasets and many of the functions used in "Statistics Using R: An Integrative Approach". These datasets include a subset of the National Education Longitudinal Study, the Framingham Heart Study, as well as several simulated datasets used in the examples throughout the textbook. The functions included in the package reproduce some of the functionality of 'Stata' that is not directly available in 'R'. The package also contains a tutorial on basic data frame management, including how to handle missing data.
Maintained by Daphna Harel. Last updated 5 years ago.
4.0 match 1.26 score 18 scripts
jmcurran
dafs:Data Analysis for Forensic Scientists
Data and miscellanea to support the book "Introduction to Data analysis with R for Forensic Scientists." This book was written by James Curran and published by CRC Press in 2010 (ISBN: 978-1-4200-8826-7).
Maintained by James Curran. Last updated 3 years ago.
4.5 match 1 star 1.08 score 12 scripts
chandlerxiandeyang
CleaningValidation:Cleaning Validation Functions for Pharmaceutical Cleaning Process
Provides essential Cleaning Validation functions for complying with pharmaceutical cleaning process regulatory standards. The package includes non-parametric methods to analyze drug active-ingredient residue (DAR), cleaning agent residue (CAR), and microbial colonies (Mic) for non-Poisson distributions. Additionally, Poisson methods are provided for Mic analysis when Mic data follow a Poisson distribution.
Maintained by Xiande Yang. Last updated 10 months ago.
1.8 match 2.70 score
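
The anscombiser entry at the top of these results describes shifting, scaling and rotating an initial dataset so that it reaches target summary statistics while keeping its general shape. The Python sketch below illustrates that idea for two variables using Cholesky factors of the 2x2 covariance matrices. It is an assumption-laden illustration, not the package's implementation or API: the function names `sample_stats` and `match_stats`, the parabola-shaped input and the target values are all hypothetical.

```python
import math


def sample_stats(x, y):
    """Return sample means, variances (n - 1 denominator) and correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return mx, my, sxx, syy, sxy / math.sqrt(sxx * syy)


def match_stats(x, y, target):
    """Affinely transform (x, y) so its sample statistics equal `target`.

    `target` is (mean_x, mean_y, var_x, var_y, corr) with |corr| < 1.
    The input must not be perfectly correlated.
    """
    mx, my, sxx, syy, r = sample_stats(x, y)
    sxy = r * math.sqrt(sxx * syy)
    # Cholesky factor of the input covariance matrix.
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 ** 2)
    tmx, tmy, tvx, tvy, tr = target
    # Cholesky factor of the target covariance matrix.
    t11 = math.sqrt(tvx)
    t21 = tr * math.sqrt(tvy)
    t22 = math.sqrt(tvy * (1 - tr ** 2))
    new_x, new_y = [], []
    for a, b in zip(x, y):
        # Whiten: zero mean, unit variance, zero correlation.
        z1 = (a - mx) / l11
        z2 = ((b - my) - l21 * z1) / l22
        # Re-colour with the target factor and shift to the target means.
        new_x.append(tmx + t11 * z1)
        new_y.append(tmy + t21 * z1 + t22 * z2)
    return new_x, new_y


# Hypothetical input: a parabola-shaped point cloud.
x0 = [float(i) for i in range(-5, 6)]
y0 = [v * v + 0.5 * v for v in x0]
# Illustrative target: (mean_x, mean_y, var_x, var_y, corr).
target = (9.0, 7.5, 11.0, 4.125, 0.816)

new_x, new_y = match_stats(x0, y0, target)
print(sample_stats(new_x, new_y))
```

Because the map is affine, the transformed points keep the curved shape of the input even though their means, variances and correlation now match the target. The least squares slope and coefficient of determination are functions of those same quantities, so they match as well.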