Showing 200 of 438 results

alanarnholt

BSDA: Basic Statistics and Data Analysis

Data sets for the book "Basic Statistics and Data Analysis" by Larry J. Kitchens.

Maintained by Alan T. Arnholt. Last updated 2 years ago.

27.1 match 7 stars 9.11 score 1.3k scripts 6 dependents
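For a data-only package like this one, a quick way to browse the bundled datasets is a minimal sketch, assuming the package is installed from CRAN:

    install.packages("BSDA")   # once
    library(BSDA)
    data(package = "BSDA")     # lists every dataset shipped with the package
    # individual datasets can then be loaded by name via data()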

r-lib

gh: 'GitHub' 'API'

Minimal client to access the 'GitHub' 'API'.

Maintained by Gábor Csárdi. Last updated 1 month ago.

github, github-api

4.8 match 224 stars 15.55 score 444 scripts 401 dependents
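A minimal sketch of the client's core gh() call, which fills an endpoint template from named parameters (a personal access token in the GITHUB_PAT environment variable is picked up automatically when present):

    library(gh)
    # GET one repository's metadata from the GitHub API
    repo <- gh("GET /repos/{owner}/{repo}", owner = "r-lib", repo = "gh")
    repo$stargazers_count
    # list endpoints are paginated; gh(..., .limit = Inf) fetches all pages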

mwheymans

psfmi: Prediction Model Pooling, Selection and Performance Evaluation Across Multiply Imputed Datasets

Pooling, backward and forward selection of linear, logistic and Cox regression models in multiply imputed datasets. Backward and forward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2, D3, D4 and the median p-values method. This is also possible for mixed models. The models can contain continuous, dichotomous, categorical and restricted cubic spline predictors and interaction terms between all these types of predictors. The stability of the models can be evaluated using (cluster) bootstrapping. The package further contains functions to pool model performance measures such as ROC/AUC, reclassification, R-squared, scaled Brier score, the Hosmer and Lemeshow (H&L) test and calibration plots for logistic regression models. Internal validation can be done across multiply imputed datasets with cross-validation or bootstrapping. The adjusted intercept after shrinkage of pooled regression coefficients can be obtained. Backward and forward selection as part of internal validation is possible. A function to externally validate logistic prediction models in multiply imputed datasets is available, as is a function to compare models. For Cox models a strata variable can be included. Eekhout (2017) <doi:10.1186/s12874-017-0404-7>. Wiel (2009) <doi:10.1093/biostatistics/kxp011>. Marshall (2009) <doi:10.1186/1471-2288-9-57>.

Maintained by Martijn Heymans. Last updated 2 years ago.

cox-regression, imputation, imputed-datasets, logistic, multiple-imputation, pool, predictor, regression, selection, spline, spline-predictors

10.0 match 10 stars 7.17 score 70 scripts
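As an illustration of the pooling rule named above (not psfmi's own interface), Rubin's Rules combine per-imputation estimates and standard errors into one pooled estimate and total variance; est and se below are hypothetical values from m = 3 imputed datasets:

    # Rubin's Rules for pooling one coefficient across m imputations
    pool_rubin <- function(est, se) {
      m     <- length(est)
      qbar  <- mean(est)              # pooled point estimate
      W     <- mean(se^2)             # within-imputation variance
      B     <- var(est)               # between-imputation variance
      total <- W + (1 + 1/m) * B      # total variance
      c(estimate = qbar, se = sqrt(total))
    }
    pool_rubin(est = c(0.42, 0.38, 0.45), se = c(0.11, 0.12, 0.10))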

jl5000

tidyged: Handle GEDCOM Files Using Tidyverse Principles

Create and summarise family tree GEDCOM files using tidy dataframes.

Maintained by Jamie Lendrum. Last updated 3 years ago.

8.9 match 8 stars 5.96 score 23 scripts 3 dependents

dexter-psychometrics

dexter: Data Management and Analysis of Tests

A system for the management, assessment, and psychometric analysis of data from educational and psychological tests.

Maintained by Jesse Koops. Last updated 6 days ago.

openblas, cpp, openmp

4.1 match 8 stars 8.97 score 135 scripts 2 dependents
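A sketch of the core workflow, assuming the verbal aggression example data (verbAggrRules, verbAggrData) used in the package vignette; the exact calls should be checked against that vignette:

    library(dexter)
    db  <- start_new_project(verbAggrRules, "verbAggression.db")  # scoring rules -> project database
    add_booklet(db, verbAggrData, "agg")                          # add response data as a booklet
    fit <- fit_enorm(db)                                          # calibrate the extended nominal response model
    coef(fit)                                                     # item parameter estimates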

mrcaseb

personalr: Automated Personal Package Setup

Functions to set up a personal R package that attaches given libraries and exports personal helper functions.

Maintained by Sebastian Carl. Last updated 3 years ago.

7.9 match 13 stars 3.81 score 1 script

jimbrig

jimstools: Tools for R

What the package does (one paragraph).

Maintained by Jimmy Briggs. Last updated 3 years ago.

functions, personal, utility

10.0 match 2 stars 3.00 score 2 scripts

cran

tcl: Testing in Conditional Likelihood Context

An implementation of hypothesis testing in an extended Rasch modeling framework, including sample size planning procedures and power computations. Provides four statistical tests, i.e., the gradient test (GR), likelihood ratio test (LR), Rao score or Lagrange multiplier test (RS), and Wald test, for testing a number of hypotheses referring to the Rasch model (RM), linear logistic test model (LLTM), rating scale model (RSM), and partial credit model (PCM). Three types of functions for power and sample size computations are provided. Firstly, functions to compute the sample size given a user-specified (predetermined) deviation from the hypothesis to be tested, the level alpha, and the power of the test. Secondly, functions to evaluate the power of the tests given a user-specified (predetermined) deviation from the hypothesis to be tested, the level alpha of the test, and the sample size. Thirdly, functions to evaluate the so-called post hoc power of the tests, i.e., the power of the tests given the observed deviation of the data from the hypothesis to be tested and a user-specified level alpha of the test. Power and sample size computations are based on a Monte Carlo simulation approach that is computationally very efficient. The variance of the random error in computing power and sample size arising from the simulation approach is analytically derived using the delta method. Draxler, C., & Alexandrowicz, R. W. (2015), <doi:10.1007/s11336-015-9472-y>.

Maintained by Clemens Draxler. Last updated 6 months ago.

9.7 match 3.00 score

sbgraves237

Ecdat: Data Sets for Econometrics

Data sets for econometrics, including political science.

Maintained by Spencer Graves. Last updated 4 months ago.

4.0 match 2 stars 7.25 score 740 scripts 3 dependents

lcbc-uio

questionnaires: Package with functions to calculate components and sums for LCBC questionnaires

Creates summaries and factorials of answers to questionnaires.

Maintained by Athanasia Mo Mowinckel. Last updated 2 years ago.

5.9 match 3 stars 4.63 score 13 scripts

sigbertklinke

plot.matrix: Visualizes a Matrix as Heatmap

Visualizes a matrix object plainly as a heatmap. Provides S3 functions to plot simple matrices and loading matrices.

Maintained by Sigbert Klinke. Last updated 3 years ago.

3.5 match 8 stars 7.63 score 300 scripts 7 dependents
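Attaching the package registers an S3 plot() method for matrices, so a plain matrix plots as a heatmap:

    library(plot.matrix)
    m <- matrix(runif(35), nrow = 5, ncol = 7)
    plot(m)   # heatmap with a color key; see ?plot.matrix for breaks and palette options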

economic

realtalk: Price index data for the US economy

Makes it easy to use US price index data like the CPI.

Maintained by Ben Zipperer. Last updated 4 days ago.

cpi, data, inflation, prices

7.0 match 5 stars 3.51 score 10 scripts

sergiofinances

actfts: Autocorrelation Tools Featured for Time Series

The 'actfts' package provides tools for performing autocorrelation analysis of time series data. It includes functions to compute and visualize the autocorrelation function (ACF) and the partial autocorrelation function (PACF). Additionally, it performs the Dickey-Fuller, KPSS, and Phillips-Perron unit root tests to assess the stationarity of time series. Theoretical foundations are based on Box and Cox (1964) <doi:10.1111/j.2517-6161.1964.tb00553.x>, Box and Jenkins (1976) <isbn:978-0-8162-1234-2>, and Box and Pierce (1970) <doi:10.1080/01621459.1970.10481180>. Statistical methods are also drawn from Kolmogorov (1933) <doi:10.1007/BF00993594>, Kwiatkowski et al. (1992) <doi:10.1016/0304-4076(92)90104-Y>, and Ljung and Box (1978) <doi:10.1093/biomet/65.2.297>. The package integrates functions from 'forecast' (Hyndman & Khandakar, 2008) <https://CRAN.R-project.org/package=forecast>, 'tseries' (Trapletti & Hornik, 2020) <https://CRAN.R-project.org/package=tseries>, 'xts' (Ryan & Ulrich, 2020) <https://CRAN.R-project.org/package=xts>, and 'stats' (R Core Team, 2023) <https://stat.ethz.ch/R-manual/R-devel/library/stats/html/00Index.html>. Additionally, it provides visualization tools via 'plotly' (Sievert, 2020) <https://CRAN.R-project.org/package=plotly> and 'reactable' (Glaz, 2023) <https://CRAN.R-project.org/package=reactable>. The package also incorporates macroeconomic datasets from the U.S. Bureau of Economic Analysis: Disposable Personal Income (DPI) <https://fred.stlouisfed.org/series/DPI>, Gross Domestic Product (GDP) <https://fred.stlouisfed.org/series/GDP>, and Personal Consumption Expenditures (PCEC) <https://fred.stlouisfed.org/series/PCEC>.

Maintained by Sergio Sierra. Last updated 12 days ago.

4.4 match 1 star 4.74 score
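The same checks can be run directly with the stats and tseries functions the package builds on; a minimal sketch (not actfts's own interface):

    library(tseries)
    x <- diff(log(AirPassengers))  # log-differenced example series
    acf(x)                         # autocorrelation function
    pacf(x)                        # partial autocorrelation function
    adf.test(x)                    # augmented Dickey-Fuller unit root test
    kpss.test(x)                   # KPSS stationarity test
    pp.test(x)                     # Phillips-Perron unit root test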

nepem-ufsc

metan: Multi Environment Trials Analysis

Performs stability analysis of multi-environment trial data using parametric and non-parametric methods. Parametric methods include Additive Main Effects and Multiplicative Interaction (AMMI) analysis by Gauch (2013) <doi:10.2135/cropsci2013.04.0241>, Ecovalence by Wricke (1965), Genotype plus Genotype-Environment (GGE) biplot analysis by Yan & Kang (2003) <doi:10.1201/9781420040371>, geometric adaptability index by Mohammadi & Amri (2008) <doi:10.1007/s10681-007-9600-6>, joint regression analysis by Eberhart & Russell (1966) <doi:10.2135/cropsci1966.0011183X000600010011x>, genotypic confidence index by Annicchiarico (1992), Murakami & Cruz's (2004) method, power law residuals (POLAR) statistics by Doring et al. (2015) <doi:10.1016/j.fcr.2015.08.005>, scale-adjusted coefficient of variation by Doring & Reckling (2018) <doi:10.1016/j.eja.2018.06.007>, stability variance by Shukla (1972) <doi:10.1038/hdy.1972.87>, weighted average of absolute scores by Olivoto et al. (2019a) <doi:10.2134/agronj2019.03.0220>, and multi-trait stability index by Olivoto et al. (2019b) <doi:10.2134/agronj2019.03.0221>. Non-parametric methods include the superiority index by Lin & Binns (1988) <doi:10.4141/cjps88-018>, nonparametric measures of phenotypic stability by Huehn (1990) <doi:10.1007/BF00024241>, and the TOP third statistic by Fox et al. (1990) <doi:10.1007/BF00040364>. Functions for computing biometrical analysis such as path analysis, canonical correlation, partial correlation, clustering analysis, and tools for inspecting, manipulating, summarizing and plotting typical multi-environment trial data are also provided.

Maintained by Tiago Olivoto. Last updated 9 days ago.

1.8 match 2 stars 9.48 score 1.3k scripts 2 dependents
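A sketch of an AMMI fit, assuming the bundled example data data_ge and the performs_ammi() and plot_scores() interfaces as described in the package documentation:

    library(metan)
    model <- performs_ammi(data_ge, env = ENV, gen = GEN, rep = REP, resp = GY)
    plot_scores(model)   # biplot of the genotype-by-environment scores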

trinker

textshape: Tools for Reshaping Text

Tools that can be used to reshape and restructure text data.

Maintained by Tyler Rinker. Last updated 12 months ago.

data-reshaping, manipulation, sentence-boundary-detection, text-data, text-formating, tidy

1.8 match 50 stars 9.18 score 266 scripts 34 dependents
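For example, the split_* family breaks text at detected boundaries (a minimal sketch; see the package docs for the full family):

    library(textshape)
    # one character vector of sentences per input element
    split_sentence(c("I like it. Do you?", "Fine. Thanks."))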

huanglabumn

oncoPredict: Drug Response Modeling and Biomarker Discovery

Allows for building drug response models from screening data that pairs bulk RNA-Seq with a drug response metric, plus two additional tools for biomarker discovery developed by the Huang Laboratory at the University of Minnesota. There are three main functions within this package. (1) calcPhenotype is used to build drug response models on RNA-Seq data and impute response on any other RNA-Seq dataset given to the model. (2) GLDS is used to calculate the general level of drug sensitivity, which can improve biomarker discovery. (3) IDWAS can take the results from calcPhenotype and link the imputed response back to available genomic data (mutation and CNV alterations) to identify biomarkers. Each of these functions comes from a paper from the Huang research laboratory; the relevant papers are: calcPhenotype - Geeleher et al., Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. GLDS - Geeleher et al., Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models. IDWAS - Geeleher et al., Discovering novel pharmacogenomic biomarkers by imputing drug response in cancer patients from large genomics studies.

Maintained by Robert Gruener. Last updated 12 months ago.

sva, preprocesscore, stringr, biomart, genefilter, org.hs.eg.db, genomicfeatures, txdb.hsapiens.ucsc.hg19.knowngene, tcgabiolinks, biocgenerics, genomicranges, iranges, s4vectors

2.4 match 18 stars 6.47 score 41 scripts
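A sketch of a calcPhenotype() call under stated assumptions: trainExpr and testExpr are hypothetical gene-by-sample expression matrices, trainResp holds the training drug-response values, and the argument names are recalled from the package documentation and should be verified against ?calcPhenotype:

    library(oncoPredict)
    # trainExpr/testExpr: gene x sample matrices; trainResp: response per
    # training sample (hypothetical objects, not shipped with the package)
    calcPhenotype(trainingExprData = trainExpr,
                  trainingPtype    = trainResp,
                  testExprData     = testExpr,
                  batchCorrect     = "eb")   # empirical-Bayes batch correction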

tverbeke

SDaA: Sampling: Design and Analysis

Functions and Datasets from Lohr, S. (1999), Sampling: Design and Analysis, Duxbury.

Maintained by Tobias Verbeke. Last updated 3 years ago.

6.9 match 2.15 score 14 scripts

usepa

httk: High-Throughput Toxicokinetics

Pre-made models that can be rapidly tailored to various chemicals and species using chemical-specific in vitro data and physiological information. These tools allow incorporation of chemical toxicokinetics ("TK") and in vitro-in vivo extrapolation ("IVIVE") into bioinformatics, as described by Pearce et al. (2017) (<doi:10.18637/jss.v079.i04>). Chemical-specific in vitro data characterizing toxicokinetics have been obtained from relatively high-throughput experiments. The chemical-independent ("generic") physiologically-based ("PBTK") and empirical (for example, one compartment) "TK" models included here can be parameterized with in vitro data or in silico predictions which are provided for thousands of chemicals, multiple exposure routes, and various species. High throughput toxicokinetics ("HTTK") is the combination of in vitro data and generic models. We establish the expected accuracy of HTTK for chemicals without in vivo data through statistical evaluation of HTTK predictions for chemicals where in vivo data do exist. The models are systems of ordinary differential equations that are developed in MCSim and solved using compiled (C-based) code for speed. A Monte Carlo sampler is included for simulating human biological variability (Ring et al., 2017 <doi:10.1016/j.envint.2017.06.004>) and propagating parameter uncertainty (Wambaugh et al., 2019 <doi:10.1093/toxsci/kfz205>). Empirically calibrated methods are included for predicting tissue:plasma partition coefficients and volume of distribution (Pearce et al., 2017 <doi:10.1007/s10928-017-9548-7>). These functions and data provide a set of tools for using IVIVE to convert concentrations from high-throughput screening experiments (for example, Tox21, ToxCast) to real-world exposures via reverse dosimetry (also known as "RTK") (Wetmore et al., 2015 <doi:10.1093/toxsci/kfv171>).

Maintained by John Wambaugh. Last updated 1 month ago.

comptox, ord

1.3 match 27 stars 10.22 score 307 scripts 1 dependent
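A minimal sketch of running the generic PBTK model, with the chemical looked up by name from the package's built-in data:

    library(httk)
    out <- solve_pbtk(chem.name = "Bisphenol A", days = 5)  # simulated plasma kinetics over 5 days
    head(out)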

jrosell

jrrosell: Personal R package for Jordi Rosell

Useful functions for personal usage.

Maintained by Jordi Rosell. Last updated 3 months ago.

3.6 match 2 stars 3.08 score 7 scripts

repboxr

GithubActions: Functions to facilitate use of GitHub Actions via R

Work in progress. Not yet working well.

Maintained by Sebastian Kranz. Last updated 9 months ago.

3.4 match 2 stars 3.26 score 2 dependents