Showing 200 of 1997 total results
wviechtb
metadat:Meta-Analysis Datasets
A collection of meta-analysis datasets for teaching purposes, illustrating/testing meta-analytic methods, and validating published analyses.
Maintained by Wolfgang Viechtbauer. Last updated 2 days ago.
282.1 match 30 stars 10.54 score 65 scripts 93 dependents
bozenne
LMMstar:Repeated Measurement Models for Discrete Times
Companion R package for the course "Statistical analysis of correlated and repeated measurements for health science researchers" taught by the section of Biostatistics of the University of Copenhagen. It implements linear mixed models where the model for the variance-covariance of the residuals is specified via patterns (compound symmetry, toeplitz, unstructured, ...). Statistical inference for mean, variance, and correlation parameters is performed based on the observed information and a Satterthwaite approximation of the degrees of freedom. Normalized residuals are provided to assess model misspecification. Statistical inference can be performed for arbitrary linear or non-linear combination(s) of model coefficients. Predictions can be computed conditional to covariates only or also to outcome values.
Maintained by Brice Ozenne. Last updated 5 months ago.
77.4 match 4 stars 6.28 score 141 scripts
abbvie-external
OmicNavigator:Open-Source Software for 'Omic' Data Analysis and Visualization
A tool for interactive exploration of the results from 'omics' experiments to facilitate novel discoveries from high-throughput biology. The software includes R functions for the 'bioinformatician' to deposit study metadata and the outputs from statistical analyses (e.g. differential expression, enrichment). These results are then exported to an interactive JavaScript dashboard that can be interrogated on the user's local machine or deployed online to be explored by collaborators. The dashboard includes 'sortable' tables, interactive plots including network visualization, and fine-grained filtering based on statistical significance.
Maintained by John Blischak. Last updated 3 days ago.
bioinformatics genomics omics opencpu
61.3 match 34 stars 7.68 score 31 scripts
rte-antares-rpackage
antaresEditObject:Edit an 'Antares' Simulation
Edit an 'Antares' simulation before running it: create new areas, links, thermal clusters or binding constraints, or edit existing ones. Update 'Antares' general and optimization settings. 'Antares' is an open source power system generator; more information is available at <https://antares-simulator.org/>.
Maintained by Tatiana Vargas. Last updated 27 days ago.
antares-simulation cluster energy monte-carlo-simulation rte
42.6 match 8 stars 8.76 score 101 scripts
rfhb
ctrdata:Retrieve and Analyze Clinical Trials Data from Public Registers
A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', <https://www.clinicaltrialsregister.eu/>), 'ClinicalTrials.gov' (<https://clinicaltrials.gov/>, also translating queries from the retired classic interface), the 'ISRCTN' (<http://www.isrctn.com/>) and the 'European Union Clinical Trials Information System' ('CTIS', <https://euclinicaltrials.eu/>). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Protocols, statistical analysis plans, informed consent sheets and other documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for monitoring, meta- and trend-analysis of the design and conduct as well as of the results of clinical trials across registers.
Maintained by Ralf Herold. Last updated 6 hours ago.
clinical-data clinical-research clinical-studies clinical-trials ctgov database duckdb mongodb nodbi postgresql register sqlite studies trial
46.5 match 45 stars 7.92 score 32 scripts
mikewlcheung
metaSEM:Meta-Analysis using Structural Equation Modeling
A collection of functions for conducting meta-analysis using a structural equation modeling (SEM) approach via the 'OpenMx' and 'lavaan' packages. It also implements various procedures to perform meta-analytic structural equation modeling on the correlation and covariance matrices, see Cheung (2015) <doi:10.3389/fpsyg.2014.01521>.
Maintained by Mike Cheung. Last updated 10 days ago.
meta-analysis meta-analytic-sem missing-data multilevel-models multivariate-analysis structural-equation-modeling structural-equation-models
37.5 match 30 stars 9.43 score 208 scripts 1 dependent
alecri
dosresmeta:Multivariate Dose-Response Meta-Analysis
Estimates dose-response relations from summarized dose-response data and combines them according to principles of (multivariate) random-effects models.
Maintained by Alessio Crippa. Last updated 6 years ago.
53.8 match 11 stars 6.56 score 66 scripts
ellessenne
rsimsum:Analysis of Simulation Studies Including Monte Carlo Error
Summarise results from simulation studies and compute Monte Carlo standard errors of commonly used summary statistics. This package is modelled on the 'simsum' user-written command in 'Stata' (White I.R., 2010 <https://www.stata-journal.com/article.html?article=st0200>), further extending it with additional performance measures and functionality.
Maintained by Alessandro Gasparini. Last updated 10 months ago.
biostatistics monte-carlo-error simulation simulation-study simulations statistics
42.7 match 28 stars 7.70 score 148 scripts
julianfaraway
faraway:Datasets and Functions for Books by Julian Faraway
Books are "Linear Models with R" published 1st Ed. August 2004, 2nd Ed. July 2014, 3rd Ed. February 2025 by CRC press, ISBN 9781439887332, and "Extending the Linear Model with R" published by CRC press in 1st Ed. December 2005 and 2nd Ed. March 2016, ISBN 9781584884248 and "Practical Regression and ANOVA in R" contributed documentation on CRAN (now very dated).
Maintained by Julian Faraway. Last updated 1 month ago.
33.0 match 29 stars 9.43 score 1.7k scripts 1 dependent
bioc
mixOmics:Omics Data Integration Project
Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non exhaustive list of methods include variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.
Maintained by Eva Hamrud. Last updated 4 days ago.
immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project
22.7 match 182 stars 13.71 score 1.3k scripts 22 dependents
bart1
move:Visualizing and Analyzing Animal Track Data
Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, including functions to calculate dynamic Brownian Bridge Movement Models. Move helps address movement ecology questions.
Maintained by Bart Kranstauber. Last updated 4 months ago.
30.0 match 8.74 score 690 scripts 3 dependents
josue-rodriguez
psymetadata:Open Datasets from Meta-Analyses in Psychology Research
Data and examples from meta-analyses in psychology research.
Maintained by Josue E. Rodriguez. Last updated 2 years ago.
74.2 match 1 star 3.40 score 50 scripts
ropensci
rotl:Interface to the 'Open Tree of Life' API
An interface to the 'Open Tree of Life' API to retrieve phylogenetic trees, information about studies used to assemble the synthetic tree, and utilities to match taxonomic names to 'Open Tree identifiers'. The 'Open Tree of Life' aims at assembling a comprehensive phylogenetic tree for all named species.
Maintained by Francois Michonneau. Last updated 2 years ago.
metadata ropensci phylogenetics independant-contrasts biodiversity peer-reviewed phylogeny taxonomy
20.6 match 40 stars 12.05 score 356 scripts 29 dependents
hugaped
MBNMAdose:Dose-Response MBNMA Models
Fits Bayesian dose-response model-based network meta-analysis (MBNMA) that incorporate multiple doses within an agent by modelling different dose-response functions, as described by Mawdsley et al. (2016) <doi:10.1002/psp4.12091>. By modelling dose-response relationships this can connect networks of evidence that might otherwise be disconnected, and can improve precision on treatment estimates. Several common dose-response functions are provided; others may be added by the user. Various characteristics and assumptions can be flexibly added to the models, such as shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting at the treatment level.
Maintained by Hugo Pedder. Last updated 1 month ago.
37.4 match 10 stars 6.60 score
karissawhiting
cbioportalR:Browse and Query Clinical and Genomic Data from cBioPortal
Provides R users with direct access to genomic and clinical data from the 'cBioPortal' web resource via user-friendly functions that wrap 'cBioPortal's' existing API endpoints <https://www.cbioportal.org/api/swagger-ui/index.html>. Users can browse and query genomic data on mutations, copy number alterations and fusions, as well as data on tumor mutational burden ('TMB'), microsatellite instability status ('MSI'), 'FACETS' and select clinical data points (depending on the study). See <https://www.cbioportal.org/> and Gao et al., (2013) <doi:10.1126/scisignal.2004088> for more information on the cBioPortal web resource.
Maintained by Karissa Whiting. Last updated 4 months ago.
35.1 match 21 stars 6.70 score 20 scripts
ohdsi
PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model
A user friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.
Maintained by Egill Fridgeirsson. Last updated 9 days ago.
21.1 match 190 stars 10.85 score 297 scripts
kkholst
mets:Analysis of Multivariate Event Times
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
Maintained by Klaus K. Holst. Last updated 2 days ago.
multivariate-time-to-event survival-analysis time-to-event fortran openblas cpp
15.7 match 14 stars 13.47 score 236 scripts 42 dependents
genentech
psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation
Bayesian dynamic borrowing is an approach to incorporating external data to supplement a randomized, controlled trial analysis in which external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbs 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing on the study results and handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr' which can be installed from <https://stan-dev.r-universe.dev>.
Maintained by Matt Secrest. Last updated 1 month ago.
bayesian-dynamic-borrowing psborrow2 simulation-study
26.5 match 18 stars 7.87 score 16 scripts
openintrostat
openintro:Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs
Supplemental functions and data for 'OpenIntro' resources, which includes open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of Windows operating system.
Maintained by Mine Çetinkaya-Rundel. Last updated 3 months ago.
17.5 match 240 stars 11.39 score 6.0k scripts
ramiromagno
gwasrapidd:'REST' 'API' Client for the 'NHGRI'-'EBI' 'GWAS' Catalog
'GWAS' R 'API' Data Download. This package provides easy access to the 'NHGRI'-'EBI' 'GWAS' Catalog data by accessing the 'REST' 'API' <https://www.ebi.ac.uk/gwas/rest/docs/api/>.
Maintained by Ramiro Magno. Last updated 1 year ago.
thirdpartyclient biomedicalinformatics genomewideassociation snp association-studies gwas-catalog human rest-client trait trait-ontology
23.8 match 95 stars 8.10 score 49 scripts 1 dependent
stocnet
RSiena:Siena - Simulation Investigation for Empirical Network Analysis
The main purpose of this package is to perform simulation-based estimation of stochastic actor-oriented models for longitudinal network data collected as panel data. Dependent variables can be single or multivariate networks, which can be directed, non-directed, or two-mode; and associated actor variables. There are also functions for testing parameters and checking goodness of fit. An overview of these models is given in Snijders (2017), <doi:10.1146/annurev-statistics-060116-054035>.
Maintained by Tom A.B. Snijders. Last updated 1 month ago.
longitudinal-data rsiena social-network-analysis statistical-network-analysis statistics cpp
18.4 match 107 stars 9.93 score 346 scripts 1 dependent
bioc
RImmPort:RImmPort: Enabling Ready-for-analysis Immunology Research Data
The RImmPort package simplifies access to ImmPort data for analysis in the R environment. It provides a standards-based interface to the ImmPort study data that is in a proprietary format.
Maintained by Zicheng Hu. Last updated 5 months ago.
biomedicalinformatics dataimport datarepresentation
41.5 match 4.33 score 27 scripts
pharmaverse
sdtmchecks:Data Quality Checks for Study Data Tabulation Model (SDTM) Datasets
A series of checks to identify common issues in Study Data Tabulation Model (SDTM) datasets. These checks are intended to be generalizable, actionable, and meaningful for analysis.
Maintained by Will Harris. Last updated 3 months ago.
23.5 match 21 stars 7.66 score 15 scripts
bxc147
Epi:Statistical Analysis in Epidemiology
Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data. In particular representation, manipulation, rate estimation and simulation for multistate data - the Lexis suite of functions, which includes interfaces to 'mstate', 'etm' and 'cmprsk' packages. Contains functions for Age-Period-Cohort and Lee-Carter modeling and a function for interval censored data and some useful functions for tabulation and plotting, as well as a number of epidemiological data sets.
Maintained by Bendix Carstensen. Last updated 2 months ago.
18.6 match 4 stars 9.65 score 708 scripts 11 dependents
dgbonett
vcmeta:Varying Coefficient Meta-Analysis
Implements functions for varying coefficient meta-analysis methods. These methods do not assume effect size homogeneity. Subgroup effect size comparisons, general linear effect size contrasts, and linear models of effect sizes based on varying coefficient methods can be used to describe effect size heterogeneity. Varying coefficient meta-analysis methods do not require the unrealistic assumptions of the traditional fixed-effect and random-effects meta-analysis methods. For details see: Statistical Methods for Psychologists, Volume 5, <https://dgbonett.sites.ucsc.edu/>.
Maintained by Douglas G. Bonett. Last updated 8 months ago.
58.8 match 1 star 3.00 score 8 scripts
mwheymans
psfmi:Prediction Model Pooling, Selection and Performance Evaluation Across Multiply Imputed Datasets
Pooling, backward and forward selection of linear, logistic and Cox regression models in multiply imputed datasets. Backward and forward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2, D3, D4 and the median p-values method. This is also possible for mixed models. The models can contain continuous, dichotomous, categorical and restricted cubic spline predictors and interaction terms between all these types of predictors. The stability of the models can be evaluated using (cluster) bootstrapping. The package further contains functions to pool model performance measures such as ROC/AUC, Reclassification, R-squared, scaled Brier score, H&L test and calibration plots for logistic regression models. Internal validation can be done across multiply imputed datasets with cross-validation or bootstrapping. The adjusted intercept after shrinkage of pooled regression coefficients can be obtained. Backward and forward selection as part of internal validation is possible. Functions to externally validate logistic prediction models in multiply imputed datasets and to compare models are also available. For Cox models a strata variable can be included. Eekhout (2017) <doi:10.1186/s12874-017-0404-7>. Wiel (2009) <doi:10.1093/biostatistics/kxp011>. Marshall (2009) <doi:10.1186/1471-2288-9-57>.
Maintained by Martijn Heymans. Last updated 2 years ago.
cox-regression imputation imputed-datasets logistic multiple-imputation pool predictor regression selection spline spline-predictors
24.3 match 10 stars 7.17 score 70 scripts
isglobal-brge
SNPassoc:SNPs-Based Whole Genome Association Studies
Functions to perform most of the common analysis in genome association studies are implemented. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Permutation test and related tests (sum statistic and truncated product) are also implemented. Max-statistic and genetic risk-allele score exact distributions are also possible to be estimated. The methods are described in Gonzalez JR et al., 2007 <doi: 10.1093/bioinformatics/btm025>.
Maintained by Dolors Pelegri. Last updated 5 months ago.
19.0 match 16 stars 9.14 score 89 scripts 6 dependents
dsstoffer
astsa:Applied Statistical Time Series Analysis
Contains data sets and scripts for analyzing time series in both the frequency and time domains including state space modeling as well as supporting the texts Time Series Analysis and Its Applications: With R Examples (5th ed), by R.H. Shumway and D.S. Stoffer. Springer Texts in Statistics, 2025, <https://link.springer.com/book/9783031705830>, and Time Series: A Data Analysis Approach Using R. Chapman-Hall, 2019, <DOI:10.1201/9780429273285>.
Maintained by David Stoffer. Last updated 2 months ago.
21.8 match 7 stars 7.88 score 2.2k scripts 8 dependents
ngreifer
WeightIt:Weighting for Covariate Balance in Observational Studies
Generates balancing weights for causal effect estimation in observational studies with binary, multi-category, or continuous point or longitudinal treatments by easing and extending the functionality of several R packages and providing in-house estimation methods. Available methods include those that rely on parametric modeling, optimization, and machine learning. Also allows for assessment of weights and checking of covariate balance by interfacing directly with the 'cobalt' package. Methods for estimating weighted regression models that take into account uncertainty in the estimation of the weights via M-estimation or bootstrapping are available. See the vignette "Installing Supporting Packages" for instructions on how to install any package 'WeightIt' uses, including those that may not be on CRAN.
Maintained by Noah Greifer. Last updated 5 days ago.
causal-inference inverse-probability-weights observational-study propensity-scores
14.4 match 112 stars 11.58 score 508 scripts 3 dependents
rcalinjageman
esci:Estimation Statistics with Confidence Intervals
A collection of functions and 'jamovi' module for the estimation approach to inferential statistics, the approach which emphasizes effect sizes, interval estimates, and meta-analysis. Nearly all functions are based on 'statpsych' and 'metafor'. This package is still under active development, and breaking changes are likely, especially with the plot and hypothesis test functions. Data sets are included for all examples from Cumming & Calin-Jageman (2024) <ISBN:9780367531508>.
Maintained by Robert Calin-Jageman. Last updated 22 days ago.
jamovi jasp science statistics visualization
29.8 match 22 stars 5.42 score 12 scripts
bioc
bioCancer:Interactive Multi-Omics Cancers Data Visualization and Analysis
This package is a Shiny App to visualize and analyse interactively Multi-Assays of Cancer Genomic Data.
Maintained by Karim Mezhoud. Last updated 5 months ago.
gui datarepresentation network multiplecomparison pathways reactome visualization geneexpression genetarget analysis biocancer-interface cancer cancer-studies rmarkdown
26.6 match 20 stars 5.95 score 7 scripts
higgi13425
medicaldata:Data Package for Medical Datasets
Provides access to well-documented medical datasets for teaching. Featuring several from the Teaching of Statistics in the Health Sciences website <https://www.causeweb.org/tshs/category/dataset/>, a few reconstructed datasets of historical significance in medical research, some reformatted and extended from existing R packages, and some data donations.
Maintained by Peter Higgins. Last updated 2 years ago.
21.0 match 48 stars 7.43 score 317 scripts
asheshrambachan
HonestDiD:Robust Inference in Difference-in-Differences and Event Study Designs
Provides functions to conduct robust inference in difference-in-differences and event study designs by implementing the methods developed in Rambachan & Roth (2023) <doi:10.1093/restud/rdad018>, "A More Credible Approach to Parallel Trends" [Previously titled "An Honest Approach..."]. Inference is conducted under a weaker version of the parallel trends assumption. Uniformly valid confidence sets are constructed based upon conditional confidence sets, fixed-length confidence sets and hybridized confidence sets.
Maintained by Ashesh Rambachan. Last updated 18 days ago.
difference-in-differences event-studies robust-inference
20.9 match 195 stars 7.11 score 63 scripts
mattheaphy
actxps:Create Actuarial Experience Studies: Prepare Data, Summarize Results, and Create Reports
Experience studies are used by actuaries to explore historical experience across blocks of business and to inform assumption setting activities. This package provides functions for preparing data, creating studies, visualizing results, and beginning assumption development. Experience study methods, including exposure calculations, are described in: Atkinson & McGarry (2016) "Experience Study Calculations" <https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf>. The limited fluctuation credibility method used by the 'exp_stats()' function is described in: Herzog (1999, ISBN:1-56698-374-6) "Introduction to Credibility Theory".
Maintained by Matt Heaphy. Last updated 2 months ago.
23.1 match 14 stars 6.38 score 23 scripts
therneau
survival:Survival Analysis
Contains the core survival analysis routines, including definition of Surv objects, Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models, and parametric accelerated failure time models.
Maintained by Terry M Therneau. Last updated 3 months ago.
7.2 match 400 stars 20.43 score 29k scripts 3.9k dependents
bioc
metagenomeSeq:Statistical analysis for sparse high-throughput sequencing
metagenomeSeq is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are differentially abundant between two or more groups of multiple samples. metagenomeSeq is designed to address the effects of both normalization and under-sampling of microbial communities on disease association detection and the testing of feature correlations.
Maintained by Joseph N. Paulson. Last updated 3 months ago.
immunooncology classification clustering geneticvariability differentialexpression microbiome metagenomics normalization visualization multiplecomparison sequencing software
12.0 match 69 stars 12.02 score 494 scripts 7 dependents
phuse-org
sendigR:Enable Cross-Study Analysis of 'CDISC' 'SEND' Datasets
A system that enables cross-study analysis by extracting and filtering study data for control animals from a 'CDISC' 'SEND' study repository. These data types are supported: Body Weights, Laboratory test results and Microscopic findings. These database types are supported: 'SQLite' and 'Oracle'.
Maintained by Wenxian Wang. Last updated 10 days ago.
22.8 match 12 stars 6.28 score 6 scripts
alexanderrobitzsch
sirt:Supplementary Item Response Theory Models
Supplementary functions for item response models aiming to complement existing R packages. The functionality includes among others multidimensional compensatory and noncompensatory IRT models (Reckase, 2009, <doi:10.1007/978-0-387-89976-3>), MCMC for hierarchical IRT models and testlet models (Fox, 2010, <doi:10.1007/978-1-4419-0742-4>), NOHARM (McDonald, 1982, <doi:10.1177/014662168200600402>), Rasch copula model (Braeken, 2011, <doi:10.1007/s11336-010-9190-4>; Schroeders, Robitzsch & Schipolowski, 2014, <doi:10.1111/jedm.12054>), faceted and hierarchical rater models (DeCarlo, Kim & Johnson, 2011, <doi:10.1111/j.1745-3984.2011.00143.x>), ordinal IRT model (ISOP; Scheiblechner, 1995, <doi:10.1007/BF02301417>), DETECT statistic (Stout, Habing, Douglas & Kim, 1996, <doi:10.1177/014662169602000403>), local structural equation modeling (LSEM; Hildebrandt, Luedtke, Robitzsch, Sommer & Wilhelm, 2016, <doi:10.1080/00273171.2016.1142856>).
Maintained by Alexander Robitzsch. Last updated 3 months ago.
item-response-theory openblas cpp
14.0 match 23 stars 10.01 score 280 scripts 22 dependents
johnjsl7
daewr:Design and Analysis of Experiments with R
Contains data frames and functions used in the book "Design and Analysis of Experiments with R", Lawson (2015), ISBN-13: 978-1-4398-6813-3.
Maintained by John Lawson. Last updated 2 years ago.
35.8 match 3 stars 3.83 score 217 scripts 3 dependents
lindanab
mecor:Measurement Error Correction in Linear Models with a Continuous Outcome
Covariate measurement error correction is implemented by means of regression calibration by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331), efficient regression calibration by Spiegelman D, Carroll RJ & Kipnis V (2001) <doi:10.1002/1097-0258(20010115)20:1%3C139::AID-SIM644%3E3.0.CO;2-K> and maximum likelihood estimation by Bartlett JW, Stavola DBL & Frost C (2009) <doi:10.1002/sim.3713>. Outcome measurement error correction is implemented by means of the method of moments by Buonaccorsi JP (2010, ISBN:1420066560) and efficient method of moments by Keogh RH, Carroll RJ, Tooze JA, Kirkpatrick SI & Freedman LS (2014) <doi:10.1002/sim.7011>. Standard error estimation of the corrected estimators is implemented by means of the Delta method by Rosner B, Spiegelman D & Willett WC (1990) <doi:10.1093/oxfordjournals.aje.a115715> and Rosner B, Spiegelman D & Willett WC (1992) <doi:10.1093/oxfordjournals.aje.a116453>, the Fieller method described by Buonaccorsi JP (2010, ISBN:1420066560), and the Bootstrap by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331).
Maintained by Linda Nab. Last updated 3 years ago.
linear-models measurement-error statistics
26.1 match 6 stars 5.07 score 13 scripts
adeverse
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 12 days ago.
8.8 match 39 stars 14.96 score 2.2k scripts 256 dependents
r-forge
coin:Conditional Inference Procedures in a Permutation Test Framework
Conditional inference procedures for the general independence problem including two-sample, K-sample (non-parametric ANOVA), correlation, censored, ordered and multivariate problems described in <doi:10.18637/jss.v028.i08>.
Maintained by Torsten Hothorn. Last updated 9 months ago.
11.1 match 11.68 score 1.6k scripts 74 dependents
bioc
metaCCA:Summary Statistics-Based Multivariate Meta-Analysis of Genome-Wide Association Studies Using Canonical Correlation Analysis
metaCCA performs multivariate analysis of a single or multiple GWAS based on univariate regression coefficients. It allows multivariate representation of both phenotype and genotype. metaCCA extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness.
Maintained by Anna Cichonska. Last updated 5 months ago.
genomewideassociation snp genetics regression statisticalmethod software
30.1 match 4.26 score 5 scripts
bioc
cBioPortalData:Exposes and Makes Available Data from the cBioPortal Web Resources
The cBioPortalData R package accesses study datasets from the cBio Cancer Genomics Portal. It accesses the data either from the pre-packaged zip / tar files or from the API interface that was recently implemented by the cBioPortal Data Team. The package can provide data in either tabular format or with MultiAssayExperiment object that uses familiar Bioconductor data representations.
Maintained by Marcel Ramos. Last updated 10 days ago.
softwareinfrastructurethirdpartyclientbioconductor-packagenci-itcru24ca289073
12.6 match 33 stars 10.15 score 147 scripts 4 dependents
mucollective
multiverse:Create 'multiverse analysis' in R
Implement 'multiverse' style analyses (Steegen S., Tuerlinckx F., Gelman A., Vanpaemel W., 2016) <doi:10.1177/1745691616658637> to show the robustness of statistical inference. 'Multiverse analysis' is a philosophy of statistical reporting where paper authors report the outcomes of many different statistical analyses in order to show how fragile or robust their findings are. The 'multiverse' package (Sarma A., Kale A., Moon M., Taback N., Chevalier F., Hullman J., Kay M., 2021) <doi:10.31219/osf.io/yfbwm> allows users to concisely and flexibly implement 'multiverse-style' analyses, which involve declaring alternate ways of performing an analysis step, in R and R Notebooks.
Maintained by Abhraneel Sarma. Last updated 4 months ago.
15.1 match 62 stars 8.37 score 42 scripts
amices
mice:Multivariate Imputation by Chained Equations
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
Maintained by Stef van Buuren. Last updated 6 days ago.
chained-equationsfcsimputationmicemissing-datamissing-valuesmultiple-imputationmultivariate-datacpp
7.6 match 462 stars 16.50 score 10k scripts 154 dependents
nickch-k
causaldata:Example Data Sets for Causal Inference Textbooks
Example data sets to run the example problems from causal inference textbooks. Currently contains data sets for Huntington-Klein, Nick (2021) "The Effect" <https://theeffectbook.net>, first and second edition, Cunningham, Scott (2021, ISBN-13: 978-0-300-25168-5) "Causal Inference: The Mixtape", and Hernán, Miguel and James Robins (2020) "Causal Inference: What If" <https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/>.
Maintained by Nick Huntington-Klein. Last updated 4 months ago.
16.9 match 136 stars 7.43 score 144 scripts 1 dependent
alanarnholt
BSDA:Basic Statistics and Data Analysis
Data sets for the book "Basic Statistics and Data Analysis" by Larry J. Kitchens.
Maintained by Alan T. Arnholt. Last updated 2 years ago.
13.6 match 7 stars 9.11 score 1.3k scripts 6 dependents
r-forge
Sleuth2:Data Sets from Ramsey and Schafer's "Statistical Sleuth (2nd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2002), "The Statistical Sleuth: A Course in Methods of Data Analysis (2nd ed)", Duxbury.
Maintained by Berwin A Turlach. Last updated 1 year ago.
21.7 match 5.70 score 191 scripts
bioc
TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aims of TCGAbiolinks are to: i) facilitate GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 27 days ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
8.2 match 305 stars 14.45 score 1.6k scripts 6 dependents
bioc
GenomicSuperSignature:Interpretation of RNA-seq experiments through robust, efficient comparison to public databases
This package provides a novel method for interpreting new transcriptomic datasets through near-instantaneous comparison to public archives without high-performance computing requirements. Through the pre-computed index, users can identify public resources associated with their dataset such as gene sets, MeSH terms, and publications. Functions to identify interpretable annotations and intuitive visualization options are implemented in this package.
Maintained by Sehyun Oh. Last updated 5 months ago.
transcriptomicssystemsbiologyprincipalcomponentrnaseqsequencingpathwaysclusteringbioconductor-packageexploratory-data-analysisgseameshprincipal-component-analysisrna-sequencing-profilestransferlearning
16.9 match 16 stars 6.97 score 59 scripts
bioc
microbiome:Microbiome Analytics
Utilities for microbiome analysis.
Maintained by Leo Lahti. Last updated 5 months ago.
metagenomicsmicrobiomesequencingsystemsbiologyhitchiphitchip-atlashuman-microbiomemicrobiologymicrobiome-analysisphyloseqpopulation-study
9.4 match 290 stars 12.50 score 2.0k scripts 5 dependents
r-forge
Sleuth3:Data Sets from Ramsey and Schafer's "Statistical Sleuth (3rd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2013), "The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)", Cengage Learning.
Maintained by Berwin A Turlach. Last updated 1 year ago.
17.9 match 6.38 score 522 scripts
scheike
timereg:Flexible Regression Models for Survival Data
Programs for Martinussen and Scheike (2006), 'Dynamic Regression Models for Survival Data', Springer Verlag, plus more recent developments. Additive survival model, semiparametric proportional odds model, fast cumulative residuals, excess risk models and more. Flexible competing risks regression including GOF-tests. Two-stage frailty modelling. PLS for the additive risk model. Lasso in the 'ahaz' package.
Maintained by Thomas Scheike. Last updated 6 months ago.
10.9 match 31 stars 10.42 score 289 scripts 44 dependents
fbartos
RoBMA:Robust Bayesian Meta-Analyses
A framework for estimating ensembles of meta-analytic and meta-regression models (assuming either presence or absence of the effect, heterogeneity, publication bias, and moderators). The RoBMA framework uses Bayesian model-averaging to combine the competing meta-analytic models into a model ensemble, weights the posterior parameter distributions based on posterior model probabilities and uses Bayes factors to test for the presence or absence of the individual components (e.g., effect vs. no effect; Bartoš et al., 2022, <doi:10.1002/jrsm.1594>; Maier, Bartoš & Wagenmakers, 2022, <doi:10.1037/met0000405>). Users can define a wide range of prior distributions for the effect size, heterogeneity, publication bias (including selection models and PET-PEESE), and moderator components. The package provides convenient functions for summary, visualizations, and fit diagnostics.
Maintained by František Bartoš. Last updated 1 month ago.
meta-analysismodel-averagingpublication-biasjagsopenblascpp
16.3 match 9 stars 6.97 score 53 scripts
ldliao
jointVIP:Prioritize Variables with Joint Variable Importance Plot in Observational Study Design
In the observational study design stage, matching/weighting methods are conducted. However, when many background variables are present, the decision as to which variables to prioritize for matching/weighting is not trivial. Thus, the joint treatment-outcome variable importance plots are created to guide variable selection. The joint variable importance plots enhance variable comparisons via unadjusted bias curves derived under the omitted variable bias framework. The plots translate variable importance into recommended values for tuning parameters in existing methods. Post-matching and/or weighting plots can also be used to visualize and assess the quality of the observational study design. The motivation and derivation of the method are presented in "Prioritizing Variables for Observational Study Design using the Joint Variable Importance Plot" by Liao et al. (2024) <doi:10.1080/00031305.2024.2303419>. See the package paper by Liao and Pimentel (2023) <arxiv:2302.10367> for a beginner-friendly user introduction.
Maintained by Lauren D. Liao. Last updated 1 month ago.
causal-inferenceobservational-studystudy-design
18.7 match 6 stars 6.05 score 27 scripts
eldafani
intsvy:International Assessment Data Manager
Provides tools for importing, merging, and analysing data from international assessment studies (TIMSS, PIRLS, PISA, ICILS, and PIAAC).
Maintained by Daniel Caro. Last updated 12 months ago.
21.2 match 22 stars 5.29 score 88 scripts
ohdsi
Characterization:Implement Descriptive Studies Using the Common Data Model
An end-to-end framework that enables users to implement various descriptive studies for a given set of target and outcome cohorts for data mapped to the Observational Medical Outcomes Partnership Common Data Model.
Maintained by Jenna Reps. Last updated 17 days ago.
17.8 match 3 stars 6.13 score
cran
nparLD:Nonparametric Analysis of Longitudinal Data in Factorial Experiments
Performs nonparametric analysis of longitudinal data in factorial experiments. Longitudinal data are those which are collected from the same subjects over time, and they frequently arise in biological sciences. Nonparametric methods do not require distributional assumptions, and are applicable to a variety of data types (continuous, discrete, purely ordinal, and dichotomous). Such methods are also robust with respect to outliers and for small sample sizes.
Maintained by Frank Konietschke. Last updated 3 years ago.
32.0 match 4 stars 3.31 score 51 scripts
stefvanbuuren
AGD:Analysis of Growth Data
Tools for the analysis of growth data: to extract an LMS table from a gamlss object, to calculate the standard deviation scores and their inverse, and to superpose two wormplots from different models. The package contains some varieties of reference tables, especially for The Netherlands.
Maintained by Stef van Buuren. Last updated 11 months ago.
anthropometrycdcdutchgrowthgrowth-chartslmswhoz-score
24.0 match 1 star 4.38 score 48 scripts
bioc
recount:Explore and download data from the recount project
Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportimmunooncologyannotation-agnosticbioconductorcountderfinderdeseq2exongenehumanilluminajunctionrecount
10.9 match 41 stars 9.57 score 498 scripts 3 dependents
gasparrini
mixmeta:An Extended Mixed-Effects Framework for Meta-Analysis
A collection of functions to perform various meta-analytical models through a unified mixed-effects framework, including standard univariate fixed and random-effects meta-analysis and meta-regression, and non-standard extensions such as multivariate, multilevel, longitudinal, and dose-response models.
Maintained by Antonio Gasparrini. Last updated 3 years ago.
14.9 match 13 stars 6.96 score 63 scripts 13 dependents
hta-pharma
maicplus:Matching Adjusted Indirect Comparison
Facilitates performing matching adjusted indirect comparison (MAIC) analysis where the endpoint of interest is either time-to-event (e.g. overall survival) or binary (e.g. objective tumor response). The method is described by Signorovitch et al (2012) <doi:10.1016/j.jval.2012.05.004>.
Maintained by Isaac Gravestock. Last updated 23 days ago.
13.9 match 5 stars 7.37 score 16 scripts
mlr-org
mlr3proba:Probabilistic Supervised Learning for 'mlr3'
Provides extensions for probabilistic supervised learning for 'mlr3'. This includes extending the regression task to probabilistic and interval regression, adding a survival task, and other specialized models, predictions, and measures.
Maintained by John Zobolas. Last updated 2 months ago.
density-estimationmachine-learningmlr3probabilistic-regressionprobabilistic-supervised-learningsupervised-learningsurvival-analysiscpp
12.2 match 135 stars 7.78 score 246 scripts
ropensci
excluder:Checks for Exclusion Criteria in Online Data
Data that are collected through online sources such as Mechanical Turk may require excluding rows because of IP address duplication, geolocation, or completion duration. This package facilitates exclusion of these data for Qualtrics datasets.
Maintained by Jeffrey R. Stevens. Last updated 11 days ago.
datacleaningexclusionmturkqualtrics
16.9 match 9 stars 5.51 score 18 scripts
cran
epiR:Tools for the Analysis of Epidemiological Data
Tools for the analysis of epidemiological and surveillance data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, computation of confidence intervals around incidence risk and incidence rate estimates and sample size calculations for cross-sectional, case-control and cohort studies. Surveillance tools include functions to calculate an appropriate sample size for 1- and 2-stage representative freedom surveys, functions to estimate surveillance system sensitivity and functions to support scenario tree modelling analyses.
Maintained by Mark Stevenson. Last updated 2 months ago.
11.3 match 10 stars 8.18 score 10 dependents
bioc
MMUPHin:Meta-analysis Methods with Uniform Pipeline for Heterogeneity in Microbiome Studies
MMUPHin is an R package for meta-analysis tasks of microbiome cohorts. It has function interfaces for: a) covariate-controlled batch- and cohort effect adjustment, b) meta-analysis differential abundance testing, c) meta-analysis unsupervised discrete structure (clustering) discovery, and d) meta-analysis unsupervised continuous structure discovery.
Maintained by Siyuan MA. Last updated 5 months ago.
metagenomicsmicrobiomebatcheffect
20.6 match 4.44 score 46 scripts
mrcieu
ieugwasr:Interface to the 'OpenGWAS' Database API
Interface to the 'OpenGWAS' database API <https://api.opengwas.io/api/>. Includes a wrapper to make generic calls to the API, plus convenience functions for specific queries.
Maintained by Gibran Hemani. Last updated 3 days ago.
8.5 match 89 stars 10.71 score 404 scripts 6 dependents
bayesball
LearnBayes:Learning Bayesian Inference
Contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.
Maintained by Jim Albert. Last updated 7 years ago.
8.0 match 38 stars 11.34 score 690 scripts 31 dependents
r-forge
carData:Companion to Applied Regression Data Sets
Datasets to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage (2019).
Maintained by John Fox. Last updated 5 months ago.
7.1 match 12.41 score 944 scripts 919 dependents
guido-s
meta:General Package for Meta-Analysis
User-friendly general package providing standard methods for meta-analysis and supporting Schwarzer, Carpenter, and Rücker <DOI:10.1007/978-3-319-21416-0>, "Meta-Analysis with R" (2015): - common effect and random effects meta-analysis; - several plots (forest, funnel, Galbraith / radial, L'Abbe, Baujat, bubble); - three-level meta-analysis model; - generalised linear mixed model; - logistic regression with penalised likelihood for rare events; - Hartung-Knapp method for random effects model; - Kenward-Roger method for random effects model; - prediction interval; - statistical tests for funnel plot asymmetry; - trim-and-fill method to evaluate bias in meta-analysis; - meta-regression; - cumulative meta-analysis and leave-one-out meta-analysis; - import data from 'RevMan 5'; - produce forest plot summarising several (subgroup) meta-analyses.
Maintained by Guido Schwarzer. Last updated 25 days ago.
5.8 match 84 stars 14.84 score 2.3k scripts 29 dependents
gpaux
Mediana:Clinical Trial Simulations
Provides a general framework for clinical trial simulations based on the Clinical Scenario Evaluation (CSE) approach. The package supports a broad class of data models (including clinical trials with continuous, binary, survival-type and count-type endpoints as well as multivariate outcomes that are based on combinations of different endpoints), analysis strategies and commonly used evaluation criteria.
Maintained by Gautier Paux. Last updated 4 years ago.
biostatisticsclinical-trial-simulationsclinical-trialssimulations
13.3 match 28 stars 6.52 score 39 scripts
brockk
trialr:Clinical Trial Designs in 'rstan'
A collection of clinical trial designs and methods, implemented in 'rstan' and R, including: the Continual Reassessment Method by O'Quigley et al. (1990) <doi:10.2307/2531628>; EffTox by Thall & Cook (2004) <doi:10.1111/j.0006-341X.2004.00218.x>; the two-parameter logistic method of Neuenschwander, Branson & Sponer (2008) <doi:10.1002/sim.3230>; and the Augmented Binary method by Wason & Seaman (2013) <doi:10.1002/sim.5867>; and more. We provide functions to aid model-fitting and analysis. The 'rstan' implementations may also serve as a cookbook to anyone looking to extend or embellish these models. We hope that this package encourages the use of Bayesian methods in clinical trials. There is a preponderance of early phase trial designs because this is where Bayesian methods are used most. If there is a method you would like implemented, please get in touch.
Maintained by Kristian Brock. Last updated 1 year ago.
10.1 match 41 stars 8.55 score 106 scripts 3 dependents
guido-s
diagmeta:Meta-Analysis of Diagnostic Accuracy Studies with Several Cutpoints
Provides methods by Steinhauser et al. (2016) <DOI:10.1186/s12874-016-0196-1> for meta-analysis of diagnostic accuracy studies with several cutpoints.
Maintained by Guido Schwarzer. Last updated 6 months ago.
diagnostic-accuracy-studiesmeta-analysisrstudio
16.7 match 4 stars 5.15 score 10 scripts
detlew
PowerTOST:Power and Sample Size for (Bio)Equivalence Studies
Contains functions to calculate power and sample size for various study designs used in bioequivalence studies. Use known.designs() to see the designs supported. Power and sample size can be obtained based on different methods, amongst them prominently the TOST procedure (two one-sided t-tests). See README and NEWS for further information.
Maintained by Detlew Labes. Last updated 12 months ago.
8.9 match 20 stars 9.61 score 112 scripts 4 dependents
jinghuazhao
gap:Genetic Analysis Package
As first reported [Zhao, J. H. 2007. "gap: Genetic Analysis Package". J Stat Soft 23(8):1-18. <doi:10.18637/jss.v023.i08>], it is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates. Over the years, the package has been developed in between many projects, hence also in line with the name (gap).
Maintained by Jing Hua Zhao. Last updated 16 days ago.
7.1 match 12 stars 11.88 score 448 scripts 16 dependents
cran
epiDisplay:Epidemiological Data Display Package
Package for data exploration and result presentation. Full 'epicalc' package with data management functions is available at '<https://medipe.psu.ac.th/epicalc/>'.
Maintained by Virasakdi Chongsuvivatwong. Last updated 3 years ago.
15.3 match 1 star 5.44 score 758 scripts 2 dependents
icarda-git
QBMS:Query the Breeding Management System(s)
This R package assists breeders in linking data systems with their analytic pipelines, a crucial step in digitizing breeding processes. It supports querying and retrieving phenotypic and genotypic data from systems like 'EBS' <https://ebs.excellenceinbreeding.org/>, 'BMS' <https://bmspro.io>, 'BreedBase' <https://breedbase.org>, and 'GIGWA' <https://github.com/SouthGreenPlatform/Gigwa2> (using 'BrAPI' <https://brapi.org> calls). Extra helper functions support environmental data sources, including 'TerraClimate' <https://www.climatologylab.org/terraclimate.html> and 'FAO' 'HWSDv2' <https://gaez.fao.org/pages/hwsd> soil database.
Maintained by Khaled Al-Shamaa. Last updated 6 months ago.
10.5 match 8 stars 7.85 score 33 scripts 1 dependent
gasparrini
mvmeta:Multivariate and Univariate Meta-Analysis and Meta-Regression
Collection of functions to perform fixed and random-effects multivariate and univariate meta-analysis and meta-regression.
Maintained by Antonio Gasparrini. Last updated 5 years ago.
11.4 match 6 stars 7.29 score 151 scripts 10 dependents
dwarton
ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)
Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.
Maintained by David Warton. Last updated 1 year ago.
12.4 match 8 stars 6.58 score 53 scripts
sahirbhatnagar
casebase:Fitting Flexible Smooth-in-Time Hazards and Risk Functions via Logistic and Multinomial Regression
Fit flexible and fully parametric hazard regression models to survival data with single event type or multiple competing causes via logistic and multinomial regression. Our formulation allows for arbitrary functional forms of time and its interactions with other predictors for time-dependent hazards and hazard ratios. From the fitted hazard model, we provide functions to readily calculate and plot cumulative incidence and survival curves for a given covariate profile. This approach accommodates any log-linear hazard function of prognostic time, treatment, and covariates, and readily allows for non-proportionality. We also provide a plot method for visualizing incidence density via population time plots. Based on the case-base sampling approach of Hanley and Miettinen (2009) <DOI:10.2202/1557-4679.1125>, Saarela and Arjas (2015) <DOI:10.1111/sjos.12125>, and Saarela (2015) <DOI:10.1007/s10985-015-9352-x>.
Maintained by Sahir Bhatnagar. Last updated 7 months ago.
competing-riskscox-regressionregression-modelssurvival-analysis
11.4 match 9 stars 7.16 score 94 scripts
gforge
forestplot:Advanced Forest Plot Using 'grid' Graphics
Allows the creation of forest plots with advanced features, such as multiple confidence intervals per row, customizable fonts for individual text elements, and flexible confidence interval drawing. It also supports mixing text with mathematical expressions. The package extends the application of forest plots beyond traditional meta-analyses, offering a more general version of the original 'rmeta' package’s forestplot() function. It relies heavily on the 'grid' package for rendering the plots.
Maintained by Max Gordon. Last updated 4 months ago.
7.0 match 43 stars 11.47 score 716 scripts 21 dependents
chstock
DTComPair:Comparison of Binary Diagnostic Tests in a Paired Study Design
Comparison of the accuracy of two binary diagnostic tests in a "paired" study design, i.e. when each test is applied to each subject in the study.
Maintained by Christian Stock. Last updated 5 months ago.
clinical-epidemiologycomparative-analysisdiagnosisdiagnostic-accuracy-studiesdiagnostic-likelihood-ratiodiagnostic-testsmedicinepredictive-valuesensitivityspecificity
15.4 match 1 star 5.07 score 47 scripts
bioc
nullranges:Generation of null ranges via bootstrapping or covariate matching
Modular package for generation of sets of ranges representing the null hypothesis. These can take the form of bootstrap samples of ranges (using the block bootstrap framework of Bickel et al 2010), or sets of control ranges that are matched across one or more covariates. nullranges is designed to be inter-operable with other packages for analysis of genomic overlap enrichment, including the plyranges Bioconductor package.
Maintained by Michael Love. Last updated 5 months ago.
visualizationgenesetenrichmentfunctionalgenomicsepigeneticsgeneregulationgenetargetgenomeannotationannotationgenomewideassociationhistonemodificationchipseqatacseqdnaseseqrnaseqhiddenmarkovmodelbioconductorbootstrapgenomicsmatchingstatistics
9.5 match 27 stars 8.16 score 50 scripts 1 dependent
cran
metRology:Support for Metrological Applications
Provides classes and calculation and plotting functions for metrology applications, including measurement uncertainty estimation and inter-laboratory metrology comparison studies.
Maintained by Stephen L R Ellison. Last updated 2 months ago.
16.3 match 5 stars 4.77 score 223 scripts 7 dependents
bcallaway11
did:Treatment Effects with Multiple Periods and Groups
The standard Difference-in-Differences (DID) setup involves two periods and two groups -- a treated group and untreated group. Many applications of DID methods involve more than two periods and have individuals that are treated at different points in time. This package contains tools for computing average treatment effect parameters in Difference in Differences setups with more than two periods and with variation in treatment timing using the methods developed in Callaway and Sant'Anna (2021) <doi:10.1016/j.jeconom.2020.12.001>. The main parameters are group-time average treatment effects which are the average treatment effect for a particular group at a particular time. These can be aggregated into a smaller number of treatment effect parameters, and the package deals with the cases where there is selective treatment timing, dynamic treatment effects, calendar time effects, or combinations of these. There are also functions for testing the Difference in Differences assumption, and plotting group-time average treatment effects.
Maintained by Brantly Callaway. Last updated 4 months ago.
6.4 match 327 stars 12.01 score 696 scripts 3 dependents
globalecologylab
poems:Pattern-Oriented Ensemble Modeling System
A framework of interoperable R6 classes (Chang, 2020, <https://CRAN.R-project.org/package=R6>) for building ensembles of viable models via the pattern-oriented modeling (POM) approach (Grimm et al., 2005, <doi:10.1126/science.1116681>). The package includes classes for encapsulating and generating model parameters, and managing the POM workflow. The workflow includes: model setup; generating model parameters via Latin hyper-cube sampling (Iman & Conover, 1980, <doi:10.1080/03610928008827996>); running multiple sampled model simulations; collating summary results; and validating and selecting an ensemble of models that best match known patterns. By default, model validation and selection utilizes an approximate Bayesian computation (ABC) approach (Beaumont et al., 2002, <doi:10.1093/genetics/162.4.2025>), although alternative user-defined functionality could be employed. The package includes a spatially explicit demographic population model simulation engine, which incorporates default functionality for density dependence, correlated environmental stochasticity, stage-based transitions, and distance-based dispersal. The user may customize the simulator by defining functionality for translocations, harvesting, mortality, and other processes, as well as defining the sequence order for the simulator processes. The framework could also be adapted for use with other model simulators by utilizing its extendable (inheritable) base classes.
Maintained by July Pilowsky. Last updated 20 days ago.
biogeographypopulation-modelprocess-based
9.6 match 10 stars 8.05 score 59 scripts 2 dependents
american-institutes-for-research
EdSurvey:Analysis of NCES Education Survey and Assessment Data
Read in and analyze functions for education survey and assessment data from the National Center for Education Statistics (NCES) <https://nces.ed.gov/>, including National Assessment of Educational Progress (NAEP) data <https://nces.ed.gov/nationsreportcard/> and data from the International Assessment Database: Organisation for Economic Co-operation and Development (OECD) <https://www.oecd.org/en/about/directorates/directorate-for-education-and-skills.html>, including Programme for International Student Assessment (PISA), Teaching and Learning International Survey (TALIS), Programme for the International Assessment of Adult Competencies (PIAAC), and International Association for the Evaluation of Educational Achievement (IEA) <https://www.iea.nl/>, including Trends in International Mathematics and Science Study (TIMSS), TIMSS Advanced, Progress in International Reading Literacy Study (PIRLS), International Civic and Citizenship Study (ICCS), International Computer and Information Literacy Study (ICILS), and Civic Education Study (CivEd).
Maintained by Paul Bailey. Last updated 16 days ago.
9.6 match 10 stars 7.86 score 139 scripts 1 dependent
insightsengineering
chevron:Standard TLGs for Clinical Trials Reporting
Provide standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. In addition, it also provides comprehensive data checks and script generation functionality.
Maintained by Joe Zhu. Last updated 24 days ago.
clinical-trialsgraphslistingsnestreportingtables
9.2 match 12 stars 8.24 score 12 scripts
bedapub
designit:Blocking and Randomization for Experimental Design
Intelligently assign samples to batches in order to reduce batch effects. Batch effects can have a significant impact on data analysis, especially when the assignment of samples to batches coincides with the contrast groups being studied. By defining a batch container and a scoring function that reflects the contrasts, this package allows users to assign samples in a way that minimizes the potential impact of batch effects on the comparison of interest. Among other functionality, we provide an implementation for OSAT score by Yan et al. (2012, <doi:10.1186/1471-2164-13-689>).
Maintained by Iakov I. Davydov. Last updated 4 months ago.
design-of-experimentsrandomization
10.3 match 8 stars 7.28 score 24 scripts
kosukehamazaki
RAINBOWR:Genome-Wide Association Study with SNP-Set Methods
By using 'RAINBOWR' (Reliable Association INference By Optimizing Weights with R), users can test multiple SNPs (Single Nucleotide Polymorphisms) simultaneously by kernel-based (SNP-set) methods. This package can also be applied to haplotype-based GWAS (Genome-Wide Association Study). Users can test not only additive effects but also dominance and epistatic effects. For details, please see our paper in PLOS Computational Biology: Kosuke Hamazaki and Hiroyoshi Iwata (2020) <doi:10.1371/journal.pcbi.1007663>.
Maintained by Kosuke Hamazaki. Last updated 3 months ago.
12.6 match 22 stars 5.99 score 22 scripts
ropensci
rix:Reproducible Data Science Environments with 'Nix'
Simplifies the creation of reproducible data science environments using the 'Nix' package manager, as described in Dolstra (2006) <ISBN 90-393-4130-3>. The included `rix()` function generates a complete description of the environment as a `default.nix` file, which can then be built using 'Nix'. This results in project-specific software environments with pinned versions of R, packages, linked system dependencies, and other tools. Additional helpers make it easy to run R code in 'Nix' software environments for testing and production.
Maintained by Bruno Rodrigues. Last updated 4 days ago.
nixpeer-reviewedreproducibilityreproducible-research
7.1 match 235 stars 10.54 score 67 scripts
openpharma
simaerep:Find Clinical Trial Sites Under-Reporting Adverse Events
Monitoring of Adverse Event (AE) reporting in clinical trials is important for patient safety. Sites that are under-reporting AEs can be detected using bootstrap-based simulations of overall AE reporting. Based on the simulation, an AE under-reporting probability is assigned to each site in a given trial (Koneswarakantha 2021 <doi:10.1007/s40264-020-01011-5>).
Maintained by Bjoern Koneswarakantha. Last updated 2 months ago.
14.4 match 22 stars 5.22 score 25 scripts
kwstat
agridat:Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 28 days ago.
6.8 match 125 stars 11.02 score 1.7k scripts 2 dependents
lme4
lme4:Linear Mixed-Effects Models using 'Eigen' and S4
Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
Maintained by Ben Bolker. Last updated 3 days ago.
3.6 match 647 stars 20.69 score 35k scripts 1.5k dependents
manueleleonelli
bnmonitor:An Implementation of Sensitivity Analysis in Bayesian Networks
An implementation of sensitivity and robustness methods in Bayesian networks in R. It includes methods to perform parameter variations via a variety of co-variation schemes, to compute sensitivity functions and to quantify the dissimilarity of two Bayesian networks via distances and divergences. It further includes diagnostic methods to assess the goodness of fit of a Bayesian network to data, including global, node and parent-child monitors. Reference: M. Leonelli, R. Ramanathan, R.L. Wilkerson (2022) <doi:10.1016/j.knosys.2023.110882>.
Maintained by Manuele Leonelli. Last updated 6 months ago.
18.8 match 3 stars 3.92 score 14 scripts
kgoldfeld
simstudy:Simulation of Study Data
Simulates data sets in order to explore modeling techniques or better understand data generating processes. The user specifies a set of relationships between covariates, and generates data based on these specifications. The final data sets can represent data from randomized control trials, repeated measure (longitudinal) designs, and cluster randomized trials. Missingness can be generated using various mechanisms (MCAR, MAR, NMAR).
Maintained by Keith Goldfeld. Last updated 8 months ago.
data-generationdata-simulationsimulationstatistical-modelscpp
6.7 match 82 stars 11.00 score 972 scripts 1 dependents
acdelre
MAd:Meta-Analysis with Mean Differences
A collection of functions for conducting a meta-analysis with mean differences data. It uses recommended procedures as described in The Handbook of Research Synthesis and Meta-Analysis (Cooper, Hedges, & Valentine, 2009).
Maintained by AC Del Re. Last updated 3 years ago.
17.0 match 4.29 score 82 scripts 2 dependents
zabore
riskclustr:Functions to Study Etiologic Heterogeneity
A collection of functions related to the study of etiologic heterogeneity both across disease subtypes and across individual disease markers. The included functions allow one to quantify the extent of etiologic heterogeneity in the context of a case-control study, and provide p-values to test for etiologic heterogeneity across individual risk factors. Begg CB, Zabor EC, Bernstein JL, Bernstein L, Press MF, Seshan VE (2013) <doi:10.1002/sim.5902>.
Maintained by Emily C. Zabor. Last updated 1 year ago.
15.0 match 1 stars 4.81 score 26 scripts
bioc
GWASTools:Tools for Genome Wide Association Studies
Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.
Maintained by Stephanie M. Gogarten. Last updated 5 months ago.
snpgeneticvariabilityqualitycontrolmicroarray
6.5 match 17 stars 10.50 score 396 scripts 5 dependents
covaruber
sommer:Solving Mixed Model Equations in R
Structural multivariate-univariate linear mixed model solver for estimation of multiple random effects with unknown variance-covariance structures (e.g., heterogeneous and unstructured) and known covariance among levels of random effects (e.g., pedigree and genomic relationship matrices) (Covarrubias-Pazaran, 2016 <doi:10.1371/journal.pone.0156744>; Maier et al., 2015 <doi:10.1016/j.ajhg.2014.12.006>; Jensen et al., 1997). REML estimates can be obtained using the Direct-Inversion Newton-Raphson and Direct-Inversion Average Information algorithms for the problems r x r (r being the number of records) or using the Henderson-based average information algorithm for the problem c x c (c being the number of coefficients to estimate). Spatial models can also be fitted using the two-dimensional spline functionality available.
Maintained by Giovanny Covarrubias-Pazaran. Last updated 22 days ago.
average-informationmixed-modelsrcpparmadilloopenblascppopenmp
5.4 match 43 stars 12.70 score 300 scripts 9 dependents
weiliang
powerSurvEpi:Power and Sample Size Calculation for Survival Analysis of Epidemiological Studies
Functions to calculate power and sample size for testing main effect or interaction effect in the survival analysis of epidemiological studies (non-randomized studies), taking into account the correlation between the covariate of interest and other covariates. Some calculations also take into account competing risks and stratified analysis. This package also includes a set of functions to calculate power and sample size for testing main effect in the survival analysis of randomized clinical trials and in conditional logistic regression for nested case-control studies.
Maintained by Weiliang Qiu. Last updated 4 years ago.
18.5 match 3.72 score 77 scripts 2 dependents
lmaowisc
WR:Win Ratio Analysis of Composite Time-to-Event Outcomes
Implements various win ratio methodologies for composite endpoints of death and non-fatal events, including the (stratified) proportional win-fractions (PW) regression models (Mao and Wang, 2020 <doi:10.1111/biom.13382>), (stratified) two-sample tests with possibly recurrent nonfatal event, and sample size calculation for standard win ratio test (Mao et al., 2021 <doi:10.1111/biom.13501>).
Maintained by Lu Mao. Last updated 2 months ago.
11.2 match 6.11 score 43 scripts
bioc
canceR:A Graphical User Interface for accessing and modeling the Cancer Genomics Data of MSKCC
The package is a user-friendly interface based on cgdsr and other modeling packages to explore, compare, and analyse all available cancer data (Clinical data, Gene Mutation, Gene Methylation, Gene Expression, Protein Phosphorylation, Copy Number Alteration) hosted by the Computational Biology Center at Memorial-Sloan-Kettering Cancer Center (MSKCC).
Maintained by Karim Mezhoud. Last updated 5 months ago.
guigeneexpressionclusteringgogenesetenrichmentkeggmultiplecomparisoncancercancer-datagenegene-expressiongene-methylationgene-mutationgene-setsmethylationmskccmutationstcltk
12.8 match 7 stars 5.25 score 17 scripts
wviechtb
metafor:Meta-Analysis Package for R
A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.
Maintained by Wolfgang Viechtbauer. Last updated 1 day ago.
meta-analysismixed-effectsmultilevel-modelsmultivariate
4.1 match 246 stars 16.30 score 4.9k scripts 92 dependents
pbiecek
PBImisc:A Set of Datasets Used in My Classes or in the Book 'Modele Liniowe i Mieszane w R, Wraz z Przykladami w Analizie Danych'
A set of datasets and functions used in the book 'Modele liniowe i mieszane w R, wraz z przykladami w analizie danych'. Datasets either come from real studies or are created to be as similar as possible to real studies.
Maintained by Przemyslaw Biecek. Last updated 8 years ago.
16.5 match 4.00 score 66 scripts 1 dependents
bioc
affycomp:Graphics Toolbox for Assessment of Affymetrix Expression Measures
The package contains functions that can be used to compare expression measures for Affymetrix Oligonucleotide Arrays.
Maintained by Robert D. Shear. Last updated 5 months ago.
onechannelmicroarraypreprocessing
11.1 match 5.92 score 14 scripts
bioc
cbaf:Automated functions for comparing various omic data from cbioportal.org
This package contains functions that allow analysing and comparing omic data across various cancers/cancer subgroups easily. So far, it is compatible with RNA-seq, microRNA-seq, microarray and methylation datasets that are stored on cbioportal.org.
Maintained by Arman Shahrisa. Last updated 5 months ago.
softwareassaydomaindnamethylationgeneexpressiontranscriptionmicroarrayresearchfieldbiomedicalinformaticscomparativegenomicsepigeneticsgeneticstranscriptomics
17.4 match 3.78 score 1 scripts
singmann
afex:Analysis of Factorial Experiments
Convenience functions for analyzing factorial experiments using ANOVA or mixed models. aov_ez(), aov_car(), and aov_4() allow specification of between, within (i.e., repeated-measures), or mixed (i.e., split-plot) ANOVAs for data in long format (i.e., one observation per row), automatically aggregating multiple observations per individual and cell of the design. mixed() fits mixed models using lme4::lmer() and computes p-values for all fixed effects using either Kenward-Roger or Satterthwaite approximation for degrees of freedom (LMM only), parametric bootstrap (LMMs and GLMMs), or likelihood ratio tests (LMMs and GLMMs). afex_plot() provides a high-level interface for interaction or one-way plots using ggplot2, combining raw data and model estimates. afex uses type 3 sums of squares as default (imitating commercial statistical software).
Maintained by Henrik Singmann. Last updated 7 months ago.
4.5 match 123 stars 14.50 score 1.4k scripts 15 dependents
cran
kappaSize:Sample Size Estimation Functions for Studies of Interobserver Agreement
Contains basic tools for sample size estimation in studies of interobserver/interrater agreement (reliability). Includes functions for both the power-based and confidence interval-based methods, with binary or multinomial outcomes and two through six raters.
Maintained by Michael A Rotondi. Last updated 6 years ago.
22.8 match 3 stars 2.86 score 2 dependents
mrcieu
TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database
A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.
Maintained by Gibran Hemani. Last updated 10 days ago.
5.8 match 467 stars 11.23 score 1.7k scripts 1 dependents
midfieldr
midfieldr:Tools and Methods for Working with MIDFIELD Data in 'R'
Provides tools and demonstrates methods for working with individual undergraduate student-level records (registrar's data) in 'R'. Tools include filters for program codes, data sufficiency, and timely completion. Methods include gathering blocs of records, computing quantitative metrics such as graduation rate, and creating charts to visualize comparisons. 'midfieldr' interacts with practice data provided in 'midfielddata', an R data package available at <https://midfieldr.github.io/midfielddata/>. 'midfieldr' also interacts with the full MIDFIELD database for users who have access. This work is supported by the US National Science Foundation through grant numbers 1545667 and 2142087.
Maintained by Richard Layton. Last updated 2 months ago.
11.6 match 2 stars 5.56 score 26 scripts
bioc
cypress:Cell-Type-Specific Power Assessment
CYPRESS is a cell-type-specific power tool. This package performs power analysis for cell-type-specific data. It calculates FDR, FDC, and power under various study design parameters, including but not limited to sample size and effect size. It takes as input a SummarizedExperiment (SE) object with observed mixture data (feature by sample matrix) and the cell-type mixture proportions (sample by cell-type matrix). It can solve the cell-type mixture proportions from the reference-free panel from TOAST and conduct tests to identify cell-type-specific differential expression (csDE) genes.
Maintained by Shilin Yu. Last updated 5 months ago.
softwaregeneexpressiondataimportrnaseqsequencing
17.4 match 1 stars 3.70 score 2 scripts
kogalur
randomForestSRC:Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)
Fast OpenMP parallel computing of Breiman's random forests for univariate, multivariate, unsupervised, survival, competing risks, class imbalanced classification and quantile regression. New Mahalanobis splitting for correlated outcomes. Extreme random forests and randomized splitting. Suite of imputation methods for missing data. Fast random forests using subsampling. Confidence regions and standard errors for variable importance. New improved holdout importance. Case-specific importance. Minimal depth variable importance. Visualize trees on your Safari or Google Chrome browser. Anonymous random forests for data privacy.
Maintained by Udaya B. Kogalur. Last updated 2 months ago.
8.1 match 10 stars 7.90 score 1.2k scripts 12 dependents
bioc
ssrch:a simple search engine
Demonstrate tokenization and a search gadget for collections of CSV files.
Maintained by VJ Carey. Last updated 5 months ago.
17.7 match 3.60 score 20 scripts
obiba
micar:'Mica' Data Web Portal Client
'Mica' is a server application used to create data web portals for large-scale epidemiological studies or multiple-study consortia. 'Mica' helps studies to provide scientifically robust data visibility and web presence without significant information technology effort. 'Mica' provides a structured description of consortia, studies, annotated and searchable data dictionaries, and data access request management. This 'Mica' client supports data extraction for reporting purposes.
Maintained by Yannick Marcon. Last updated 10 months ago.
13.3 match 4.78 score 30 scripts
openml
OpenML:Open Machine Learning and Open Data Platform
We provide an R interface to 'OpenML.org', an online machine learning platform where researchers can access open data, download and upload data sets, share their machine learning tasks and experiments, and organize them online to work and collaborate with other researchers. The R interface supports querying for data sets with specific properties, and downloading and uploading data sets, tasks, flows and runs. See <https://www.openml.org/guide/api> for more information.
Maintained by Giuseppe Casalicchio. Last updated 10 months ago.
arffbenchmarkingbenchmarking-suiteclassificationdata-sciencedatabasedatasetdatasetsmachine-learningmachine-learning-algorithmsopen-dataopen-scienceopendataopenmlopenscienceregressionreproducible-researchstatistics
5.8 match 97 stars 11.04 score 7.1k scripts
projectmosaic
mosaicData:Project MOSAIC Data Sets
Data sets from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.
Maintained by Randall Pruim. Last updated 1 year ago.
7.6 match 6 stars 8.33 score 632 scripts 8 dependents
pharmaverse
admiral:ADaM in R Asset Library
A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).
Maintained by Ben Straub. Last updated 4 days ago.
cdiscclinical-trialsopen-source
4.5 match 236 stars 13.89 score 486 scripts 4 dependents
wjbraun
DAAG:Data Analysis and Graphics Data and Functions
Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.
Maintained by W. John Braun. Last updated 11 months ago.
7.6 match 8.25 score 1.2k scripts 1 dependents
tsailintung
fastdid:Fast Staggered Difference-in-Difference Estimators
A fast and flexible implementation of Callaway and Sant'Anna's (2021) <doi:10.1016/j.jeconom.2020.12.001> staggered Difference-in-Differences (DiD) estimators. 'fastdid' reduces the computation time from hours to seconds, and incorporates extensions such as time-varying covariates and multiple events.
Maintained by Lin-Tung Tsai. Last updated 4 months ago.
difference-in-differencesevent-studystaggered-did
9.5 match 27 stars 6.58 score 4 scripts
hputter
mstate:Data Preparation, Estimation and Prediction in Multi-State Models
Contains functions for data preparation, descriptives, hazard estimation and prediction with Aalen-Johansen or simulation in competing risks and multi-state models, see Putter, Fiocco, Geskus (2007) <doi:10.1002/sim.2712>.
Maintained by Hein Putter. Last updated 28 days ago.
5.1 match 11 stars 12.13 score 322 scripts 55 dependents
thothorn
TH.data:TH's Data Archive
Contains data sets used in other packages Torsten Hothorn maintains.
Maintained by Torsten Hothorn. Last updated 2 months ago.
7.5 match 8.28 score 137 scripts 370 dependents
cran
PairedData:Paired Data Analysis
Many datasets and a set of graphics (based on ggplot2), statistics, effect sizes and hypothesis tests are provided for analysing paired data with S4 class.
Maintained by Stephane Champely. Last updated 7 years ago.
11.9 match 2 stars 5.18 score 326 scripts 4 dependents
smartdata-analysis-and-statistics
metamisc:Meta-Analysis of Diagnosis and Prognosis Research Studies
Facilitate frequentist and Bayesian meta-analysis of diagnosis and prognosis research studies. It includes functions to summarize multiple estimates of prediction model discrimination and calibration performance (Debray et al., 2019) <doi:10.1177/0962280218785504>. It also includes functions to evaluate funnel plot asymmetry (Debray et al., 2018) <doi:10.1002/jrsm.1266>. Finally, the package provides functions for developing multivariable prediction models from datasets with clustering (de Jong et al., 2021) <doi:10.1002/sim.8981>.
Maintained by Thomas Debray. Last updated 1 month ago.
meta-analysisprognosisprognostic-models
8.2 match 7 stars 7.48 score 102 scripts
nanxstats
hdnom:Benchmarking and Visualization Toolkit for Penalized Cox Models
Creates nomogram visualizations for penalized Cox regression models, with the support of reproducible survival model building, validation, calibration, and comparison for high-dimensional data.
Maintained by Nan Xiao. Last updated 6 months ago.
benchmarkhigh-dimensional-datalinear-regressionnomogram-visualizationpenalized-cox-modelssurvival-analysisopenblas
7.5 match 43 stars 8.07 score 68 scripts 1 dependents
bioc
cbpManager:Generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics
This R package provides an R Shiny application that enables the user to generate, manage, and edit data and metadata files suitable for import into cBioPortal for Cancer Genomics. Create cancer studies and edit their metadata. Upload mutation data of a patient that will be concatenated to the data_mutation_extended.txt file of the study. Create and edit clinical patient data, sample data, and timeline data. Create custom timeline tracks for patients.
Maintained by Arsenij Ustjanzew. Last updated 5 months ago.
immunooncologydataimportdatarepresentationguithirdpartyclientpreprocessingvisualizationcancer-genomicscbioportalclinical-datafilegeneratormutation-datapatient-data
10.9 match 8 stars 5.51 score 1 scripts
ddsjoberg
gtsummary:Presentation-Ready Data Summary and Analytic Result Tables
Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.
Maintained by Daniel D. Sjoberg. Last updated 2 days ago.
easy-to-usegthtml5regression-modelsreproducibilityreproducible-researchstatisticssummary-statisticssummary-tablestable1tableone
3.5 match 1.1k stars 17.00 score 8.2k scripts 15 dependents
bioc
Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data
The understanding of cancer mechanisms requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/Bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classify them into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology not only identifies genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type. This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.
Maintained by Matteo Tiberti. Last updated 2 months ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksurvivalgenesetenrichmentnetworkenrichment
9.0 match 5 stars 6.59 score 43 scripts
biogenies
countfitteR:Comprehensive Automatized Evaluation of Distribution Models for Count Data
A large number of measurements generate count data. This is a statistical data type that only assumes non-negative integer values and is generated by counting. Typically, count data can be found in biomedical applications, such as the analysis of DNA double-strand breaks. The number of DNA double-strand breaks can be counted in individual cells using various bioanalytical methods. For diagnostic applications, it is relevant to record the distribution of the count data in order to determine their biomedical significance (Roediger, S. et al., 2018. Journal of Laboratory and Precision Medicine. <doi:10.21037/jlpm.2018.04.10>). The software offers functions for a comprehensive automated evaluation of distribution models of count data. In addition to programmatic interaction, a graphical user interface (web server) is included, which enables fast and interactive data-scientific analyses. The user is supported in selecting the most suitable count distribution for their own data set.
Maintained by Jaroslaw Chilimoniuk. Last updated 2 years ago.
cancercancer-imaging-researchcount-datacount-distributionfoci
11.1 match 4 stars 5.33 score 27 scripts
conjugateprior
cbn:Tools and replication materials for Caliskan, Bryson, and Narayanan (2017)
This package allows users to replicate the analysis in the paper and also provides general purpose tools for working with a large word vector file and comparing groups of words with permutation statistics from the original paper. Alternative bootstrapped versions with confidence intervals are also available.
Maintained by Will Lowe. Last updated 6 years ago.
17.0 match 2 stars 3.48 score 6 scripts
keaven
gsDesign:Group Sequential Design
Derives group sequential clinical trial designs and describes their properties. Particular focus on time-to-event, binary, and continuous outcomes. Largely based on methods described in Jennison, Christopher and Turnbull, Bruce W., 2000, "Group Sequential Methods with Applications to Clinical Trials" ISBN: 0-8493-0316-8.
Maintained by Keaven Anderson. Last updated 12 days ago.
biostatisticsboundariesclinical-trialsdesignspending-functions
4.5 match 51 stars 13.05 score 338 scripts 5 dependents
meyer-lab-cshl
plinkQC:Genotype Quality Control with 'PLINK'
Genotyping arrays enable the direct measurement of an individual's genotype at thousands of markers. 'plinkQC' facilitates genotype quality control for genetic association studies as described by Anderson and colleagues (2010) <doi:10.1038/nprot.2010.116>. It makes 'PLINK' basic statistics (e.g. missing genotyping rates per individual, allele frequencies per genetic marker) and relationship functions accessible from 'R' and generates a per-individual and per-marker quality control report. Individuals and markers that fail the quality control can subsequently be removed to generate a new, clean dataset. Removal of individuals based on relationship status is optimised to retain as many individuals as possible in the study.
Maintained by Hannah Meyer. Last updated 3 years ago.
8.5 match 58 stars 6.75 score 49 scripts
florale
multilevelcoda:Estimate Bayesian Multilevel Models for Compositional Data
Implement Bayesian Multilevel Modelling for compositional data in a multilevel framework. Compute multilevel compositional data and Isometric log ratio (ILR) at between and within-person levels, fit Bayesian multilevel models for compositional predictors and outcomes, and run post-hoc analyses such as isotemporal substitution models. References: Le, Stanford, Dumuid, and Wiley (2024) <doi:10.48550/arXiv.2405.03985>, Le, Dumuid, Stanford, and Wiley (2024) <doi:10.48550/arXiv.2411.12407>.
Maintained by Flora Le. Last updated 3 days ago.
bayesian-inferencecompositional-data-analysismultilevel-modelsmultilevelcoda
6.9 match 14 stars 8.31 score 118 scripts
cran
BE:Bioequivalence Study Data Analysis
Analyze bioequivalence study data with industrial strength. Sample size can be determined for various crossover designs, such as 2x2 design, 2x4 design, 4x4 design, Balaam design, two-sequence dual design, and Williams design. Reference: Chow SC, Liu JP. Design and Analysis of Bioavailability and Bioequivalence Studies. 3rd ed. (2009, ISBN:978-1-58488-668-6).
Maintained by Kyun-Seop Bae. Last updated 2 years ago.
21.4 match 1 stars 2.63 score 43 scripts
mpio-be
rangeMapper:A Platform for the Study of Macro-Ecology of Life History Traits
Tools for generation of (life-history) traits and diversity maps on hexagonal or square grids. Valcu et al. (2012) <doi:10.1111/j.1466-8238.2011.00739.x>.
Maintained by Mihai Valcu. Last updated 2 years ago.
assemblage-levelecologygloballife-history-traitsraster-cellspecies
10.4 match 8 stars 5.38 score 30 scripts
tagteam
pec:Prediction Error Curves for Risk Prediction Models in Survival Analysis
Validation of risk predictions obtained from survival models and competing risk models based on censored data using inverse weighting and cross-validation. Most of the 'pec' functionality has been moved to 'riskRegression'.
Maintained by Thomas A. Gerds. Last updated 2 years ago.
7.5 match 7.42 score 512 scripts 26 dependents
bioc
MSstats:Protein Significance Analysis in DDA, SRM and DIA for Label-free or Label-based Proteomics Experiments
A set of tools for statistical relative protein significance analysis in DDA, SRM and DIA experiments.
Maintained by Meena Choi. Last updated 11 days ago.
immunooncologymassspectrometryproteomicssoftwarenormalizationqualitycontroltimecourseopenblascpp
6.6 match 8.49 score 164 scripts 7 dependents
bioc
HIBAG:HLA Genotype Imputation with Attribute Bagging
Imputes HLA classical alleles using GWAS SNP data, and it relies on a training set of HLA and SNP genotypes. HIBAG can be used by researchers with published parameter estimates instead of requiring access to large training sample datasets. It combines the concepts of attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types. Attribute bagging is a technique which improves the accuracy and stability of classifier ensembles using bootstrap aggregating and random variable selection.
Maintained by Xiuwen Zheng. Last updated 4 months ago.
geneticsstatisticalmethodbioinformaticsgpuhlaimputationmhcsnpcpp
6.8 match 30 stars 8.24 score 48 scriptsmarselscheer
simTool:Conduct Simulation Studies with a Minimal Amount of Source Code
Tool for statistical simulations that have two components. One component generates the data and the other one analyzes the data. The main aims of the package are the reduction of the administrative source code (mainly loops and management code for the results) and a simple applicability of the package that allows the user to quickly learn how to work with it. Parallel computing is also supported. Finally, convenient functions are provided to summarize the simulation results.
Maintained by Marsel Scheer. Last updated 4 years ago.
facilitates-simulation-studiessimulation
11.5 match 6 stars 4.78 score 20 scriptsepiforecasts
EpiNow2:Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.
Maintained by Sebastian Funk. Last updated 25 days ago.
backcalculationcovid-19gaussian-processesopen-sourcereproduction-numberstancpp
4.6 match 120 stars 11.88 score 210 scriptsbarakbri
repfdr:Replicability Analysis for Multiple Studies of High Dimension
Estimation of Bayes and local Bayes false discovery rates for replicability analysis (Heller & Yekutieli, 2014 <doi:10.1214/13-AOAS697>; Heller et al., 2015 <doi:10.1093/bioinformatics/btu434>).
Maintained by Ruth Heller. Last updated 7 years ago.
10.9 match 3 stars 4.98 score 16 scriptsjenniniku
gllvm:Generalized Linear Latent Variable Models
Analysis of multivariate data using generalized linear latent variable models (gllvm). Estimation is performed using either the Laplace method, variational approximations, or extended variational approximations, implemented via TMB (Kristensen et al. (2016), <doi:10.18637/jss.v070.i05>).
Maintained by Jenni Niku. Last updated 8 hours ago.
5.2 match 52 stars 10.53 score 176 scripts 1 dependentsnicokubi
penetrance:Methods for Penetrance Estimation in Family-Based Studies
Implements statistical methods for estimating disease penetrance in family-based studies. Penetrance refers to the probability of disease manifestation in individuals carrying specific genetic variants. The package provides tools for age-specific penetrance estimation, handling missing data, and accounting for ascertainment bias in family studies. Cite as: Kubista, N., Braun, D. & Parmigiani, G. (2024) <doi:10.48550/arXiv.2411.18816>.
Maintained by Nicolas Kubista. Last updated 17 days ago.
10.0 match 5.41 scoremrcieu
MRInstruments:Data sources for genetic instruments to be used in MR
Datasets of eQTLs, GWAS catalogs, etc.
Maintained by Gibran Hemani. Last updated 5 years ago.
10.4 match 44 stars 5.15 score 212 scriptssanfordweisberg
alr4:Data to Accompany Applied Linear Regression 4th Edition
Datasets to Accompany S. Weisberg (2014, ISBN: 978-1-118-38608-8), "Applied Linear Regression," 4th edition. Many data files in this package are included in the `alr3` package as well, so only one of them should be used.
Maintained by Sanford Weisberg. Last updated 7 years ago.
15.5 match 1 stars 3.45 score 306 scriptsamirfeizi
otargen:Access Open Target Genetics
Interact seamlessly with Open Target Genetics' GraphQL endpoint to query and retrieve tidy data tables, facilitating the analysis of genetic data. For more information about the Open Target Genetics API, see <https://genetics.opentargets.org/api>.
Maintained by Amir Feizi. Last updated 5 months ago.
12.0 match 9 stars 4.43 score 3 scriptsgloewing
studyStrap:Study Strap and Multi-Study Learning Algorithms
Implements multi-study learning algorithms such as merging, the study-specific ensemble (trained-on-observed-studies ensemble), the study strap, the covariate-matched study strap, covariate-profile similarity weighting, and stacking weights. Embedded within the 'caret' framework, this package allows for a wide range of single-study learners (e.g., neural networks, lasso, random forests). The package offers over 20 default similarity measures and allows for specification of custom similarity measures for covariate-profile similarity weighting and an accept/reject step. This implements methods described in Loewinger, Kishida, Patil, and Parmigiani (2019) <doi:10.1101/856385>.
Maintained by Gabriel Loewinger. Last updated 5 years ago.
26.5 match 2.00 score 2 scriptsbioc
msImpute:Imputation of label-free mass spectrometry peptides
MsImpute is a package for imputation of peptide intensity in proteomics experiments. It additionally contains tools for MAR/MNAR diagnosis and assessment of distortions to the probability distribution of the data post imputation. The missing values are imputed by low-rank approximation of the underlying data matrix if they are MAR (method = "v2"), by Barycenter approach if missingness is MNAR ("v2-mnar"), or by Peptide Identity Propagation (PIP).
Maintained by Soroor Hediyeh-zadeh. Last updated 5 months ago.
massspectrometryproteomicssoftwarelabel-free-proteomicslow-rank-approximation
10.2 match 14 stars 5.15 score 7 scriptshugofitipaldi
covidsymptom:COVID Symptom Study Sweden Open Dataset
The COVID Symptom Study is a non-commercial project that uses a free mobile app to facilitate real-time data collection of symptoms, exposures, and risk factors related to COVID19. The package allows easy access to summary statistics data from COVID Symptom Study Sweden.
Maintained by Hugo Fitipaldi. Last updated 11 months ago.
12.2 match 4 stars 4.30 score 7 scriptsthothorn
exactRankTests:Exact Distributions for Rank and Permutation Tests
Computes exact conditional p-values and quantiles using an implementation of the Shift-Algorithm by Streitberg & Roehmel.
Maintained by Torsten Hothorn. Last updated 3 years ago.
7.3 match 1 stars 7.13 score 276 scripts 65 dependentsbioc
sizepower:Sample Size and Power Calculation in Microarray Studies
This package has been prepared to assist users in computing either a sample size or power value for a microarray experimental study. The user is referred to the cited references for technical background on the methodology underpinning these calculations. This package provides support for five types of sample size and power calculations. These five types can be adapted in various ways to encompass many of the standard designs encountered in practice.
Maintained by Weiliang Qiu. Last updated 5 months ago.
14.5 match 3.60 score 3 scriptsepiverse-trace
vaccineff:Estimate Vaccine Effectiveness Based on Different Study Designs
Provides tools for estimating vaccine effectiveness and related metrics. The 'vaccineff_data' class manages key features for preparing, visualizing, and organizing cohort data, as well as estimating vaccine effectiveness. The results and model performance are assessed using the 'vaccineff' class.
Maintained by Zulma M. Cucunubá. Last updated 18 days ago.
epidemiologyepiversevaccine-effectiveness
6.6 match 16 stars 7.93 score 13 scriptswuqian77
TrialSize:R Functions for Chapter 3,4,6,7,9,10,11,12,14,15 of Sample Size Calculation in Clinical Research
Functions and Examples in Sample Size Calculation in Clinical Research.
Maintained by Vicky Qian Wu. Last updated 4 months ago.
13.8 match 3 stars 3.78 score 95 scripts 1 dependentsbioc
PLSDAbatch:PLSDA-batch
A novel framework to correct for batch effects prior to any downstream analysis in microbiome data based on Projection to Latent Structures Discriminant Analysis. The main method is named “PLSDA-batch”. It first estimates treatment and batch variation with latent components, then subtracts batch-associated components from the data whilst preserving biological variation of interest. PLSDA-batch is highly suitable for microbiome data as it is non-parametric, multivariate and allows for ordination and data visualisation. Combined with centered log-ratio transformation for addressing uneven library sizes and compositional structure, PLSDA-batch addresses all characteristics of microbiome data that existing correction methods have ignored so far. Two other variants are proposed for 1/ unbalanced batch x treatment designs that are commonly encountered in studies with small sample sizes, and for 2/ selection of discriminative variables amongst treatment groups to avoid overfitting in classification problems. These two variants have widened the scope of applicability of PLSDA-batch to different data settings.
Maintained by Yiwen (Eva) Wang. Last updated 5 months ago.
statisticalmethoddimensionreductionprincipalcomponentclassificationmicrobiomebatcheffectnormalizationvisualization
9.6 match 13 stars 5.37 score 18 scriptsalicepaul
HDSinRdata:Data for the 'Mastering Health Data Science Using R' Online Textbook
Contains ten datasets used in the chapters and exercises of Paul, Alice (2023) "Health Data Science in R" <https://alicepaul.github.io/health-data-science-using-r/>.
Maintained by Alice Paul. Last updated 3 months ago.
12.6 match 1 stars 4.09 score 41 scriptssokbae
ciccr:Causal Inference in Case-Control and Case-Population Studies
Estimation and inference methods for causal relative and attributable risk in case-control and case-population studies under the monotone treatment response and monotone treatment selection assumptions. For more details, see the paper by Jun and Lee (2023), "Causal Inference under Outcome-Based Sampling with Monotonicity Assumptions," <arXiv:2004.08318 [econ.EM]>, accepted for publication in Journal of Business & Economic Statistics.
Maintained by Sokbae Lee. Last updated 1 year ago.
case-control-studiescausal-inferencepartial-identificationtreatment-effects
12.8 match 2 stars 4.00 score 4 scriptsbiometris
isatabr:Implementation for the ISA Abstract Model
ISA is a metadata framework to manage an increasingly diverse set of life science, environmental and biomedical experiments. The isatabr package implements methods for reading, modifying and writing files in the ISA-Tab format. It also contains methods for processing assay data.
Maintained by Bart-Jan van Rossum. Last updated 2 years ago.
13.9 match 3.70 score 2 scriptsdetlew
Power2Stage:Power and Sample-Size Distribution of 2-Stage Bioequivalence Studies
Contains functions to obtain the operational characteristics of bioequivalence studies in Two-Stage Designs (TSD) via simulations.
Maintained by Detlew Labes. Last updated 1 year ago.
16.4 match 1 stars 3.11 score 13 scriptscvoeten
buildmer:Stepwise Elimination and Term Reordering for Mixed-Effects Regression
Finds the largest possible regression model that will still converge for various types of regression analyses (including mixed models and generalized additive models) and then optionally performs stepwise elimination similar to the forward and backward effect-selection methods in SAS, based on the change in log-likelihood or its significance, Akaike's Information Criterion, the Bayesian Information Criterion, the explained deviance, or the F-test of the change in R².
Maintained by Cesko C. Voeten. Last updated 1 year ago.
8.8 match 5.82 score 200 scriptspamelarussell
TCIApathfinder:Client for the Cancer Imaging Archive REST API
A wrapper for The Cancer Imaging Archive's REST API. The Cancer Imaging Archive (TCIA) hosts de-identified medical images of cancer available for public download, as well as rich metadata for each image series. TCIA provides a REST API for programmatic access to the data. This package provides simple functions to access each API endpoint. For more information, see <https://github.com/pamelarussell/TCIApathfinder> and TCIA's website.
Maintained by Pamela Russell. Last updated 4 years ago.
8.9 match 9 stars 5.70 score 28 scriptsskoestlmeier
crseEventStudy:A Robust and Powerful Test of Abnormal Stock Returns in Long-Horizon Event Studies
Based on Dutta et al. (2018) <doi:10.1016/j.jempfin.2018.02.004>, this package provides their standardized test for abnormal returns in long-horizon event studies. The methods address the major weaknesses in size, power, and robustness of long-run statistical tests described in Kothari/Warner (2007) <doi:10.1016/B978-0-444-53265-7.50015-9>. Abnormal returns are weighted by their statistical precision (i.e., standard deviation), resulting in abnormal standardized returns. This procedure efficiently handles the heteroskedasticity problem. Clustering techniques following Cameron et al. (2011) <doi:10.1198/jbes.2010.07136> are adopted for computing cross-sectional correlation robust standard errors. The statistical tests in this package therefore account for potential biases arising from returns' cross-sectional correlation, autocorrelation, and volatility clustering without power loss.
Maintained by Siegfried Köstlmeier. Last updated 3 years ago.
empirical-researchevent-studyfinancefinancial-analysis
15.8 match 2 stars 3.20 score 16 scriptsbioc
XDE:XDE: a Bayesian hierarchical model for cross-study analysis of differential gene expression
Multi-level model for cross-study detection of differential gene expression.
Maintained by Robert Scharpf. Last updated 5 months ago.
microarraydifferentialexpressioncpp
12.0 match 4.20 score 10 scriptsmayoverse
arsenal:An Arsenal of 'R' Functions for Large-Scale Statistical Summaries
An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.
Maintained by Ethan Heinzen. Last updated 7 months ago.
baseline-characteristicsdescriptive-statisticsmodelingpaired-comparisonsreportingstatisticstableone
3.8 match 225 stars 13.45 score 1.2k scripts 16 dependentsthlytras
miniMeta:Web Application to Run Meta-Analyses
Shiny web application to run meta-analyses. Essentially a graphical front-end to package 'meta' for R. Can be useful as an educational tool, and for quickly analyzing and sharing meta-analyses. Provides output to quickly fill in GRADE (Grading of Recommendations, Assessment, Development and Evaluations) Summary-of-Findings tables. Importantly, it allows further processing of the results inside R, in case more specific analyses are needed.
Maintained by Theodore Lytras. Last updated 9 months ago.
meta-analysesmeta-analysisobservational-studiesrandomized-controlled-trialssample-size-calculationshiny
10.7 match 5 stars 4.70 score 3 scriptsrkoenker
quantreg:Quantile Regression
Estimation and inference methods for models for conditional quantile functions: Linear and nonlinear parametric and non-parametric (total variation penalized) models for conditional quantiles of a univariate response and several methods for handling censored survival data. Portfolio selection methods based on expected shortfall risk are also now included. See Koenker, R. (2005) Quantile Regression, Cambridge U. Press, <doi:10.1017/CBO9780511754098> and Koenker, R. et al. (2017) Handbook of Quantile Regression, CRC Press, <doi:10.1201/9781315120256>.
Maintained by Roger Koenker. Last updated 6 days ago.
3.6 match 18 stars 13.93 score 2.6k scripts 1.5k dependentsrstudio
promises:Abstractions for Promise-Based Asynchronous Programming
Provides fundamental abstractions for doing asynchronous programming in R using promises. Asynchronous programming is useful for allowing a single R process to orchestrate multiple tasks in the background while also attending to something else. Semantics are similar to 'JavaScript' promises, but with a syntax that is idiomatic R.
Maintained by Joe Cheng. Last updated 1 month ago.
2.9 match 204 stars 17.10 score 688 scripts 2.6k dependentsopenanalytics
clinUtils:General Utility Functions for Analysis of Clinical Data
Utility functions to facilitate the import, the reporting and analysis of clinical data. Example datasets in 'SDTM' and 'ADaM' format, containing a subset of patients/domains from the 'CDISC Pilot 01 study' are also available as R datasets to demonstrate the package functionalities.
Maintained by Laure Cougnaud. Last updated 10 months ago.
7.3 match 3 stars 6.78 score 105 scripts 3 dependentsbioc
LEA:LEA: an R package for Landscape and Ecological Association Studies
LEA is an R package dedicated to population genomics, landscape genomics and genotype-environment association tests. LEA can run analyses of population structure and genome-wide tests for local adaptation, and also performs imputation of missing genotypes. The package includes statistical methods for estimating ancestry coefficients from large genotypic matrices and for evaluating the number of ancestral populations (snmf). It performs statistical tests using latent factor mixed models for identifying genetic polymorphisms that exhibit association with environmental gradients or phenotypic traits (lfmm2). In addition, LEA computes values of genetic offset statistics based on new or predicted environments (genetic.gap, genetic.offset). LEA is mainly based on optimized programs that can scale with the dimensions of large data sets.
Maintained by Olivier Francois. Last updated 5 days ago.
softwarestatistical methodclusteringregressionopenblas
7.4 match 6.63 score 534 scriptsrevelle
psychTools:Tools to Accompany the 'psych' Package for Psychological Research
Support functions, data sets, and vignettes for the 'psych' package. Contains several of the biggest data sets for the 'psych' package as well as four vignettes. A few helper functions for file manipulation are included as well. For more information, see the <https://personality-project.org/r/> web page.
Maintained by William Revelle. Last updated 12 months ago.
8.2 match 5.89 score 178 scripts 5 dependentslightbluetitan
OncoDataSets:A Comprehensive Collection of Cancer Types and Cancer-related DataSets
Offers a rich collection of data focused on cancer research, covering survival rates, genetic studies, biomarkers, and epidemiological insights. Designed for researchers, analysts, and bioinformatics practitioners, the package includes datasets on various cancer types such as melanoma, leukemia, breast, ovarian, and lung cancer, among others. It aims to facilitate advanced research, analysis, and understanding of cancer epidemiology, genetics, and treatment outcomes.
Maintained by Renzo Caceres Rossi. Last updated 3 months ago.
11.5 match 3 stars 4.18 score 6 scriptsmeghapsimatrix
simhelpers:Helper Functions for Simulation Studies
Calculates performance criteria measures and associated Monte Carlo standard errors for simulation results. Includes functions to help run simulation studies, following a general simulation workflow that closely aligns with the approach described by Morris, White, and Crowther (2019) <DOI:10.1002/sim.8086>. Also includes functions for calculating bootstrap confidence intervals (including normal, basic, studentized, percentile, bias-corrected, and bias-corrected-and-accelerated) with tidy output, as well as for extrapolating confidence interval coverage rates and hypothesis test rejection rates following techniques suggested by Boos and Zhang (2000) <DOI:10.1080/01621459.2000.10474226>.
Maintained by Megha Joshi. Last updated 2 months ago.
6.8 match 11 stars 7.07 score 40 scriptsstatist7
sitar:Super Imposition by Translation and Rotation Growth Curve Analysis
Functions for fitting and plotting SITAR (Super Imposition by Translation And Rotation) growth curve models. SITAR is a shape-invariant model with a regression B-spline mean curve and subject-specific random effects on both the measurement and age scales. The model was first described by Lindstrom (1995) <doi:10.1002/sim.4780141807> and developed as the SITAR method by Cole et al (2010) <doi:10.1093/ije/dyq115>.
Maintained by Tim Cole. Last updated 2 months ago.
5.5 match 13 stars 8.69 score 58 scripts 3 dependentspavlakrotka
NCC:Simulation and Analysis of Platform Trials with Non-Concurrent Controls
Design and analysis of flexible platform trials with non-concurrent controls. Functions for data generation, analysis, visualization and running simulation studies are provided. The implemented analysis methods are described in: Bofill Roig et al. (2022) <doi:10.1186/s12874-022-01683-w>, Saville et al. (2022) <doi:10.1177/17407745221112013> and Schmidli et al. (2014) <doi:10.1111/biom.12242>.
Maintained by Pavla Krotka. Last updated 6 days ago.
clinical-trialsplatform-trialssimulationstatistical-inferencejagscpp
7.1 match 5 stars 6.64 score 29 scriptslightbluetitan
educationR:A Comprehensive Collection of Educational Datasets
Provides a comprehensive collection of datasets related to education, covering topics such as student performance, learning methods, test scores, absenteeism, and other educational metrics. This package is designed as a resource for educational researchers, data analysts, and statisticians to explore and analyze data in the field of education.
Maintained by Renzo Caceres Rossi. Last updated 3 months ago.
11.0 match 4 stars 4.30 score 3 scriptsjonasmoss
publipha:Bayesian Meta-Analysis with Publications Bias and P-Hacking
Tools for Bayesian estimation of meta-analysis models that account for publication bias or p-hacking. For publication bias, this package implements a variant of the p-value based selection model of Hedges (1992) <doi:10.1214/ss/1177011364> with discrete selection probabilities. It also implements the mixture of truncated normals model for p-hacking described in Moss and De Bin (2019) <arXiv:1911.12445>.
Maintained by Jonas Moss. Last updated 2 years ago.
14.8 match 3 stars 3.18 score 3 scriptsrunehaubo
lmerTest:Tests in Linear Mixed Effects Models
Provides p-values in type I, II or III anova and summary tables for lmer model fits (cf. lme4) via Satterthwaite's degrees of freedom method. A Kenward-Roger method is also available via the pbkrtest package. Model selection methods include step, drop1 and anova-like tables for random effects (ranova). Methods for Least-Square means (LS-means) and tests of linear contrasts of fixed effects are also available.
Maintained by Rune Haubo Bojesen Christensen. Last updated 4 years ago.
3.6 match 51 stars 13.00 score 13k scripts 90 dependentsvladimirholy
gasmodel:Generalized Autoregressive Score Models
Estimation, forecasting, and simulation of generalized autoregressive score (GAS) models of Creal, Koopman, and Lucas (2013) <doi:10.1002/jae.1279> and Harvey (2013) <doi:10.1017/cbo9781139540933>. Model specification allows for various data types and distributions, different parametrizations, exogenous variables, joint and separate modeling of exogenous variables and dynamics, higher score and autoregressive orders, custom and unconditional initial values of time-varying parameters, fixed and bounded values of coefficients, and missing values. Model estimation is performed by the maximum likelihood method.
Maintained by Vladimír Holý. Last updated 1 year ago.
8.6 match 14 stars 5.45 score 2 scriptsrunehaubo
ordinal:Regression Models for Ordinal Data
Implementation of cumulative link (mixed) models also known as ordered regression models, proportional odds models, proportional hazards models for grouped survival times and ordered logit/probit/... models. Estimation is via maximum likelihood and mixed models are fitted with the Laplace approximation and adaptive Gauss-Hermite quadrature. Multiple random effect terms are allowed and they may be nested, crossed or partially nested/crossed. Restrictions of symmetry and equidistance can be imposed on the thresholds (cut-points/intercepts). Standard model methods are available (summary, anova, drop-methods, step, confint, predict etc.) in addition to profile methods and slice methods for visualizing the likelihood function and checking convergence.
Maintained by Rune Haubo Bojesen Christensen. Last updated 3 months ago.
3.8 match 38 stars 12.41 score 1.3k scripts 178 dependentsbioc
RedeR:Interactive visualization and manipulation of nested networks
RedeR is an R-based package combined with a stand-alone Java application for interactive visualization and manipulation of nested networks. Graph, node, and edge attributes can be configured using either graphical or command-line methods, following igraph syntax rules.
Maintained by Mauro Castro. Last updated 5 months ago.
guigraphandnetworknetworknetworkenrichmentnetworkinferencesoftwaresystemsbiology
6.9 match 6.65 score 107 scripts 7 dependentsjamesramsay5
fda:Functional Data Analysis
These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files that work through many examples, including all but one of the 76 figures in this latter book. Matlab versions are available by ftp from <https://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/>.
Maintained by James Ramsay. Last updated 4 months ago.
3.8 match 3 stars 12.29 score 2.0k scripts 143 dependentstraitecoevo
hmde:Hierarchical Methods for Differential Equations
Wrapper for Stan that offers a number of in-built models to implement a hierarchical Bayesian longitudinal model for repeat observation data. Model choice selects the differential equation that is fit to the observations. Single and multi-individual models are available.
Maintained by Tess OBrien. Last updated 1 month ago.
bayesian-inverse-problemsbayesian-methodsdifferential-equationshierarchical-modelsrstanstancpp
8.3 match 3 stars 5.53 score 10 scriptsrrwen
nbc4va:Bayes Classifier for Verbal Autopsy Data
An implementation of the Naive Bayes Classifier (NBC) algorithm used for Verbal Autopsy (VA) built on code from Miasnikof et al (2015) <DOI:10.1186/s12916-015-0521-2>.
Maintained by Richard Wen. Last updated 3 years ago.
autopsybayescauseclassifiercodedcomputerdeathestimateimputationlearningmachinemdsmillionnaivenbcprobabilitystudytheoryvaverbal
10.0 match 4.60 score 79 scriptsbioc
MetaCyto:MetaCyto: A package for meta-analysis of cytometry data
This package provides functions for preprocessing, automated gating and meta-analysis of cytometry data. It also provides functions that facilitate the collection of cytometry data from the ImmPort database.
Maintained by Zicheng Hu. Last updated 5 months ago.
immunooncologycellbiologyflowcytometryclusteringstatisticalmethodsoftwarecellbasedassayspreprocessing
9.7 match 4.73 score 18 scriptsalexanderrobitzsch
TAM:Test Analysis Modules
Includes marginal maximum likelihood estimation and joint maximum likelihood estimation for unidimensional and multidimensional item response models. The package functionality covers the Rasch model, 2PL model, 3PL model, generalized partial credit model, multi-faceted Rasch model, nominal item response model, structured latent class model, mixture distribution IRT models, and located latent class models. Latent regression models and plausible value imputation are also supported. For details see Adams, Wilson and Wang, 1997 <doi:10.1177/0146621697211001>, Adams, Wilson and Wu, 1997 <doi:10.3102/10769986022001047>, Formann, 1982 <doi:10.1002/bimj.4710240209>, Formann, 1992 <doi:10.1080/01621459.1992.10475229>.
Maintained by Alexander Robitzsch. Last updated 6 months ago.
item-response-theoryopenblascpp
5.1 match 16 stars 8.93 score 258 scripts 25 dependents