R-universe search: examination

bioc

scRepertoire:A toolkit for single-cell immune receptor profiling

scRepertoire is a toolkit for processing and analyzing single-cell T-cell receptor (TCR) and immunoglobulin (Ig) data. The scRepertoire framework supports use of 10x, AIRR, BD, MiXCR, Omniscope, TRUST4, and WAT3R single-cell formats. The functionality includes basic clonal analyses, repertoire summaries, distance-based clustering and interaction with the popular Seurat and SingleCellExperiment/Bioconductor R workflows.

Maintained by Nick Borcherding. Last updated 2 months ago.

software immunooncology singlecell classification annotation sequencing cpp

16.0 match 326 stars 10.49 score 240 scripts

r-forge

R2MLwiN:Running 'MLwiN' from Within R

An R command interface to the 'MLwiN' multilevel modelling software package.

Maintained by Zhengzheng Zhang. Last updated 5 months ago.

11.4 match 5.35 score 125 scripts

sfirke

janitor:Simple Tools for Examining and Cleaning Dirty Data

The main janitor functions can: perfectly format data.frame column names; provide quick counts of variable combinations (i.e., frequency tables and crosstabs); and explore duplicate records. Other janitor functions nicely format the tabulation results. These tabulate-and-report functions approximate popular features of SPSS and Microsoft Excel. This package follows the principles of the "tidyverse" and works well with the pipe function %>%. janitor was built with beginning-to-intermediate R users in mind and is optimized for user-friendliness.

Maintained by Sam Firke. Last updated 3 months ago.

data-analysis data-cleaning data-science dirty-data excel pivot-tables spss tabulations tidyverse

2.9 match 1.4k stars 19.15 score 35k scripts 231 dependents

easystats

datawizard:Easy Data Wrangling and Statistical Transformations

A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.

Maintained by Etienne Bacher. Last updated 20 hours ago.

data dplyr hacktoberfest janitor manipulation reshape tidyr wrangling

3.4 match 222 stars 14.74 score 436 scripts 119 dependents

nickch-k

causaldata:Example Data Sets for Causal Inference Textbooks

Example data sets to run the example problems from causal inference textbooks. Currently contains data sets for Huntington-Klein, Nick (2021) "The Effect" <https://theeffectbook.net>, first and second edition, Cunningham, Scott (2021, ISBN-13: 978-0-300-25168-5) "Causal Inference: The Mixtape", and Hernán, Miguel and James Robins (2020) "Causal Inference: What If" <https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/>.

Maintained by Nick Huntington-Klein. Last updated 4 months ago.

6.6 match 136 stars 7.43 score 144 scripts 1 dependent

renkun-ken

rlist:A Toolbox for Non-Tabular Data Manipulation

Provides a set of functions for data manipulation with list objects, including mapping, filtering, grouping, sorting, updating, searching, and other useful functions. Most functions are designed to be pipeline friendly so that data processing with lists can be chained.

Maintained by Kun Ren. Last updated 2 years ago.

3.5 match 206 stars 13.73 score 2.2k scripts 123 dependents

edzer

hexbin:Hexagonal Binning Routines

Binning and plotting functions for hexagonal bins.

Maintained by Edzer Pebesma. Last updated 4 months ago.

fortran

3.4 match 37 stars 14.00 score 2.4k scripts 114 dependents

ropensci

rgbif:Interface to the Global Biodiversity Information Facility API

A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/developer/summary>). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.

Maintained by John Waller. Last updated 5 days ago.

gbif specimens api web-services occurrences species taxonomy biodiversity data lifewatch oscibio spocc

3.5 match 161 stars 13.26 score 2.1k scripts 20 dependents

hdarjus

exams.mylearn:Question Generation in the 'MyLearn' XML Format

Randomized multiple-select and single-select question generation for the 'MyLearn' teaching and learning platform. Question templates in the form of the R/exams package (see <http://www.r-exams.org/>) are transformed into XML format required by 'MyLearn'.

Maintained by Darjus Hosszejni. Last updated 4 years ago.

examination university

10.0 match 2 stars 4.00 score

wjakethompson

measr:Bayesian Psychometric Measurement Using 'Stan'

Estimate diagnostic classification models (also called cognitive diagnostic models) with 'Stan'. Diagnostic classification models are confirmatory latent class models, as described by Rupp et al. (2010, ISBN: 978-1-60623-527-0). Automatically generate 'Stan' code for the general loglinear cognitive diagnostic model proposed by Henson et al. (2009) <doi:10.1007/s11336-008-9089-5> and other subtypes that introduce additional model constraints. Using the generated 'Stan' code, estimate the model and evaluate the model's performance using model fit indices, information criteria, and reliability metrics.

Maintained by W. Jake Thompson. Last updated 2 months ago.

bayesian cdm cmdstanr cognitive-diagnosis cognitive-diagnostic-models dcm diagnostic-classification-models psychometrics rstan stan cpp

5.6 match 10 stars 6.75 score 31 scripts

tidyverse

readr:Read Rectangular Text Data

The goal of 'readr' is to provide a fast and friendly way to read rectangular data (like 'csv', 'tsv', and 'fwf'). It is designed to flexibly parse many types of data found in the wild, while still cleanly failing when data unexpectedly changes.

Maintained by Jennifer Bryan. Last updated 8 months ago.

csv fwf parsing cpp

1.8 match 1.0k stars 21.03 score 132k scripts 2.0k dependents

openintrostat

openintro:Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs

Supplemental functions and data for 'OpenIntro' resources, which include open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of the Windows operating system.

Maintained by Mine Çetinkaya-Rundel. Last updated 3 months ago.

data openintro

3.3 match 240 stars 11.39 score 6.0k scripts

usepa

httk:High-Throughput Toxicokinetics

Pre-made models that can be rapidly tailored to various chemicals and species using chemical-specific in vitro data and physiological information. These tools allow incorporation of chemical toxicokinetics ("TK") and in vitro-in vivo extrapolation ("IVIVE") into bioinformatics, as described by Pearce et al. (2017) (<doi:10.18637/jss.v079.i04>). Chemical-specific in vitro data characterizing toxicokinetics have been obtained from relatively high-throughput experiments. The chemical-independent ("generic") physiologically-based ("PBTK") and empirical (for example, one compartment) "TK" models included here can be parameterized with in vitro data or in silico predictions which are provided for thousands of chemicals, multiple exposure routes, and various species. High throughput toxicokinetics ("HTTK") is the combination of in vitro data and generic models. We establish the expected accuracy of HTTK for chemicals without in vivo data through statistical evaluation of HTTK predictions for chemicals where in vivo data do exist. The models are systems of ordinary differential equations that are developed in MCSim and solved using compiled (C-based) code for speed. A Monte Carlo sampler is included for simulating human biological variability (Ring et al., 2017 <doi:10.1016/j.envint.2017.06.004>) and propagating parameter uncertainty (Wambaugh et al., 2019 <doi:10.1093/toxsci/kfz205>). Empirically calibrated methods are included for predicting tissue:plasma partition coefficients and volume of distribution (Pearce et al., 2017 <doi:10.1007/s10928-017-9548-7>). These functions and data provide a set of tools for using IVIVE to convert concentrations from high-throughput screening experiments (for example, Tox21, ToxCast) to real-world exposures via reverse dosimetry (also known as "RTK") (Wetmore et al., 2015 <doi:10.1093/toxsci/kfv171>).

Maintained by John Wambaugh. Last updated 1 month ago.

comptox ord

3.6 match 27 stars 10.22 score 307 scripts 1 dependent

wviechtb

metadat:Meta-Analysis Datasets

A collection of meta-analysis datasets for teaching purposes, illustrating/testing meta-analytic methods, and validating published analyses.

Maintained by Wolfgang Viechtbauer. Last updated 4 days ago.

dataset datasets meta-analysis

3.4 match 30 stars 10.54 score 65 scripts 93 dependents

tidyverse

vroom:Read and Write Rectangular Text Data Quickly

The goal of 'vroom' is to read and write data (like 'csv', 'tsv' and 'fwf') quickly. When reading it uses a quick initial indexing step, then reads the values lazily, so only the data you actually use needs to be read. The writer formats the data in parallel and writes to disk asynchronously from formatting.

Maintained by Jennifer Bryan. Last updated 7 months ago.

csv csv-parser fixed-width-text tsv tsv-parser cpp

1.8 match 625 stars 17.78 score 4.5k scripts 2.1k dependents

spiritspeak

rapidsplithalf:A Fast Permutation-Based Split-Half Reliability Algorithm

Accurately estimates the reliability of cognitive tasks using a fast and flexible permutation-based split-half reliability algorithm that supports stratified splitting while maintaining equal split sizes. See Kahveci, Bathke, and Blechert (2022) <doi:10.31234/osf.io/ta59r> for details.

Maintained by Sercan Kahveci. Last updated 13 days ago.

cpp

6.7 match 4.78 score 5 scripts

lvclark

polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids

Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.

Maintained by Lindsay V. Clark. Last updated 10 days ago.

bioinformatics dna-sequencing genotype-likelihoods genotyping-by-sequencing hacktoberfest rad-seq rad-sequencing snp-genotyping cpp

4.5 match 28 stars 6.98 score 85 scripts

cardiomoon

moonBook:Functions and Datasets for the Book by Keon-Woong Moon

Several analysis-related functions for the book entitled "R statistics and graph for medical articles" (written in Korean), version 1, by Keon-Woong Moon, along with Korean demographic data and several plot functions.

Maintained by Keon-Woong Moon. Last updated 1 year ago.

3.3 match 37 stars 9.66 score 278 scripts 5 dependents

wenchao-ma

GDINA:The Generalized DINA Model Framework

A set of psychometric tools for cognitive diagnosis modeling based on the generalized deterministic inputs, noisy and gate (G-DINA) model by de la Torre (2011) <DOI:10.1007/s11336-011-9207-7> and its extensions, including the sequential G-DINA model by Ma and de la Torre (2016) <DOI:10.1111/bmsp.12070> for polytomous responses, and the polytomous G-DINA model by Chen and de la Torre <DOI:10.1177/0146621613479818> for polytomous attributes. Joint attribute distribution can be independent, saturated, higher-order, loglinear smoothed or structured. Q-matrix validation, item and model fit statistics, model comparison at test and item level and differential item functioning can also be conducted. A graphical user interface is also provided. For tutorials, please check Ma and de la Torre (2020) <DOI:10.18637/jss.v093.i14>, Ma and de la Torre (2019) <DOI:10.1111/emip.12262>, Ma (2019) <DOI:10.1007/978-3-030-05584-4_29> and de la Torre and Akbay (2019).

Maintained by Wenchao Ma. Last updated 1 month ago.

cdm cognitive-diagnosis dcm dina-model dino estimation-models gdina item-response-theory psychometrics openblas cpp

3.5 match 30 stars 8.92 score 94 scripts 6 dependents

bioc

BiocFHIR:Illustration of FHIR ingestion and transformation using R

FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. Transformation inspired by a kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.

Maintained by Vincent Carey. Last updated 5 months ago.

infrastructure dataimport datarepresentation fhir

5.3 match 4 stars 5.78 score 15 scripts

pharmaverse

pharmaversesdtm:SDTM Test Data for the 'Pharmaverse' Family of Packages

A set of Study Data Tabulation Model (SDTM) datasets from the Clinical Data Interchange Standards Consortium (CDISC) pilot project used for testing and developing Analysis Data Model (ADaM) datasets inside the pharmaverse family of packages. SDTM dataset specifications are described in the CDISC SDTM implementation guide, accessible by creating a free account on <https://www.cdisc.org/>.

Maintained by Edoardo Mancini. Last updated 1 day ago.

4.0 match 15 stars 7.46 score 143 scripts

cran

bnlearn:Bayesian Network Structure Learning, Parameter Learning and Inference

Bayesian network structure learning, parameter learning and inference. This package implements constraint-based (PC, GS, IAMB, Inter-IAMB, Fast-IAMB, MMPC, Hiton-PC, HPC), pairwise (ARACNE and Chow-Liu), score-based (Hill-Climbing and Tabu Search) and hybrid (MMHC, RSMAX2, H2PC) structure learning algorithms for discrete, Gaussian and conditional Gaussian networks, along with many score functions and conditional independence tests. The Naive Bayes and the Tree-Augmented Naive Bayes (TAN) classifiers are also implemented. Some utility functions (model comparison and manipulation, random data generation, arc orientation testing, simple and advanced plots) are included, as well as support for parameter estimation (maximum likelihood and Bayesian) and inference, conditional probability queries, cross-validation, bootstrap and model averaging. Development snapshots with the latest bugfixes are available from <https://www.bnlearn.com/>.

Maintained by Marco Scutari. Last updated 2 months ago.

openblas

3.8 match 57 stars 7.72 score 32 dependents

mjskay

ARTool:Aligned Rank Transform

The aligned rank transform for nonparametric factorial ANOVAs as described by Wobbrock, Findlater, Gergle, and Higgins (2011) <doi:10.1145/1978942.1978963>. Also supports aligned rank transform contrasts as described by Elkin, Kay, Higgins, and Wobbrock (2021) <doi:10.1145/3472749.3474784>.

Maintained by Matthew Kay. Last updated 3 years ago.

3.3 match 60 stars 8.64 score 307 scripts

tmsalab

edmdata:Data Sets for Psychometric Modeling

Collection of data sets from various assessments that can be used to evaluate psychometric models. These data sets have been analyzed in the following papers that introduced new methodology as part of the application section: Jimenez, A., Balamuta, J. J., & Culpepper, S. A. (2023) <doi:10.1111/bmsp.12307>, Culpepper, S. A., & Balamuta, J. J. (2021) <doi:10.1080/00273171.2021.1985949>, Yinghan Chen et al. (2021) <doi:10.1007/s11336-021-09750-9>, Yinyin Chen et al. (2020) <doi:10.1007/s11336-019-09693-2>, Culpepper, S. A. (2019a) <doi:10.1007/s11336-019-09683-4>, Culpepper, S. A. (2019b) <doi:10.1007/s11336-018-9643-8>, Culpepper, S. A., & Chen, Y. (2019) <doi:10.3102/1076998618791306>, Culpepper, S. A., & Balamuta, J. J. (2017) <doi:10.1007/s11336-015-9484-7>, and Culpepper, S. A. (2015) <doi:10.3102/1076998615595403>.

Maintained by James Joseph Balamuta. Last updated 6 months ago.

cognitive-diagnostic-models data edm

6.8 match 5 stars 4.18 score 7 scripts 1 dependent

scottkosty

bootstrap:Functions for the Book "An Introduction to the Bootstrap"

Software (bootstrap, cross-validation, jackknife) and data for the book "An Introduction to the Bootstrap" by B. Efron and R. Tibshirani, 1993, Chapman and Hall. This package is primarily provided for projects already based on it, and for support of the book. New projects should preferentially use the recommended package "boot".

Maintained by Scott Kostyshak. Last updated 6 years ago.

fortran

3.6 match 7.62 score 890 scripts 30 dependents

nerler

JointAI:Joint Analysis and Imputation of Incomplete Data

Joint analysis and imputation of incomplete data in the Bayesian framework, using (generalized) linear (mixed) models and extensions thereof, survival models, or joint models for longitudinal and survival data, as described in Erler, Rizopoulos and Lesaffre (2021) <doi:10.18637/jss.v100.i20>. Incomplete covariates, if present, are automatically imputed. The package performs some preprocessing of the data and creates a 'JAGS' model, which will then automatically be passed to 'JAGS' <https://mcmc-jags.sourceforge.io/> with the help of the package 'rjags'.

Maintained by Nicole S. Erler. Last updated 12 months ago.

bayesian generalized-linear-models glm glmm imputation imputations jags joint-analysis linear-mixed-models linear-regression-models mcmc-sample mcmc-sampling missing-data missing-values survival cpp

3.4 match 28 stars 7.30 score 59 scripts 1 dependent

nutriverse

zscorer:Child Anthropometry z-Score Calculator

A tool for calculating z-scores and centiles for weight-for-age, length/height-for-age, weight-for-length/height, BMI-for-age, head circumference-for-age, arm circumference-for-age, subscapular skinfold-for-age, and triceps skinfold-for-age based on the WHO Child Growth Standards.

Maintained by Ernest Guevarra. Last updated 4 years ago.

anthropometric-indices anthropometry growth-charts growth-standards height-for-age nutrition weight-for-age weight-for-height z-score

3.4 match 14 stars 7.30 score 47 scripts 1 dependent

veronica0206

nlpsem:Linear and Nonlinear Longitudinal Process in Structural Equation Modeling Framework

Provides computational tools for nonlinear longitudinal models, in particular the intrinsically nonlinear models, in four scenarios: (1) univariate longitudinal processes with growth factors, with or without covariates including time-invariant covariates (TICs) and time-varying covariates (TVCs); (2) multivariate longitudinal processes that facilitate the assessment of correlation or causation between multiple longitudinal variables; (3) multiple-group models for scenarios (1) and (2) to evaluate differences among manifested groups; and (4) longitudinal mixture models for scenarios (1) and (2), with an assumption that trajectories are from multiple latent classes. The methods implemented are introduced in Jin Liu (2023) <arXiv:2302.03237v2>.

Maintained by Jin Liu. Last updated 4 months ago.

3.6 match 145 stars 6.91 score 16 scripts

kharchenkolab

pagoda2:Single Cell Analysis and Differential Expression

Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization, gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within their data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.

Maintained by Evan Biederstedt. Last updated 1 year ago.

scrna-seq single-cell single-cell-rna-seq transcriptomics openblas cpp openmp

3.1 match 222 stars 8.00 score 282 scripts

bioc

RImmPort:RImmPort: Enabling Ready-for-analysis Immunology Research Data

The RImmPort package simplifies access to ImmPort data for analysis in the R environment. It provides a standards-based interface to the ImmPort study data that is in a proprietary format.

Maintained by Zicheng Hu. Last updated 5 months ago.

biomedicalinformatics dataimport datarepresentation

5.7 match 4.33 score 27 scripts

openpharma

crmPack:Object-Oriented Implementation of CRM Designs

Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to set up a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules. Further details are presented in Sabanes Bove et al. (2019) <doi:10.18637/jss.v089.i10>.

Maintained by Daniel Sabanes Bove. Last updated 2 months ago.

jags cpp

3.0 match 21 stars 7.79 score 208 scripts

billdenney

PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis

Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.

Maintained by Bill Denney. Last updated 18 days ago.

nca noncompartmental-analysis pharmacokinetics

1.7 match 73 stars 12.61 score 214 scripts 4 dependents

florianstijven

Surrogate:Evaluation of Surrogate Endpoints in Clinical Trials

In a clinical trial, it frequently occurs that the most credible outcome to evaluate the effectiveness of a new therapy (the true endpoint) is difficult to measure. In such a situation, it can be an effective strategy to replace the true endpoint by a (bio)marker that is easier to measure and that allows for a prediction of the treatment effect on the true endpoint (a surrogate endpoint). The package 'Surrogate' allows for an evaluation of the appropriateness of a candidate surrogate endpoint based on the meta-analytic, information-theoretic, and causal-inference frameworks. Part of this software has been developed using funding provided from the European Union's Seventh Framework Programme for research, technological development and demonstration (Grant Agreement no 602552), the Special Research Fund (BOF) of Hasselt University (BOF-number: BOF2OCPO3), GlaxoSmithKline Biologicals, Baekeland Mandaat (HBC.2022.0145), and Johnson & Johnson Innovative Medicine.

Maintained by Wim Van Der Elst. Last updated 19 days ago.

3.3 match 1 star 6.42 score 133 scripts

cddesja

profileR:Profile Analysis of Multivariate Data in R

A suite of multivariate methods and data visualization tools to implement profile analysis and cross-validation techniques described in Davison & Davenport (2002) <doi:10.1037/1082-989X.7.4.468>, Bulut (2013), and other published and unpublished resources. The package includes routines to perform criterion-related profile analysis, profile analysis via multidimensional scaling, moderated profile analysis, generalizability theory, profile analysis by group, and a within-person factor model to derive score profiles.

Maintained by Christopher David Desjardins. Last updated 2 years ago.

3.8 match 3 stars 5.65 score 50 scripts

predictiveecology

reproducible:Enhance Reproducibility of R Code

A collection of high-level, machine- and OS-independent tools for making reproducible and reusable content in R. The two workhorse functions are Cache() and prepInputs(). Cache() allows for nested caching, is robust to environments and objects with environments (like functions), and deals with some classes of file-backed R objects e.g., from terra and raster packages. Both functions have been developed to be foundational components of data retrieval and processing in continuous workflow situations. In both functions, efforts are made to make the first and subsequent calls of functions have the same result, but faster at subsequent times by way of checksums and digesting. Several features are still under development, including cloud storage of cached objects allowing for sharing between users. Several advanced options are available, see ?reproducibleOptions().

Maintained by Eliot J B McIntire. Last updated 1 month ago.

reproducibility reproducible-research

2.0 match 41 stars 10.52 score 122 scripts 15 dependents

andreanini

idiolect:Forensic Authorship Analysis

Carry out comparative authorship analysis of disputed and undisputed texts within the Likelihood Ratio Framework for expressing evidence in forensic science. This package contains implementations of well-known algorithms for comparative authorship analysis, such as Smith and Aldridge's (2011) Cosine Delta <doi:10.1080/09296174.2011.533591> or Koppel and Winter's (2014) Impostors Method <doi:10.1002/asi.22954>, as well as functions to measure their performance and to calibrate their outputs into Log-Likelihood Ratios.

Maintained by Andrea Nini. Last updated 11 days ago.

3.3 match 14 stars 6.12 score 3 scripts

tjheaton

carbondate:Calibration and Summarisation of Radiocarbon Dates

Performs Bayesian non-parametric calibration of multiple related radiocarbon determinations, and summarises the calendar age information to plot their joint calendar age density (see Heaton (2022) <doi:10.1111/rssc.12599>). Also models the occurrence of radiocarbon samples as a variable-rate (inhomogeneous) Poisson process, plotting the posterior estimate for the occurrence rate of the samples over calendar time, and providing information about potential change points.

Maintained by Timothy J Heaton. Last updated 2 months ago.

cpp

3.5 match 5 stars 5.78 score 20 scripts

danlwarren

rwty:R We There Yet? Visualizing MCMC Convergence in Phylogenetics

Implements various tests, visualizations, and metrics for diagnosing convergence of MCMC chains in phylogenetics. It implements and automates many of the functions of the AWTY package in the R environment, as well as a host of other functions. See Warren, Geneva, and Lanfear (2017) <doi:10.1093/molbev/msw279>.

Maintained by Dan Warren. Last updated 4 years ago.

2.8 match 30 stars 7.32 score 117 scripts

eltebioinformatics

mulea:Enrichment Analysis Using Multiple Ontologies and False Discovery Rate

Background - Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. Results - mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. Conclusions - mulea is distributed as a CRAN R package. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.

Maintained by Tamas Stirling. Last updated 3 months ago.

annotation differentialexpression geneexpression genesetenrichment go graphandnetwork multiplecomparison pathways reactome software transcription visualization enrichment enrichment-analysis functional-enrichment-analysis gene-set-enrichment ontologies transcriptomics cpp

2.7 match 28 stars 7.36 score 34 scripts

bioc

weitrix:Tools for matrices with precision weights, test and explore weighted or sparse data

Data type and tools for working with matrices having precision weights and missing data. This package provides a common representation and tools that can be used with many types of high-throughput data. The meaning of the weights is compatible with usage in the base R function "lm" and the package "limma". Calibrate weights to account for known predictors of precision. Find rows with excess variability. Perform differential testing and find rows with the largest confident differences. Find PCA-like components of variation even with many missing values, rotated so that individual components may be meaningfully interpreted. DelayedArray matrices and BiocParallel are supported.

Maintained by Paul Harrison. Last updated 5 months ago.

software datarepresentation dimensionreduction geneexpression transcriptomics rnaseq singlecell regression

4.2 match 4.70 score 8 scripts

bodkan

admixr:An Interface for Running 'ADMIXTOOLS' Analyses

An interface for performing all stages of 'ADMIXTOOLS' analyses (<https://reich.hms.harvard.edu/software>) entirely from R. Wrapper functions (D, f4, f3, etc.) completely automate the generation of intermediate configuration files, run 'ADMIXTOOLS' programs on the command-line, and parse output files to extract values of interest. This allows users to focus on the analysis itself instead of worrying about low-level technical details. A set of complementary functions for processing and filtering of data in the 'EIGENSTRAT' format is also provided.

Maintained by Martin Petr. Last updated 28 days ago.

bioinformatics popgen population-genetics

2.6 match 29 stars 7.42 score 91 scripts

vegawidget

vegawidget:'Htmlwidget' for 'Vega' and 'Vega-Lite'

'Vega' and 'Vega-Lite' parse text in 'JSON' notation to render chart-specifications into 'HTML'. This package is used to facilitate the rendering. It also provides a means to interact with signals, events, and datasets in a 'Vega' chart using 'JavaScript' or 'Shiny'.

Maintained by Ian Lyttle. Last updated 1 year ago.

2.3 match 68 stars 8.04 score 49 scripts 4 dependents

statnet

ergm:Fit, Simulate and Diagnose Exponential-Family Models for Networks

An integrated set of tools to analyze and simulate networks based on exponential-family random graph models (ERGMs). 'ergm' is a part of the Statnet suite of packages for network analysis. See Hunter, Handcock, Butts, Goodreau, and Morris (2008) <doi:10.18637/jss.v024.i03> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>.

Maintained by Pavel N. Krivitsky. Last updated 8 days ago.

1.2 match 100 stars 15.36 score 1.4k scripts 36 dependents

jbryer

TriMatch:Propensity Score Matching of Non-Binary Treatments

Propensity score matching for non-binary treatments.

Maintained by Jason Bryer. Last updated 7 years ago.

3.4 match 13 stars 5.27 score 32 scripts 1 dependent

dicook

nullabor:Tools for Graphical Inference

Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.

Maintained by Di Cook. Last updated 1 month ago.

1.7 match 57 stars 10.38 score 370 scripts 2 dependents

raydanner

SongEvo:An Individual-Based Model of Bird Song Evolution

Simulates the cultural evolution of quantitative traits of bird song. 'SongEvo' is an individual- (agent-) based model. 'SongEvo' is spatially-explicit and can be parameterized with, and tested against, measured song data. Functions are available for model implementation, sensitivity analyses, parameter optimization, model validation, and hypothesis testing.

Maintained by Raymond Danner. Last updated 5 years ago.

3.8 match 2 stars 4.51 score 16 scripts

rpruim

NHANES:Data from the US National Health and Nutrition Examination Study

Body Shape and related measurements from the US National Health and Nutrition Examination Survey (NHANES, 1999-2004). See http://www.cdc.gov/nchs/nhanes.htm for details.

Maintained by Randall Pruim. Last updated 10 years ago.

3.4 match 5.10 score 880 scripts

hugaped

MBNMAdose:Dose-Response MBNMA Models

Fits Bayesian dose-response model-based network meta-analysis (MBNMA) models that incorporate multiple doses within an agent by modelling different dose-response functions, as described by Mawdsley et al. (2016) <doi:10.1002/psp4.12091>. By modelling dose-response relationships this can connect networks of evidence that might otherwise be disconnected, and can improve precision on treatment estimates. Several common dose-response functions are provided; others may be added by the user. Various characteristics and assumptions can be flexibly added to the models, such as shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting at the treatment level.

Maintained by Hugo Pedder. Last updated 1 month ago.

jags cpp

2.6 match 10 stars 6.60 score

rsetienne

secsse:Several Examined and Concealed States-Dependent Speciation and Extinction

Simultaneously infers state-dependent diversification across two or more states of a single or multiple traits while accounting for the role of a possible concealed trait. See Herrera-Alsina et al. (2019) <doi:10.1093/sysbio/syy057>.

Maintained by Rampal S. Etienne. Last updated 11 months ago.

cpp

2.9 match 1 star 5.83 score 34 scripts

jmgirard

circumplex:Analysis and Visualization of Circular Data

Circumplex models, which organize constructs in a circle around two underlying dimensions, are popular for studying interpersonal functioning, mood/affect, and vocational preferences/environments. This package provides tools for analyzing and visualizing circular data, including scoring functions for relevant instruments and a generalization of the bootstrapped structural summary method from Zimmermann & Wright (2017) <doi:10.1177/1073191115621795> and functions for creating publication-ready tables and figures from the results.

Maintained by Jeffrey Girard. Last updated 5 months ago.

circular circumplex data-analysis ggplot2 interpersonal psychology rcpparmadillo tidyverse openblas cpp openmp

2.5 match 11 stars 6.54 score 52 scripts

cldossantos

pacu:Precision Agriculture Computational Utilities

Support for a variety of commonly used precision agriculture operations. Includes functions to download and process raw satellite images from Sentinel-2 <https://documentation.dataspace.copernicus.eu/APIs/OData.html>. Includes functions that download vegetation index statistics for a given period of time, without the need to download the raw images <https://documentation.dataspace.copernicus.eu/APIs/SentinelHub/Statistical.html>. There are also functions to download and visualize weather data in a historical context. Lastly, the package also contains functions to process yield monitor data. These functions can build polygons around recorded data points, evaluate the overlap between polygons, clean yield data, and smooth yield maps.

Maintained by dos Santos Caio. Last updated 5 days ago.

2.4 match 14 stars 6.82 score 9 scripts

cjvanlissa

tidySEM:Tidy Structural Equation Modeling

A tidy workflow for generating, estimating, reporting, and plotting structural equation models using 'lavaan', 'OpenMx', or 'Mplus'. Throughout this workflow, elements of syntax, results, and graphs are represented as 'tidy' data, making them easy to customize. Includes functionality to estimate latent class analyses, and to plot 'dagitty' and 'igraph' objects.

Maintained by Caspar J. van Lissa. Last updated 9 days ago.

1.5 match 58 stars 10.69 score 330 scripts 1 dependent

tunsmart

interactionR:Full Reporting of Interaction Analyses

Produces a publication-ready table that includes all effect estimates necessary for full reporting of effect modification and interaction analysis, as recommended by Knol and Vanderweele (2012) [<doi:10.1093/ije/dyr218>]. It also estimates confidence intervals for the trio of additive interaction measures using the delta method (see Hosmer and Lemeshow (1992), [<doi:10.1097/00001648-199209000-00012>]), the variance recovery method (see Zou (2008), [<doi:10.1093/aje/kwn104>]), or percentile bootstrapping (see Assmann et al. (1996), [<doi:10.1097/00001648-199605000-00012>]).

Maintained by Babatunde Alli. Last updated 1 year ago.

3.2 match 20 stars 4.92 score 84 scripts

mgirlich

tibblify:Rectangle Nested Lists

A tool to rectangle a nested list, that is, to convert it into a tibble. This is done automatically or according to a given specification. A common use case is for nested lists coming from parsing JSON files or the JSON response of REST APIs. It is supported by the 'vctrs' package and therefore offers wide support for vector types.

Maintained by Maximilian Girlich. Last updated 1 year ago.

cpp

2.0 match 68 stars 7.85 score 50 scripts 7 dependents

briencj

dae:Functions Useful in the Design and ANOVA of Experiments

The content falls into the following groupings: (i) Data, (ii) Factor manipulation functions, (iii) Design functions, (iv) ANOVA functions, (v) Matrix functions, (vi) Projector and canonical efficiency functions, and (vii) Miscellaneous functions. A vignette called 'DesignNotes' describes how to use the design functions for randomizing and assessing designs. The ANOVA functions facilitate the extraction of information when the 'Error' function has been used in the call to 'aov'. The package 'dae' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 4 months ago.

1.8 match 1 star 8.62 score 356 scripts 7 dependents

oscarkjell

text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning

Link R with Transformers from Hugging Face to transform text variables to word embeddings, where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions, etc. For more information see <https://www.r-text.org>.

Maintained by Oscar Kjell. Last updated 5 days ago.

deep-learning machine-learning nlp transformers openjdk

1.2 match 146 stars 13.16 score 436 scripts 1 dependent

matildabrown

hyperoverlap:Overlap Detection in n-Dimensional Space

Uses support vector machines to identify a perfectly separating hyperplane (linear or curvilinear) between two entities in high-dimensional space. If this plane exists, the entities do not overlap. Applications include overlap detection in morphological, resource or environmental dimensions. More details can be found in: Brown et al. (2020) <doi:10.1111/2041-210X.13363>.

Maintained by Matilda Brown. Last updated 4 years ago.

3.6 match 4 stars 4.30 score 7 scripts

gaynorr

AlphaSimR:Breeding Program Simulations

The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) <doi:10.3835/plantgenome2016.02.0013>]. Used for stochastic simulations of breeding programs to the level of DNA sequence for every individual. Contained is a wide range of functions for modeling common tasks in a breeding program, such as selection and crossing. These functions allow for constructing simulations of highly complex plant and animal breeding programs via scripting in the R software environment. Such simulations can be used to evaluate overall breeding program performance and conduct research into breeding program design, such as implementation of genomic selection. Included is the 'Markovian Coalescent Simulator' ('MaCS') for fast simulation of biallelic sequences according to a population demographic history [Chen et al. (2009) <doi:10.1101/gr.083634.108>].

Maintained by Chris Gaynor. Last updated 5 months ago.

breeding genomics simulation openblas cpp openmp

1.5 match 47 stars 10.22 score 534 scripts 2 dependents

bioc

memes:motif matching, comparison, and de novo discovery using the MEME Suite

A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data-aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.

Maintained by Spencer Nystrom. Last updated 5 months ago.

dataimport functionalgenomics generegulation motifannotation motifdiscovery sequencematching software

1.8 match 49 stars 8.68 score 117 scripts 1 dependent

gabriellajg

equaltestMI:Examine Measurement Invariance via Equivalence Testing and Projection Method

Functions for examining measurement invariance via equivalence testing are included in this package. The traditionally used RMSEA (Root Mean Square Error of Approximation) cutoff values are adjusted based on simulation results. In addition, a projection-based method is implemented to test the equality of latent factor means across groups without assuming the equality of intercepts. For more information, see Yuan, K. H., & Chan, W. (2016) <doi:10.1037/met0000080>, Deng, L., & Yuan, K. H. (2016) <doi:10.1007/s11336-015-9491-8>, and Jiang, G., Mai, Y., & Yuan, K. H. (2017) <doi:10.3389/fpsyg.2017.01823>.

Maintained by Ge Jiang. Last updated 4 years ago.

3.3 match 1 star 4.58 score 19 scripts

cb4ds

DGEobj:Differential Gene Expression (DGE) Analysis Results Data Object

Provides a flexible container to manage and annotate Differential Gene Expression (DGE) analysis results (Smythe et al. (2015) <doi:10.1093/nar/gkv007>). The DGEobj has data slots for row (gene), col (samples), assays (matrix n-rows by m-samples dimensions) and metadata (not keyed to row, col, or assays). A set of accessory functions to deposit, query and retrieve subsets of a data workflow has been provided. Attributes are used to capture metadata such as species and gene model, including reproducibility information such that a 3rd party can access a DGEobj history to see how each data object was created or modified. Since the DGEobj is customizable and extensible it is not limited to RNA-seq analysis types of workflows -- it can accommodate nearly any data analysis workflow that starts from a matrix of assays (rows) by samples (columns).

Maintained by Connie Brett. Last updated 2 months ago.

2.7 match 2 stars 5.60 score 33 scripts 2 dependents

emitanaka

edibble:Encapsulating Elements of Experimental Design

A system to facilitate designing comparative (and non-comparative) experiments using the grammar of experimental designs <https://emitanaka.org/edibble-book/>. An experimental design is treated as an intermediate, mutable object that is built progressively by fundamental experimental components like units, treatments, and their relation. The system aids in experimental planning, management and workflow.

Maintained by Emi Tanaka. Last updated 4 months ago.

experimental-designs

2.0 match 217 stars 7.43 score 62 scripts

keithmcnulty

peopleanalyticsdata:Data Sets for Keith McNulty's Handbook of Regression Modeling in People Analytics

Data sets for statistical inference modeling related to People Analytics. Contains various data sets from the book 'Handbook of Regression Modeling in People Analytics' by Keith McNulty (2020).

Maintained by Keith McNulty. Last updated 4 years ago.

4.0 match 6 stars 3.71 score 17 scripts

cran

excessmort:Excess Mortality

Implementation of a method for estimating excess mortality and other health-related outcomes from weekly or daily count data, described in Acosta and Irizarry (2021) "A Flexible Statistical Framework for Estimating Excess Mortality".

Maintained by Rafael A. Irizarry. Last updated 4 months ago.

5.4 match 2.70 score 25 scripts

maeveupton

reslr:Modelling Relative Sea Level Data

The Bayesian modelling of relative sea-level data using a comprehensive approach that incorporates various statistical models within a unifying framework. The statistical models are: linear regression (Ashe et al 2019) <doi:10.1016/j.quascirev.2018.10.032>, change point models (Cahill et al 2015) <doi:10.1088/1748-9326/10/8/084002>, integrated Gaussian process models (Cahill et al 2015) <doi:10.1214/15-AOAS824>, temporal splines (Upton et al 2023) <arXiv:2301.09556>, spatio-temporal splines (Upton et al 2023) <arXiv:2301.09556> and generalised additive models (Upton et al 2023) <arXiv:2301.09556>. This package facilitates data loading, model fitting and result summarisation. Notably, it accommodates the inherent measurement errors found in relative sea-level data across multiple dimensions, allowing for their inclusion in the statistical models.

Maintained by Maeve Upton. Last updated 1 year ago.

jags cpp

2.8 match 4 stars 5.23 score 28 scripts

allanvc

mRpostman:An IMAP Client for R

An easy-to-use IMAP client that provides tools for message searching, selective fetching of message attributes, mailbox management, attachment extraction, and several other IMAP features, paving the way for e-mail data analysis in R.

Maintained by Allan Quadros. Last updated 6 months ago.

2.5 match 31 stars 5.92 score 18 scripts

pmartr

pmartR:Panomics Marketplace - Quality Control and Statistical Analysis for Panomics Data

Provides functionality for quality control processing and statistical analysis of mass spectrometry (MS) omics data, in particular proteomic (either at the peptide or the protein level), lipidomic, and metabolomic data, as well as RNA-seq based count data and nuclear magnetic resonance (NMR) data. This includes data transformation, specification of groups that are to be compared against each other, filtering of features and/or samples, data normalization, data summarization (correlation, PCA), and statistical comparisons between defined groups. Implements methods described in: Webb-Robertson et al. (2014) <doi:10.1074/mcp.M113.030932>. Webb-Robertson et al. (2011) <doi:10.1002/pmic.201100078>. Matzke et al. (2011) <doi:10.1093/bioinformatics/btr479>. Matzke et al. (2013) <doi:10.1002/pmic.201200269>. Polpitiya et al. (2008) <doi:10.1093/bioinformatics/btn217>. Webb-Robertson et al. (2010) <doi:10.1021/pr1005247>.

Maintained by Lisa Bramer. Last updated 5 days ago.

data-summarization lipids mass-spectrometry metabolites metabolomics-data peptides proteins rna-seq-analysis openblas cpp

1.9 match 40 stars 7.69 score 144 scripts

bioc

scDblFinder:scDblFinder

The scDblFinder package gathers various methods for the detection and handling of doublets/multiplets in single-cell sequencing data (i.e. multiple cells captured within the same droplet or reaction volume). It includes methods formerly found in the scran package, the new fast and comprehensive scDblFinder method, and a reimplementation of the Amulet detection method for single-cell ATAC-seq.

Maintained by Pierre-Luc Germain. Last updated 2 months ago.

preprocessing singlecell rnaseq atacseq doublets single-cell

1.2 match 184 stars 12.34 score 888 scripts 1 dependent

fsbmat-ufv

ssmodels:Sample Selection Models

To make it easier to fit the sample selection models that exist in the literature, we created the 'ssmodels' package. The package fits the classic Heckman model (Heckman (1976), Heckman (1979) <doi:10.2307/1912352>), estimating its parameters via maximum likelihood or the two-step method, and also fits the Heckman-t model, introduced by Marchenko and Genton (2012) <doi:10.1080/01621459.2012.656011>, and the Heckman-Skew model, introduced by Ogundimu and Hutton (2016) <doi:10.1111/sjos.12171>. We also implemented functions to fit the generalized version of the Heckman model, introduced by Bastos, Barreto-Souza, and Genton (2021) <doi:10.5705/ss.202021.0068>, which allows covariates to be included in the dispersion and correlation parameters, and a function to fit the Heckman-BS model, introduced by Bastos and Barreto-Souza (2020) <doi:10.1080/02664763.2020.1780570>, which uses the Birnbaum-Saunders distribution as the joint distribution of the selection and primary regression variables.

Maintained by Fernando de Souza Bastos. Last updated 2 years ago.

3.5 match 1 star 4.00 score 5 scripts

alicepaul

HDSinRdata:Data for the 'Mastering Health Data Science Using R' Online Textbook

Contains ten datasets used in the chapters and exercises of Paul, Alice (2023) "Health Data Science in R" <https://alicepaul.github.io/health-data-science-using-r/>.

Maintained by Alice Paul. Last updated 3 months ago.

3.4 match 1 star 4.09 score 41 scripts

insongkim

PanelMatch:Matching Methods for Causal Inference with Time-Series Cross-Sectional Data

Implements a set of methodological tools that enable researchers to apply matching methods to time-series cross-sectional data. Imai, Kim, and Wang (2018) <http://web.mit.edu/insong/www/pdf/tscs.pdf> propose a nonparametric generalization of the difference-in-differences estimator, which does not rely on the linearity assumption as often done in practice. Researchers first select a method of matching each treated observation for a given unit in a particular time period with control observations from other units in the same time period that have a similar treatment and covariate history. These methods include standard matching methods based on propensity score and Mahalanobis distance, as well as weighting methods. Once matching is done, both short-term and long-term average treatment effects for the treated can be estimated with standard errors. The package also offers a visualization technique that allows researchers to assess the quality of matches by examining the resulting covariate balance.

Maintained by In Song Kim. Last updated 18 days ago.

cpp

1.8 match 121 stars 7.70 score 104 scripts

cran

rosetta:Parallel Use of Statistical Packages in Teaching

When teaching statistics, it can often be desirable to uncouple the content from specific software packages. To ease such efforts, the Rosetta Stats website (<https://rosettastats.com>) allows comparing analyses in different packages. This package is the companion to the Rosetta Stats website, aiming to provide functions that produce output that is similar to output from other statistical packages, thereby facilitating 'software-agnostic' teaching of statistics.

Maintained by Gjalt-Jorn Peters. Last updated 2 years ago.

5.0 match 2.70 score

bioc

BiocPkgTools:Collection of simple tools for learning about Bioconductor Packages

Bioconductor has a rich ecosystem of metadata around packages, usage, and build status. This package is a simple collection of functions to access that metadata from R. The goal is to expose metadata for data mining and value-added functionality such as package searching, text mining, and analytics on packages.

Maintained by Sean Davis. Last updated 14 days ago.

software infrastructure bioconductor metadata

1.8 match 21 stars 7.67 score 68 scripts

kenkellner

jagsUI:A Wrapper Around 'rjags' to Streamline 'JAGS' Analyses

A set of wrappers around 'rjags' functions to run Bayesian analyses in 'JAGS' (specifically, via 'libjags'). A single function call can control adaptive, burn-in, and sampling MCMC phases, with MCMC chains run in sequence or in parallel. Posterior distributions are automatically summarized (with the ability to exclude some monitored nodes if desired) and functions are available to generate figures based on the posteriors (e.g., predictive check plots, traceplots). Function inputs, argument syntax, and output format are nearly identical to the 'R2WinBUGS'/'R2OpenBUGS' packages to allow easy switching between MCMC samplers.

Maintained by Ken Kellner. Last updated 1 month ago.

jags cpp

1.3 match 35 stars 10.02 score 1.4k scripts 7 dependents

canmod

macpan2:Fast and Flexible Compartmental Modelling

Fast and flexible compartmental modelling with Template Model Builder.

Maintained by Steve Walker. Last updated 17 hours ago.

compartmental-models epidemiology forecasting mixed-effects model-fitting optimization simulation simulation-modeling cpp

1.5 match 4 stars 8.90 score 246 scripts 1 dependent

tyee001

VGAMdata:Data Supporting the 'VGAM' Package

Mainly data sets to accompany the VGAM package and the book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7>. These are used to illustrate vector generalized linear and additive models (VGLMs/VGAMs), and associated models (Reduced-Rank VGLMs, Quadratic RR-VGLMs, Row-Column Interaction Models, and constrained and unconstrained ordination models in ecology). This package now contains some old VGAM family functions which have been replaced by newer ones (often because they are now special cases).

Maintained by Thomas Yee. Last updated 1 month ago.

4.5 match 1 star 2.94 score 95 scripts 1 dependent

profpetrie

regclass:Tools for an Introductory Class in Regression and Modeling

Contains basic tools for visualizing, interpreting, and building regression models. It has been designed for use with the book Introduction to Regression and Modeling with R by Adam Petrie, Cognella Publishers, ISBN: 978-1-63189-250-9 <https://titles.cognella.com/introduction-to-regression-and-modeling-with-r-9781631892509>.

Maintained by Adam Petrie. Last updated 5 years ago.

3.4 match 3.90 score 301 scripts 1 dependent

bwiernik

configural:Multivariate Profile Analysis

R functions for criterion profile analysis, Davison and Davenport (2002) <doi:10.1037/1082-989X.7.4.468> and meta-analytic criterion profile analysis, Wiernik, Wilmot, Davison, and Ones (2020) <doi:10.1037/met0000305>. Sensitivity analyses to aid in interpreting criterion profile analysis results are also included.

Maintained by Brenton M. Wiernik. Last updated 1 year ago.

3.3 match 4 stars 3.96 score 23 scripts

melissa-wong

pomcheckr:Graphical Check for Proportional Odds Assumption

Implements the method described at the UCLA Statistical Consulting site <https://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/> for checking if the proportional odds assumption holds for a cumulative logit model.

Maintained by Melissa Wong. Last updated 4 years ago.

3.4 match 1 star 3.78 score 12 scripts

bms63

admiral.test:Test Data for the 'admiral' Package

A set of Study Data Tabulation Model (SDTM) datasets from the Clinical Data Interchange Standards Consortium (CDISC) pilot project used for testing and developing Analysis Data Model (ADaM) derivations inside the 'admiral' package.

Maintained by Ben Straub. Last updated 2 years ago.

4.0 match 3.19 score 310 scripts

bpfaff

QRM:Provides R-Language Code to Examine Quantitative Risk Management Concepts

Provides functions/methods to accompany the book Quantitative Risk Management: Concepts, Techniques and Tools by Alexander J. McNeil, Ruediger Frey, and Paul Embrechts.

Maintained by Bernhard Pfaff. Last updated 5 years ago.

cpp

2.8 match 4.53 score 181 scripts 5 dependents

bioc

celda:CEllular Latent Dirichlet Allocation

Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.

Maintained by Joshua Campbell. Last updated 30 days ago.

singlecell geneexpression clustering sequencing bayesian immunooncology dataimport cpp openmp

1.2 match 147 stars 10.47 score 256 scripts 2 dependents

bioc

igvR:igvR: integrative genomics viewer

Access to igv.js, the Integrative Genomics Viewer running in a web browser.

Maintained by Arkadiusz Gladki. Last updated 5 months ago.

visualization thirdpartyclient genomebrowsers

1.5 match 43 stars 8.31 score 118 scripts

kjhealy

gssrdoc:Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.

Maintained by Kieran Healy. Last updated 11 months ago.

5.4 match 2.28 score 38 scripts

jonathanlees

ProfessR:Grades Setting and Exam Maker

Programs to determine student grades and create examinations from question banks. Programs will create numerous multiple choice exams, randomly shuffled, for different versions of the same question list.

Maintained by Jonathan M. Lees. Last updated 2 years ago.

4.1 match 2.91 score 41 scripts

predictiveecology

SpaDES.project:Project Templates Using 'SpaDES'

Quickly set up 'SpaDES' project directories and add modules using templates.

Maintained by Eliot J B McIntire. Last updated 3 days ago.

1.8 match 3 stars 6.66 score 21 scripts

andrewhooker

PopED:Population (and Individual) Optimal Experimental Design

Optimal experimental designs for both population and individual studies based on nonlinear mixed-effect models. Often this is based on a computation of the Fisher Information Matrix. This package was developed for pharmacometric problems, and examples and predefined models are available for these types of systems. The methods are described in Nyberg et al. (2012) <doi:10.1016/j.cmpb.2012.05.005>, and Foracchia et al. (2004) <doi:10.1016/S0169-2607(03)00073-7>.

Maintained by Andrew C. Hooker. Last updated 5 months ago.

nlme optimal-design pharmacodynamics pharmacokinetics pharmacometrics pkpd population population-model

1.3 match 33 stars 9.58 score 300 scripts 1 dependent

emf-creaf

indicspecies:Relationship Between Species and Groups of Sites

Functions to assess the strength and statistical significance of the relationship between species occurrence/abundance and groups of sites [De Caceres & Legendre (2009) <doi:10.1890/08-1823.1>]. Also includes functions to measure species niche breadth using resource categories [De Caceres et al. (2011) <doi:10.1111/J.1600-0706.2011.19679.x>].

Maintained by Miquel De Cáceres. Last updated 26 days ago.

1.3 match 10 stars 9.49 score 386 scripts 4 dependents

biodiverse

ubms:Bayesian Models for Data from Unmarked Animals using 'Stan'

Fit Bayesian hierarchical models of animal abundance and occurrence via the 'rstan' package, the R interface to the 'Stan' C++ library. Supported models include single-season occupancy, dynamic occupancy, and N-mixture abundance models. Covariates on model parameters are specified using a formula-based interface similar to package 'unmarked', while also allowing for estimation of random slope and intercept terms. References: Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>; Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.

Maintained by Ken Kellner. Last updated 20 days ago.

distance-sampling hierarchical-models n-mixture-model occupancy stan openblas cpp

1.5 match 35 stars 7.88 score 73 scripts

willtownes

glmpca:Dimension Reduction of Non-Normally Distributed Data

Implements a generalized version of principal components analysis (GLM-PCA) for dimension reduction of non-normally distributed data such as counts or binary matrices. Townes FW, Hicks SC, Aryee MJ, Irizarry RA (2019) <doi:10.1186/s13059-019-1861-6>. Townes FW (2019) <arXiv:1907.02647>.

Maintained by F. William Townes. Last updated 11 months ago.

1.3 match 94 stars 9.24 score 258 scripts 4 dependents

jsta

wql:Exploring Water Quality Monitoring Data

Functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter.

Maintained by Jemma Stachelek. Last updated 2 months ago.

water-quality

1.6 match 12 stars 7.34 score 204 scripts 3 dependents

josue-rodriguez

psymetadata:Open Datasets from Meta-Analyses in Psychology Research

Data and examples from meta-analyses in psychology research.

Maintained by Josue E. Rodriguez. Last updated 2 years ago.

3.3 match 1 star 3.40 score 50 scripts

mreginato

monographaR:Taxonomic Monographs Tools

Contains functions intended to facilitate the production of plant taxonomic monographs. The package includes functions to convert tables into taxonomic descriptions, lists of collectors, examined specimens, identification keys (dichotomous and interactive), and can generate a monograph skeleton. Additionally, wrapper functions to batch the production of phenology histograms and distributional and diversity maps are also available.

Maintained by Marcelo Reginato. Last updated 1 year ago.

2.4 match 3 stars 4.73 score 18 scripts

epiverse-trace

epidemics:Composable Epidemic Scenario Modelling

A library of compartmental epidemic models taken from the published literature, and classes to represent affected populations, public health response measures including non-pharmaceutical interventions on social contacts, non-pharmaceutical and pharmaceutical interventions that affect disease transmissibility, vaccination regimes, and disease seasonality, which can be combined to compose epidemic scenario models.

Maintained by Rosalind Eggo. Last updated 9 months ago.

decision-support epidemic-modelling epidemic-simulations epidemiology epiverse infectious-disease-dynamics model-library non-pharmaceutical-interventions rcpp rcppeigen scenario-analysis vaccination cpp

1.5 match 9 stars 7.48 score 59 scripts

bryanhanson

HiveR:2D and 3D Hive Plots for R

Creates and plots 2D and 3D hive plots. Hive plots are a method of displaying networks of many types in which nodes are placed on axes according to meaningful node properties rather than being arbitrarily positioned. The hive plot concept was invented by Martin Krzywinski at the Genome Science Center (www.hiveplot.net/). Keywords: networks, food webs, linnet, systems biology, bioinformatics.

Maintained by Bryan A. Hanson. Last updated 8 months ago.

1.7 match 72 stars 6.76 score 53 scripts 2 dependents

friendly

candisc:Visualizing Generalized Canonical Discriminant and Canonical Correlation Analysis

Functions for computing and visualizing generalized canonical discriminant analyses and canonical correlation analysis for a multivariate linear model. Traditional canonical discriminant analysis is restricted to a one-way 'MANOVA' design and is equivalent to canonical correlation analysis between a set of quantitative response variables and a set of dummy variables coded from the factor variable. The 'candisc' package generalizes this to higher-way 'MANOVA' designs for all factors in a multivariate linear model, computing canonical scores and vectors for each term. The graphic functions provide low-rank (1D, 2D, 3D) visualizations of terms in an 'mlm' via the 'plot.candisc' and 'heplot.candisc' methods. Related plots are now provided for canonical correlation analysis when all predictors are quantitative.

Maintained by Michael Friendly. Last updated 11 months ago.

dimension-reduction multivariate-linear-models visualization

1.3 match 15 stars 8.86 score 221 scripts 3 dependents

ltierney

proftools:Profile Output Processing Tools for R

Tools for examining Rprof profile output.

Maintained by Luke Tierney. Last updated 5 years ago.

2.4 match 4.58 score 128 scripts 1 dependents

pvanlaake

ncdfCF:Easy Access to NetCDF Files with CF Metadata Conventions

Network Common Data Form ('netCDF') files are widely used for scientific data. Library-level access in R is provided through packages 'RNetCDF' and 'ncdf4'. Package 'ncdfCF' is built on top of 'RNetCDF' and makes the data and its attributes available as a set of R6 classes that are informed by the Climate and Forecasting Metadata Conventions. Access to the data uses standard R subsetting operators and common function forms.

Maintained by Patrick Van Laake. Last updated 4 days ago.

2.0 match 5.41 score 4 scripts

biooss

sensitivity:Global Sensitivity Analysis of Model Outputs and Importance Measures

A collection of functions for sensitivity analysis of model outputs (factor screening, global sensitivity analysis and robustness analysis), for variable importance measures of data, as well as for interpretability of machine learning models. Most of the functions have to be applied to scalar output, but several functions support multi-dimensional outputs.

Maintained by Bertrand Iooss. Last updated 7 months ago.

cpp

1.6 match 17 stars 6.74 score 472 scripts 8 dependents

usepa

tcpl:ToxCast Data Analysis Pipeline

The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.

Maintained by Jason Brown. Last updated 5 days ago.

ccte comptox ord

1.1 match 36 stars 9.41 score 90 scripts

dtharvey

eChem:Simulations for Electrochemistry Experiments

Simulates cyclic voltammetry, linear-sweep voltammetry (both with and without stirring of the solution), and single-pulse and double-pulse chronoamperometry and chronocoulometry experiments using the implicit finite difference method outlined in Gosser (1993, ISBN: 9781560810261) and in Brown (2015) <doi:10.1021/acs.jchemed.5b00225>. Additional functions provide ways to display and to examine the results of these simulations. The primary purpose of this package is to provide tools for use in courses in analytical chemistry.

Maintained by David Harvey. Last updated 6 years ago.

1.8 match 6 stars 6.11 score 27 scripts

bioc

ASpli:Analysis of Alternative Splicing Using RNA-Seq

Integrative pipeline for the analysis of alternative splicing using RNAseq.

Maintained by Ariel Chernomoretz. Last updated 5 months ago.

immunooncology geneexpression transcription alternativesplicing coverage differentialexpression differentialsplicing timecourse rnaseq genomeannotation sequencing alignment

2.0 match 5.33 score 45 scripts 1 dependents

paulrougieux

FAOSTAT:Download Data from the FAOSTAT Database

Download Data from the FAOSTAT Database of the Food and Agriculture Organization (FAO) of the United Nations. A list of functions to download statistics from FAOSTAT (database of the FAO <https://www.fao.org/faostat/>) and WDI (database of the World Bank <https://data.worldbank.org/>), and to perform some harmonization operations.

Maintained by Paul Rougieux. Last updated 7 months ago.

2.0 match 5.30 score 132 scripts

spiritspeak

AATtools:Reliability and Scoring Routines for the Approach-Avoidance Task

Compute approach bias scores using different scoring algorithms, compute bootstrapped and exact split-half reliability estimates, and compute confidence intervals for individual participant scores.

Maintained by Sercan Kahveci. Last updated 6 months ago.

3.5 match 1 stars 3.00 score 2 scripts

bioc

escape:Easy single cell analysis platform for enrichment

A bridging R package to facilitate gene set enrichment analysis (GSEA) in the context of single-cell RNA sequencing. Using raw count information, Seurat objects, or SingleCellExperiment format, users can perform and visualize ssGSEA, GSVA, AUCell, and UCell-based enrichment calculations across individual cells.

Maintained by Nick Borcherding. Last updated 2 months ago.

software singlecell classification annotation genesetenrichment sequencing genesignaling pathways

1.8 match 5.92 score 138 scripts

vaudigier

micemd:Multiple Imputation by Chained Equations with Multilevel Data

Addons for the 'mice' package to perform multiple imputation using chained equations with two-level data. Includes imputation methods dedicated to sporadically and systematically missing values. Imputation of continuous, binary or count variables is available. Following the recommendations of Audigier, V. et al (2018) <doi:10.1214/18-STS646>, the choice of the imputation method for each variable can be facilitated by a default choice tuned according to the structure of the incomplete dataset. Allows parallel calculation and overimputation for 'mice'.

Maintained by Vincent Audigier. Last updated 1 year ago.

3.3 match 1 stars 3.08 score 80 scripts 1 dependents

statmanrobin

Lock5Data:Datasets for "Statistics: UnLocking the Power of Data"

Datasets for the third edition of "Statistics: Unlocking the Power of Data" by Lock^5. Includes versions of datasets from earlier editions.

Maintained by Robin Lock. Last updated 4 years ago.

3.4 match 2.90 score 322 scripts

david-cortes

outliertree:Explainable Outlier Detection Through Decision Tree Conditioning

Outlier detection method that flags suspicious values within observations, contrasting them against the normal values in a user-readable format, potentially describing conditions within the data that make a given outlier more rare. Full procedure is described in Cortes (2020) <doi:10.48550/arXiv.2001.00636>. Loosely based on the 'GritBot' <https://www.rulequest.com/gritbot-info.html> software.

Maintained by David Cortes. Last updated 2 months ago.

anomaly-detection outlier-detection cpp openmp

1.3 match 58 stars 7.34 score 21 scripts 2 dependents

emcramer

CHOIRBM:Plots the CHOIR Body Map

Collection of utility functions for visualizing body map data collected with the Collaborative Health Outcomes Information Registry.

Maintained by Eric Cramer. Last updated 1 year ago.

body-map cbm choir data-visualization visualization

1.7 match 5 stars 5.51 score 26 scripts

sfcheung

manymome:Mediation, Moderation and Moderated-Mediation After Model Fitting

Computes indirect effects, conditional effects, and conditional indirect effects in a structural equation model or path model after model fitting, with no need to define any user parameters or label any paths in the model syntax, using the approach presented in Cheung and Cheung (2024) <doi:10.3758/s13428-023-02224-z>. Can also form bootstrap confidence intervals by doing bootstrapping only once and reusing the bootstrap estimates in all subsequent computations. Supports bootstrap confidence intervals for standardized (partially or completely) indirect effects, conditional effects, and conditional indirect effects as described in Cheung (2009) <doi:10.3758/BRM.41.2.425> and Cheung, Cheung, Lau, Hui, and Vong (2022) <doi:10.1037/hea0001188>. Model fitting can be done by structural equation modeling using lavaan() or regression using lm().

Maintained by Shu Fai Cheung. Last updated 24 days ago.

bootstrapping confidence-interval lavaan manymome mediation moderated-mediation moderation regression sem standardized-effect-size structural-equation-modeling

1.2 match 1 stars 8.06 score 172 scripts 4 dependents

bioc

ACE:Absolute Copy Number Estimation from Low-coverage Whole Genome Sequencing

Uses segmented copy number data to estimate tumor cell percentage and produce copy number plots displaying absolute copy numbers.

Maintained by Jos B Poell. Last updated 5 months ago.

copynumbervariation dnaseq coverage wholegenome visualization sequencing

1.3 match 15 stars 7.03 score 18 scripts

polymerase3

vecmatch:Generalized Propensity Score Estimation and Matching for Multiple Groups

Implements the Vector Matching algorithm to match multiple treatment groups based on previously estimated generalized propensity scores. The package includes tools for visualizing initial confounder imbalances, estimating treatment assignment probabilities using various methods, defining the common support region, performing matching across multiple groups, and evaluating matching quality. For more details, see Lopez and Gutman (2017) <doi:10.1214/17-STS612>.

Maintained by Mateusz Kolek. Last updated 13 days ago.

1.9 match 1 stars 4.88 score

muschellij2

cforward:Forward Selection using Concordance/C-Index

Performs forward model selection, using the C-index/concordance in survival analysis models.

Maintained by John Muschelli. Last updated 4 years ago.

3.4 match 2.70 score 2 scripts

matthiaspucher

staRdom:PARAFAC Analysis of EEMs from DOM

This is a user-friendly way to run a parallel factor (PARAFAC) analysis (Harshman, 1971) <doi:10.1121/1.1977523> on excitation emission matrix (EEM) data from dissolved organic matter (DOM) samples (Murphy et al., 2013) <doi:10.1039/c3ay41160e>. The analysis includes profound methods for model validation. Some additional functions allow the calculation of absorbance slope parameters and create beautiful plots.

Maintained by Matthias Pucher. Last updated 4 months ago.

1.5 match 21 stars 6.03 score 86 scripts

bioc

msImpute:Imputation of label-free mass spectrometry peptides

MsImpute is a package for imputation of peptide intensity in proteomics experiments. It additionally contains tools for MAR/MNAR diagnosis and assessment of distortions to the probability distribution of the data post imputation. The missing values are imputed by low-rank approximation of the underlying data matrix if they are MAR (method = "v2"), by a Barycenter approach if missingness is MNAR ("v2-mnar"), or by Peptide Identity Propagation (PIP).

Maintained by Soroor Hediyeh-zadeh. Last updated 5 months ago.

massspectrometry proteomics software label-free-proteomics low-rank-approximation

1.8 match 14 stars 5.15 score 7 scripts

glotaran

TIMP:Fitting Separable Nonlinear Models in Spectroscopy and Microscopy

A problem solving environment (PSE) for fitting separable nonlinear models to measurements arising in physics and chemistry experiments, as described by Mullen & van Stokkum (2007) <doi:10.18637/jss.v018.i03> for its use in fitting time resolved spectroscopy data, and as described by Laptenok et al. (2007) <doi:10.18637/jss.v018.i08> for its use in fitting Fluorescence Lifetime Imaging Microscopy (FLIM) data, in the study of Förster Resonance Energy Transfer (FRET). `TIMP` also serves as the computation backend for the `GloTarAn` software, a graphical user interface for the package, as described in Snellenburg et al. (2012) <doi:10.18637/jss.v049.i03>.

Maintained by Joris Snellenburg. Last updated 2 years ago.

parameter-estimation

1.9 match 3 stars 4.80 score 14 scripts 1 dependents

polkas

miceFast:Fast Imputations Using 'Rcpp' and 'Armadillo'

Fast imputations under the object-oriented programming paradigm. A few functions are also offered that are built to work with popular R packages such as 'data.table' or 'dplyr'. The biggest improvement in time performance can be achieved for calculations where a grouping variable has to be used. A single evaluation of a quantitative model for the multiple imputations is another major enhancement. A new major improvement is one of the fastest predictive mean matching implementations in the R world, thanks to presorting and binary search.

Maintained by Maciej Nasinski. Last updated 1 month ago.

cpp fast fast-imputations grouping imputation imputations matrix mro multiple-imputation rcpp rcpparmadillo vif weighting openblas cpp openmp

1.5 match 20 stars 5.94 score 29 scripts
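The miceFast description above credits presorting and binary search for the speed of its predictive mean matching. The following is only a rough Python sketch of that general idea, not the package's own code (which is written with 'Rcpp' and 'Armadillo'); the function name pmm_impute, its arguments, and the choice of k donors are all hypothetical. Donors are sorted once by predicted value, and each missing case then locates its nearest donors with a binary search instead of scanning every donor.

```python
import bisect
import random


def pmm_impute(pred_obs, y_obs, pred_mis, k=3, rng=None):
    """Impute one value per entry of pred_mis by predictive mean matching.

    pred_obs: model predictions for cases with an observed outcome (donors)
    y_obs:    the observed outcomes of those donors
    pred_mis: model predictions for cases with a missing outcome
    k:        number of nearest donors to draw from (illustrative default)
    """
    if not pred_obs:
        raise ValueError("at least one donor is required")
    rng = rng or random.Random(0)

    # Presort the donors once by predicted value.
    order = sorted(range(len(pred_obs)), key=pred_obs.__getitem__)
    sorted_pred = [pred_obs[i] for i in order]
    sorted_y = [y_obs[i] for i in order]

    imputed = []
    for p in pred_mis:
        # Binary search for the insertion point of p among donor predictions.
        pos = bisect.bisect_left(sorted_pred, p)
        lo, hi = pos - 1, pos
        donors = []
        # Expand outwards from the insertion point, always taking the
        # closer of the two neighbouring donors, until k are collected.
        while len(donors) < k and (lo >= 0 or hi < len(sorted_pred)):
            take_left = hi >= len(sorted_pred) or (
                lo >= 0 and p - sorted_pred[lo] <= sorted_pred[hi] - p
            )
            if take_left:
                donors.append(lo)
                lo -= 1
            else:
                donors.append(hi)
                hi += 1
        # Borrow the observed outcome of a randomly chosen near donor.
        imputed.append(sorted_y[rng.choice(donors)])
    return imputed
```

With k = 1 the draw is deterministic and each missing case simply borrows the outcome of its closest donor; larger k adds the random donor selection that predictive mean matching normally relies on.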

jacobbien

simulator:An Engine for Running Simulations

A framework for performing simulations such as those common in methodological statistics papers. The design principles of this package are described in greater depth in Bien, J. (2016) "The simulator: An Engine to Streamline Simulations," which is available at <arXiv:1607.00021>.

Maintained by Jacob Bien. Last updated 2 years ago.

simulation

1.3 match 52 stars 7.13 score 103 scripts

hanettools

Perc:Using Percolation and Conductance to Find Information Flow Certainty in a Direct Network

Finds the certainty of dominance interactions, taking indirect interactions into account.

Maintained by Jessica Vandeleest. Last updated 4 years ago.

1.5 match 5.88 score 38 scripts

afukushima

DiffCorr:Analyzing and Visualizing Differential Correlation Networks in Biological Data

A method for identifying pattern changes between 2 experimental conditions in correlation networks (e.g., gene co-expression networks), which builds on a commonly used association measure, such as Pearson's correlation coefficient. This package includes functions to calculate correlation matrices for high-dimensional datasets and to test differential correlation, which means the changes in the correlation relationship among variables (e.g., genes and metabolites) between 2 experimental conditions.

Maintained by Atsushi Fukushima. Last updated 6 months ago.

1.3 match 5 stars 6.81 score 29 scripts 1 dependents

bioc

ELViS:An R Package for Estimating Copy Number Levels of Viral Genome Segments Using Base-Resolution Read Depth Profile

Base-resolution copy number analysis of viral genomes. Utilizes base-resolution read depth data over the viral genome to find copy number segments with a two-dimensional segmentation approach. Provides publication-ready figures, including histograms of read depths, coverage line plots over the viral genome annotated with copy number change events and viral genes, and heatmaps showing multiple types of data with integrative clustering of samples.

Maintained by Jin-Young Lee. Last updated 14 days ago.

copynumbervariation coverage genomicvariation biomedicalinformatics sequencing normalization visualization clustering

1.8 match 4.70 score 7 scripts

bioc

lisaClust:lisaClust: Clustering of Local Indicators of Spatial Association

lisaClust provides a series of functions to identify and visualise regions of tissue where spatial associations between cell-types are similar. This package can be used to provide a high-level summary of cell-type colocalization in multiplexed imaging data that has been segmented at a single-cell resolution.

Maintained by Ellis Patrick. Last updated 4 months ago.

singlecell cellbasedassays spatial

1.3 match 3 stars 6.64 score 48 scripts

kassambara

survminer:Drawing Survival Curves using 'ggplot2'

Contains the function 'ggsurvplot()' for easily drawing beautiful, 'ready-to-publish' survival curves with the 'number at risk' table and 'censoring count plot'. Other functions are also available to plot adjusted curves for the 'Cox' model and to visually examine 'Cox' model assumptions.

Maintained by Alboukadel Kassambara. Last updated 5 months ago.

0.5 match 524 stars 15.87 score 7.0k scripts 55 dependents

njtierney

naniar:Data Structures, Summaries, and Visualisations for Missing Data

Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is fully discussed at Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.

Maintained by Nicholas Tierney. Last updated 5 days ago.

data-visualisation ggplot2 missing-data missingness tidy-data

0.5 match 657 stars 15.63 score 5.1k scripts 9 dependents

griffithdan

cooccur:Probabilistic Species Co-Occurrence Analysis in R

This R package applies the probabilistic model of species co-occurrence (Veech 2013) to a set of species distributed among a set of survey or sampling sites. The algorithm calculates the observed and expected frequencies of co-occurrence between each pair of species. The expected frequency is based on the distribution of each species being random and independent of the other species. The analysis returns the probabilities that a more extreme (either low or high) value of co-occurrence could have been obtained by chance. The package also includes functions for visualizing species co-occurrence results and preparing data for downstream analyses.

Maintained by Daniel M. Griffith. Last updated 7 years ago.

1.7 match 3 stars 4.63 score 142 scripts
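The cooccur description above outlines a calculation that can be sketched in a few lines. The Python below is an illustration only, not the package's code: it assumes the pairwise model can be written in hypergeometric form (each species placed at random and independently among the sites), and the function name, argument names, and the decision to include the observed count in both tail probabilities are assumptions made for this sketch.

```python
from math import comb


def cooccurrence_probabilities(n_sites, n1, n2, observed):
    """Return (expected, p_at_most, p_at_least) for one species pair.

    n_sites:  total number of sampling sites
    n1, n2:   number of sites occupied by species 1 and species 2
    observed: number of sites where both species were recorded
    """
    # Smallest and largest number of shared sites that is possible.
    j_min = max(0, n1 + n2 - n_sites)
    j_max = min(n1, n2)

    # Probability of exactly j shared sites when species 2's sites are a
    # random draw from all sites, independent of species 1.
    total = comb(n_sites, n2)
    pmf = {
        j: comb(n1, j) * comb(n_sites - n1, n2 - j) / total
        for j in range(j_min, j_max + 1)
    }

    # Expected co-occurrence under independence.
    expected = n1 * n2 / n_sites
    # Tail probabilities; both include the observed value (an assumption).
    p_at_most = sum(p for j, p in pmf.items() if j <= observed)
    p_at_least = sum(p for j, p in pmf.items() if j >= observed)
    return expected, p_at_most, p_at_least
```

For example, with 10 sites and two species each present at 5 sites, co-occurring at all 5 gives an expected frequency of 2.5 and an upper-tail probability of 1/252, which would point to a positive association.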

homerhanumat

tigerData:GC Statistics Datasets

A small, informal collection of datasets useful in undergraduate statistics courses.

Maintained by Homer White. Last updated 1 month ago.

3.6 match 2.18 score 6 scripts

sigbertklinke

exams.forge:Support for Compiling Examination Tasks using the 'exams' Package

The main aim is to further facilitate the creation of exercises based on the package 'exams' by Grün, B., and Zeileis, A. (2009) <doi:10.18637/jss.v029.i10>. Creating effective student exercises involves challenges such as creating appropriate data sets and ensuring access to intermediate values for accurate explanation of solutions. The functionality includes the generation of univariate and bivariate data including simple time series, functions for theoretical distributions and their approximation, statistical and mathematical calculations for tasks in basic statistics courses as well as general tasks such as string manipulation, LaTeX/HTML formatting and the editing of XML task files for 'Moodle'.

Maintained by Sigbert Klinke. Last updated 8 months ago.

2.9 match 2.70 score 1 scripts

ohdsi

LocalControl:Nonparametric Methods for Generating High Quality Comparative Effectiveness Evidence

Implements novel nonparametric approaches to address biases and confounding when comparing treatments or exposures in observational studies of outcomes. While designed and appropriate for use in studies involving medicine and the life sciences, the package can be used in other situations involving outcomes with multiple confounders. The package implements a family of methods for non-parametric bias correction when comparing treatments in observational studies, including survival analysis settings, where competing risks and/or censoring may be present. The approach extends to bias-corrected personalized predictions of treatment outcome differences, and analysis of heterogeneity of treatment effect-sizes across patient subgroups. For further details, please see: Lauve NR, Nelson SJ, Young SS, Obenchain RL, Lambert CG. LocalControl: An R Package for Comparative Safety and Effectiveness Research. Journal of Statistical Software. 2020. p. 1–32. Available from <doi:10.18637/jss.v096.i04>.

Maintained by Christophe G. Lambert. Last updated 4 months ago.

cpp

1.7 match 1 stars 4.56 score 12 scripts

hugaped

MBNMAtime:Run Time-Course Model-Based Network Meta-Analysis (MBNMA) Models

Fits Bayesian time-course models for model-based network meta-analysis (MBNMA) that allow inclusion of multiple time-points from studies. Repeated measures over time are accounted for within studies by applying different time-course functions, following the method of Pedder et al. (2019) <doi:10.1002/jrsm.1351>. The method allows synthesis of studies with multiple follow-up measurements and can account for time-course in single or multiple treatment comparisons. Several general time-course functions are provided; others may be added by the user. Various characteristics can be flexibly added to the models, such as correlation between time points and shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting.

Maintained by Hugo Pedder. Last updated 1 month ago.

jags cpp

1.3 match 7 stars 6.10 score

cogdisreslab

KRSA:KRSA: Kinome Random Sampling Analyzer

The goal of this package is to analyze the PamChip data and identify the changes in the active kinome. The package can preprocess the PamChip data output from BioNavigator and use Random Sampling and Permutation Analysis to identify upstream kinases. Additionally, this package provides a set of useful visualizations for the PamChip data.

Maintained by Ali Sajid Imami. Last updated 12 days ago.

kinase phosphatases pamchip kinome random sampling permutation analysis

1.7 match 4 stars 4.42 score 49 scripts

horankev

sfislands:Streamlines the Process of Fitting Areal Spatial Models

Helpers for addressing the issue of disconnected spatial units. It allows for convenient adding and removal of neighbourhood connectivity between areal units prior to modelling, with the visual aid of maps. Post-modelling, it reduces the human workload for extracting, tidying and mapping predictions from areal models.

Maintained by Kevin Horan. Last updated 16 days ago.

1.8 match 7 stars 4.32 score 5 scripts

lindsayevanslee

whomds:Calculate Results from WHO Model Disability Survey Data

The Model Disability Survey (MDS) <https://www.who.int/activities/collection-of-data-on-disability> is a World Health Organization (WHO) general population survey instrument to assess the distribution of disability within a country or region, grounded in the International Classification of Functioning, Disability and Health <https://www.who.int/standards/classifications/international-classification-of-functioning-disability-and-health>. This package provides fit-for-purpose functions for calculating and presenting the results from this survey, as used by the WHO. The package primarily provides functions for implementing Rasch Analysis (see Andrich (2011) <doi:10.1586/erp.11.59>) to calculate a metric scale for disability.

Maintained by Lindsay Lee. Last updated 2 years ago.

1.3 match 4 stars 5.56 score 15 scripts

cran

mvdalab:Multivariate Data Analysis Laboratory

An open-source implementation of latent variable methods and multivariate modeling tools. The focus is on exploratory analyses using dimensionality reduction methods including low dimensional embedding, classical multivariate statistical tools, and tools for enhanced interpretation of machine learning methods (i.e. intelligible models to provide important information for end-users). Target domains include extension to dedicated applications e.g. for manufacturing process modeling, spectroscopic analyses, and data mining.

Maintained by Nelson Lee Afanador. Last updated 2 years ago.

3.4 match 2.18 score 1 dependents

bioc

RAIDS:Accurate Inference of Genetic Ancestry from Cancer Sequences

This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.

Maintained by Pascal Belleau. Last updated 5 months ago.

genetics software sequencing wholegenome principalcomponent geneticvariability dimensionreduction biocviews ancestry cancer-genomics exome-sequencing genomics inference r-language rna-seq rna-sequencing whole-genome-sequencing

1.2 match 5 stars 6.23 score 19 scripts

haddonm

MQMF:Modelling and Quantitative Methods in Fisheries

Complements the book "Using R for Modelling and Quantitative Methods in Fisheries" ISBN 9780367469894, published in 2021 by Chapman & Hall in their "Using R series". There are numerous functions and data-sets that are used in the book's many practical examples.

Maintained by Malcolm Haddon. Last updated 2 years ago.

ecology fisheries haddon quantitative-methods uncertainty

1.8 match 11 stars 4.14 score 25 scripts

bioc

visiumStitched:Enable downstream analysis of Visium capture areas stitched together with Fiji

This package provides helper functions for working with multiple Visium capture areas that overlap each other. This package was developed along with the companion example use case data available from https://github.com/LieberInstitute/visiumStitched_brain. visiumStitched prepares SpaceRanger (10x Genomics) output files so you can stitch the images from groups of capture areas together with Fiji. Then visiumStitched builds a SpatialExperiment object with the stitched data and makes an artificial hexagonal grid enabling the seamless use of spatial clustering methods that rely on such a grid to identify neighboring spots, such as PRECAST and BayesSpace. The SpatialExperiment objects created by visiumStitched are compatible with spatialLIBD, which can be used to build interactive websites for stitched SpatialExperiment objects. visiumStitched also enables casting SpatialExperiment objects as Seurat objects.

Maintained by Nicholas J. Eagles. Last updated 3 months ago.

software spatial transcriptomics transcription geneexpression visualization dataimport 10xgenomics bioconductor spatial-transcriptomics spatialexperiment spatiallibd visium

1.3 match 1 stars 5.36 score 4 scripts

cran

svydiags:Regression Model Diagnostics for Survey Data

Diagnostics for fixed effects linear and general linear regression models fitted with survey data. Extensions of standard diagnostics to complex survey data are included: standardized residuals, leverages, Cook's D, dfbetas, dffits, condition indexes, and variance inflation factors as found in Li and Valliant (Surv. Meth., 2009, 35(1), pp. 15-24; Jnl. of Off. Stat., 2011, 27(1), pp. 99-119; Jnl. of Off. Stat., 2015, 31(1), pp. 61-75); Liao and Valliant (Surv. Meth., 2012, 38(1), pp. 53-62; Surv. Meth., 2012, 38(2), pp. 189-202). Variance inflation factors and condition indexes are also computed for some general linear models as described in Liao (U. Maryland thesis, 2010).

Maintained by Richard Valliant. Last updated 4 months ago.

3.4 match 1 stars 2.10 score 21 scripts

sdumble1

GGEBiplots:GGE Biplots with 'ggplot2'

Genotype plus genotype-by-environment (GGE) biplots rendered using 'ggplot2'. Provides a command line interface to all of the functionality contained within the archived package 'GGEBiplotGUI'.

Maintained by Sam Dumble. Last updated 3 years ago.

4.3 match 1.65 score 15 scripts 1 dependents

smwindecker

mixchar:Mixture Model for the Deconvolution of Thermal Decay Curves

Deconvolution of thermal decay curves allows you to quantify proportions of biomass components in plant litter. Thermal decay curves derived from thermogravimetric analysis (TGA) are imported, modified, and then modelled in a three- or four-part mixture model using the Fraser-Suzuki function. The output is a set of estimated weights for the pseudo-components corresponding to hemicellulose, cellulose, and lignin. For more information see: Müller-Hagedorn, M. and Bockhorn, H. (2007) <doi:10.1016/j.jaap.2006.12.008>, Órfão, J. J. M. and Figueiredo, J. L. (2001) <doi:10.1016/S0040-6031(01)00634-7>, and Yang, H. and Yan, R. and Chen, H. and Zheng, C. and Lee, D. H. and Liang, D. T. (2006) <doi:10.1021/ef0580117>.

Maintained by Saras Windecker. Last updated 2 years ago.

decay thermogravimetry traits

1.3 match 10 stars 5.26 score 18 scripts

coolbutuseless

bitstreamio:Read and Write Bits from Files, Connections and Raw Vectors

Bit-level reading and writing are necessary when dealing with many file formats e.g. compressed data and binary files. Currently, R connections are manipulated at the byte level. This package wraps existing connections and raw vectors so that it is possible to read bits, bit sequences, unaligned bytes and low-bit representations of integers.

Maintained by Mike Cheng. Last updated 2 months ago.

1.7 match 3 stars 4.18 score 4 scripts
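The bitstreamio description above concerns reading bit sequences that do not line up with byte boundaries. As a rough illustration of the idea only, here is a Python sketch; it is not the package's R interface. The BitReader class and its read_bits method are hypothetical names, and most-significant-bit-first ordering is an assumption of this sketch.

```python
class BitReader:
    """Read unsigned integers of arbitrary bit width from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits from the start

    def read_bits(self, n: int) -> int:
        """Return the next n bits as an unsigned integer (MSB first)."""
        if self.pos + n > len(self.data) * 8:
            raise EOFError("not enough bits left in the buffer")
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]           # byte holding the bit
            bit = (byte >> (7 - self.pos % 8)) & 1    # pick the bit, MSB first
            value = (value << 1) | bit                # append it to the result
            self.pos += 1
        return value
```

A reader built on two bytes can hand back a 3-bit field, then a 7-bit field that straddles the byte boundary, then the remaining 6 bits, which is the kind of unaligned access that byte-level connections do not offer directly.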

tiledb-inc

tiledbcloud:TileDB Cloud Platform R Client Package

The TileDB Cloud Platform API Client Package offers access to the TileDB Cloud service.

Maintained by John Kerl. Last updated 8 months ago.

1.3 match 1 stars 5.22 score 92 scripts

globalecologylab

paleopop:Pattern-Oriented Modeling Framework for Coupled Niche-Population Paleo-Climatic Models

This extension of the poems pattern-oriented modeling (POM) framework provides a collection of modules and functions customized for paleontological time-scales, and optimized for single-generation transitions and large populations, across multiple generations.

Maintained by July Pilowsky. Last updated 3 months ago.

1.3 match 5 stars 5.22 score 11 scripts

bioc

phenopath:Genomic trajectories with heterogeneous genetic and environmental backgrounds

PhenoPath infers genomic trajectories (pseudotimes) in the presence of heterogeneous genetic and environmental backgrounds and tests for interactions between them.

Maintained by Kieran Campbell. Last updated 5 months ago.

immunooncology rnaseq geneexpression bayesian singlecell principalcomponent cpp

1.5 match 4.63 score 43 scripts

damurka

cancerscreening:Streamline Access to Cancer Screening Data

Retrieve cancer screening data for cervical, breast and colorectal cancers from the Kenya Health Information System <https://hiskenya.org> in a consistent way.

Maintained by David Kariuki. Last updated 9 months ago.

cancerscreening khis

1.7 match 1 stars 4.00 score 2 scripts

xliaosdsu

csurvey:Constrained Regression for Survey Data

Domain mean estimation with monotonicity or block monotone constraints. See Xu X, Meyer MC and Opsomer JD (2021) <doi:10.1016/j.jspi.2021.02.004> for more details.

Maintained by Xiyue Liao. Last updated 26 days ago.

6.9 match 1.00 score

bioc

raer:RNA editing tools in R

Toolkit for identification and statistical testing of RNA editing signals from within R. Provides support for identifying sites from bulk-RNA and single cell RNA-seq datasets, and general methods for extraction of allelic read counts from alignment files. Facilitates annotation and exploratory analysis of editing signals using Bioconductor packages and resources.

Maintained by Kent Riemondy. Last updated 5 months ago.

multiplecomparison rnaseq singlecell sequencing coverage epitranscriptomics featureextraction annotation alignment bioconductor-package rna-seq-analysis single-cell-analysis single-cell-rna-seq curl bzip2 xz-utils zlib

1.1 match 8 stars 5.98 score 6 scripts

statnet

lolog:Latent Order Logistic Graph Models

Estimation of Latent Order Logistic (LOLOG) Models for Networks. LOLOGs are a flexible and fully general class of statistical graph models. This package provides functions for performing MOM, GMM and variational inference. Visual diagnostics and goodness of fit metrics are provided. See Fellows (2018) <arXiv:1804.04583> for a detailed description of the methods.

Maintained by Ian E. Fellows. Last updated 1 year ago.

cpp

1.2 match 5 stars 5.56 score 72 scripts

highamm

sptotal:Predicting Totals and Weighted Sums from Spatial Data

Performs predictions of totals and weighted sums, or finite population block kriging, on spatial data using the methods in Ver Hoef (2008) <doi:10.1007/s10651-007-0035-y>. The primary outputs are an estimate of the total, mean, or weighted sum in the region, an estimated prediction variance, and a plot of the predicted and observed values. This is useful primarily to users with ecological data that are counts or densities measured on some sites in a finite area of interest. Spatial prediction for the total count or average density in the entire region can then be done using the functions in this package.

Maintained by Matt Higham. Last updated 7 months ago.

1.3 match 4 stars 4.90 score 10 scripts

bioc

Damsel:Damsel: an end to end analysis of DamID

Damsel provides an end to end analysis of DamID data. Damsel takes bam files from Dam-only control and fusion samples and counts the reads matching to each GATC region. edgeR is utilised to identify regions of enrichment in the fusion relative to the control. Enriched regions are combined into peaks, and are associated with nearby genes. Damsel allows for IGV style plots to be built as the results build, inspired by ggcoverage, and using the functionality and layering ability of ggplot2. Damsel also conducts gene ontology testing with bias correction through goseq, and future versions of Damsel will also incorporate motif enrichment analysis. Overall, Damsel is the first package allowing for an end to end analysis with visual capabilities. The goal of Damsel was to bring all the analysis into one place, and allow for exploratory analysis within R.

Maintained by Caitlin Page. Last updated 5 months ago.

differentialmethylation peakdetection geneprediction genesetenrichment

1.2 match 5.34 score 20 scripts

bioc

spqn:Spatial quantile normalization

The spqn package implements spatial quantile normalization (SpQN). This method was developed to remove a mean-correlation relationship in correlation matrices built from gene expression data. It can serve as a pre-processing step prior to a co-expression analysis.

Maintained by Yi Wang. Last updated 5 months ago.

networkinference graphandnetwork normalization

1.3 match 5 stars 5.04 score 22 scripts

tweedell

motoRneuron:Analyzing Paired Neuron Discharge Times for Time-Domain Synchronization

The temporal relationship between motor neurons can offer explanations for neural strategies. We combined functions to reduce neuron action potential discharge data and analyze it for short-term, time-domain synchronization. In addition, motoRneuron combines most available methods for determining cross-correlation histogram peaks and most available indices for calculating synchronization into simple functions. See Nordstrom, Fuglevand, and Enoka (1992) <doi:10.1113/jphysiol.1992.sp019244> for a more thorough introduction.

Maintained by Andrew Tweedell. Last updated 6 years ago.

1.7 match 1 star 3.74 score 11 scripts

jessieyeung

rcbayes:Estimate Rogers-Castro Migration Age Schedules with Bayesian Models

A collection of functions to estimate Rogers-Castro migration age schedules using 'Stan'. This model, which describes the fundamental relationship between migration and age in the form of a flexible multi-exponential migration model, was most notably proposed in Rogers and Castro (1978) <doi:10.1068/a100475>.

Maintained by Jessie Yeung. Last updated 1 year ago.

cpp

1.3 match 2 stars 4.60 score 8 scripts

prabhanjan-tattar

gpk:100 Data Sets for Statistics Education

Collection of datasets as prepared by Profs. A.P. Gore, S.A. Paranjape, and M.B. Kulkarni of the Department of Statistics, Poona University, India. With their permission, the first letters of their names form the name of this package; the package has been built by me and made available for the benefit of R users. This collection requires a rich class of models and can be a very useful building block for a beginner.

Maintained by Prabhanjan Tattar. Last updated 12 years ago.

3.6 match 1.69 score 49 scripts

cran

ggscidca:Plotting Decision Curve Analysis with Coloured Bars

Decision curve analysis is a method for evaluating and comparing prediction models that incorporates clinical consequences, requires only the data set on which the models are tested, and can be applied to models that have either continuous or dichotomous results. The 'ggscidca' package adds coloured bars of discriminant relevance to the traditional decision curve, improving its practicality and aesthetics. This method was described by Balachandran VP (2015) <doi:10.1016/S1470-2045(14)71116-7>.

Maintained by Qiang Liu. Last updated 10 months ago.

3.8 match 1.60 score

larsenlab

hlaR:Tools for HLA Data

A streamlined tool for eplet analysis of donor and recipient HLA (human leukocyte antigen) mismatch. Messy, low-resolution HLA typing data is cleaned, and imputed to high-resolution using the NMDP (National Marrow Donor Program) haplotype reference database <https://haplostats.org/haplostats>. High resolution data is analyzed for overall or single antigen eplet mismatch using a reference table (currently supporting 'HLAMatchMaker' <http://www.epitopes.net> versions 2 and 3). Data can enter or exit the workflow at different points depending on the user's aims and initial data quality.

Maintained by Joan Zhang. Last updated 2 years ago.

1.2 match 7 stars 5.15 score 9 scripts

csqsiew

spreadr:Simulating Spreading Activation in a Network

The notion of spreading activation is a prevalent metaphor in the cognitive sciences. This package provides the tools for cognitive scientists and psychologists to conduct computer simulations that implement spreading activation in a network representation. The algorithmic method implemented in 'spreadr' subroutines follows the approach described in Vitevitch, Ercal, and Adagarla (2011, Frontiers), who viewed activation as a fixed cognitive resource that could spread among nodes that were connected to each other via edges or connections (i.e., a network). See Vitevitch, M. S., Ercal, G., & Adagarla, B. (2011). Simulating retrieval from a highly clustered network: Implications for spoken word recognition. Frontiers in Psychology, 2, 369. <doi:10.3389/fpsyg.2011.00369> and Siew, C. S. Q. (2019). spreadr: An R package to simulate spreading activation in a network. Behavior Research Methods, 51, 910-929. <doi:10.3758/s13428-018-1186-5>.

Maintained by Cynthia Siew. Last updated 2 years ago.

cpp

1.5 match 8 stars 3.98 score 12 scripts

cran

SMPracticals:Practicals for Use with Davison (2003) Statistical Models

Contains the datasets and a few functions for use with the practicals outlined in Appendix A of the book Statistical Models (Davison, 2003, Cambridge University Press), which can be found at <doi:10.1017/CBO9780511815850>.

Maintained by Alessandra R. Brazzale. Last updated 1 year ago.

4.0 match 1.48 score 1 dependent

troyhill

coreCT:Programmatic Analysis of Sediment Cores Using Computed Tomography Imaging

Computed tomography (CT) imaging is a powerful tool for understanding the composition of sediment cores. This package streamlines and accelerates the analysis of CT data generated in the context of environmental science. Included are tools for processing raw DICOM images to characterize sediment composition (sand, peat, etc.). Root analyses are also enabled, including measures of external surface area and volumes for user-defined root size classes. For a detailed description of the application of computed tomography imaging for sediment characterization, see: Davey, E., C. Wigand, R. Johnson, K. Sundberg, J. Morris, and C. Roman. (2011) <doi:10.1890/10-2037.1>.

Maintained by Troy D. Hill. Last updated 4 years ago.

biomass computed-tomography sediment sediment-core

1.3 match 3 stars 4.41 score 17 scripts

bioc

hdxmsqc:An R package for quality Control for hydrogen deuterium exchange mass spectrometry experiments

The hdxmsqc package enables us to analyse and visualise the quality of HDX-MS experiments, either as a final quality check before downstream analysis and publication or as part of an iterative procedure to determine the quality of the data. The package builds on the QFeatures and Spectra packages to integrate with other mass-spectrometry data.

Maintained by Oliver M. Crook. Last updated 5 months ago.

qualitycontrol dataimport proteomics massspectrometry metabolomics

1.3 match 4.30 score 2 scripts

sfcheung

modelbpp:Model BIC Posterior Probability

Fits the neighboring models of a fitted structural equation model and assesses the model uncertainty of the fitted model based on BIC posterior probabilities, using the method presented in Wu, Cheung, and Leung (2020) <doi:10.1080/00273171.2019.1574546>.

Maintained by Shu Fai Cheung. Last updated 6 months ago.

lavaan model-comparison model-comparison-and-selection model-selection structural-equation-modeling

1.3 match 4.54 score 2 scripts

tdhock

directlabels:Direct Labels for Multicolor Plots

An extensible framework for automatically placing direct labels onto multicolor 'lattice' or 'ggplot2' plots. Label positions are described using Positioning Methods which can be re-used across several different plots. There are heuristics for examining "trellis" and "ggplot" objects and inferring an appropriate Positioning Method.

Maintained by Toby Dylan Hocking. Last updated 11 months ago.

0.5 match 83 stars 10.62 score 1.8k scripts 16 dependents

jabiru

tictoc:Functions for Timing R Scripts, as Well as Implementations of "Stack" and "StackList" Structures

Code execution timing functions 'tic' and 'toc' that can be nested. One can record all timings while a complex script is running, and examine the values later. It is also possible to instrument the timing calls with custom callbacks. In addition, this package provides class 'Stack', implemented as a vector, and class 'StackList', which is a stack implemented as a list, both of which support operations 'push', 'pop', 'first_element', 'last_element' and 'clear'.

Maintained by Sergei Izrailev. Last updated 12 months ago.

0.5 match 9 stars 10.45 score 12k scripts 47 dependents

ecohealthalliance

doltr:A client for the dolt database

Creates a DBI-compliant interface to dolt databases (<https://www.dolthub.com>). Also manages local dolt server processes, provides convenience functions for dolt versioning, and an RStudio connection pane interface.

Maintained by Noam Ross. Last updated 2 years ago.

1.8 match 17 stars 2.93 score 4 scripts

cjendres1

nhanesA:NHANES Data Retrieval

Utility to retrieve data from the National Health and Nutrition Examination Survey (NHANES) website <https://www.cdc.gov/nchs/nhanes/>.

Maintained by Christopher Endres. Last updated 2 months ago.

nhanes

0.5 match 59 stars 9.37 score 239 scripts

byandell-sysgen

qtl2ggplot:Data Visualization for QTL Experiments

Functions to plot QTL (quantitative trait loci) analysis results and related diagnostics. Part of 'qtl2', an upgrade of the 'qtl' package to better handle high-dimensional data and complex cross designs.

Maintained by Brian S Yandell. Last updated 1 year ago.

cpp

1.3 match 5 stars 3.93 score 17 scripts

cran

gammi:Generalized Additive Mixed Model Interface

An interface for fitting generalized additive models (GAMs) and generalized additive mixed models (GAMMs) using the 'lme4' package as the computational engine, as described in Helwig (2024) <doi:10.3390/stats7010003>. Supports default and formula methods for model specification, additive and tensor product splines for capturing nonlinear effects, and automatic determination of spline type based on the class of each predictor. Includes an S3 plot method for visualizing the (nonlinear) model terms, an S3 predict method for forming predictions from a fit model, and an S3 summary method for conducting significance testing using the Bayesian interpretation of a smoothing spline.

Maintained by Nathaniel E. Helwig. Last updated 2 months ago.

3.8 match 1.30 score

calvintchi

hierBipartite:Bipartite Graph-Based Hierarchical Clustering

Bipartite graph-based hierarchical clustering performs hierarchical clustering of groups of samples based on association patterns between two sets of variables. It is developed for pharmacogenomic datasets and datasets sharing the same data structure. In the context of pharmacogenomic datasets, the samples are cell lines, and the two sets of variables are typically expression levels and drug sensitivity values. For this method, sparse canonical correlation analysis from Lee, W., Lee, D., Lee, Y. and Pawitan, Y. (2011) <doi:10.2202/1544-6115.1638> is first applied to extract association patterns for each group of samples. Then, a nuclear norm-based dissimilarity measure is used to construct a dissimilarity matrix between groups based on the extracted associations. Finally, hierarchical clustering is applied.

Maintained by Calvin Chi. Last updated 4 years ago.

1.3 match 1 star 3.70 score 4 scripts

cran

Platypus:Single-Cell Immune Repertoire and Gene Expression Analysis

We present 'Platypus', an open-source software platform providing a user-friendly interface to investigate B-cell receptor and T-cell receptor repertoires from scSeq experiments. 'Platypus' provides a framework to automate and ease the analysis of single-cell immune repertoires while also incorporating transcriptional information involving unsupervised clustering, gene expression and gene ontology. This R version of 'Platypus' is part of the 'ePlatypus' ecosystem for computational analysis of immunogenomics data: Yermanos et al. (2021) <doi:10.1093/nargab/lqab023>, Cotet et al. (2023) <doi:10.1093/bioinformatics/btad553>.

Maintained by Alexander Yermanos. Last updated 5 months ago.

1.3 match 3.70 score

forestscientist

StemAnalysis:Reconstructing Tree Growth and Carbon Accumulation with Stem Analysis Data

Use stem analysis data to reconstruct tree growth and carbon accumulation. Users can independently or in combination perform a number of standard tasks for any tree species. (i) Age class determination. (ii) The cumulative growth, mean annual increment, and current annual increment of diameter at breast height (DBH) with bark, tree height, and stem volume with bark are estimated. (iii) Tree biomass and carbon storage are estimated from volume and allometric models. (iv) Height-diameter relationship is fitted with nonlinear models, if diameter at breast height (DBH) or tree height are available, which can be used to retrieve tree height and diameter at breast height (DBH). <https://github.com/forestscientist/StemAnalysis>.

Maintained by Huili Wu. Last updated 2 years ago.

1.0 match 4 stars 4.38 score 12 scripts

danieleweeks

Mega2R:Accessing and Processing a 'Mega2' Genetic Database

Uses as input genetic data that have been reformatted and stored in a 'SQLite' database; this database is initially created by the standalone 'mega2' C++ program (available freely from <https://watson.hgen.pitt.edu/register/>). Loads and manipulates data frames containing genotype, phenotype, and family information from the input 'SQLite' database, and decompresses needed subsets of the genotype data, on the fly, in a memory efficient manner. We have also created several more functions that illustrate how to use the data frames as well as perform useful tasks: these permit one to run the 'pedgene' package to carry out gene-based association tests on family data using selected marker subsets, to run the 'SKAT' package to carry out gene-based association tests using selected marker subsets, to run the 'famSKATRC' package to carry out gene-based association tests on families (optionally) and with rare or common variants using selected marker subsets, to output the 'Mega2R' data as a VCF file and related files (for phenotype and family data), and to convert the data frames into CoreArray Genomic Data Structure (GDS) format.

Maintained by Daniel E. Weeks. Last updated 1 year ago.

genetics cpp

2.3 match 2.00 score 8 scripts

cecileproust-lima

NormPsy:Normalisation of Psychometric Tests

Functions for normalizing psychometric test scores. The normalization aims at correcting the metrological properties of the psychometric tests such as the ceiling and floor effects and the curvilinearity (unequal interval scaling). Functions to compute and plot predictions in the natural scale of the psychometric test from the estimates of a linear mixed model estimated on the normalized scores are also provided. See Philipps et al (2014) <doi:10.1159/000365637> for details.

Maintained by Cecile Proust-Lima. Last updated 6 years ago.

fortran

1.7 match 2.36 score 23 scripts

cran

hypergate:Machine Learning of Hyperrectangular Gating Strategies for High-Dimensional Cytometry

Given a high-dimensional dataset that typically represents a cytometry dataset, and a subset of the datapoints, this algorithm outputs a hyperrectangle so that datapoints within the hyperrectangle best correspond to the specified subset. In essence, this allows the conversion of clustering algorithms' outputs to gating strategies.

Maintained by Etienne Becht. Last updated 1 year ago.

1.5 match 2.70 score

gillian-earthscope

IRISSeismic:Classes and Methods for Seismic Data Analysis

Provides classes and methods for seismic data analysis. The base classes and methods are inspired by the Python code found in the 'ObsPy' Python toolbox <https://github.com/obspy/obspy>. Additional classes and methods support data returned by web services provided by EarthScope <https://service.earthscope.org/>.

Maintained by Gillian Sharer. Last updated 3 months ago.

1.3 match 3.18 score 50 scripts 1 dependent

solivella

lda:Collapsed Gibbs Sampling Methods for Topic Models

Implements latent Dirichlet allocation (LDA) and related models. This includes (but is not limited to) sLDA, corrLDA, and the mixed-membership stochastic blockmodel. Inference for all of these models is implemented via a fast collapsed Gibbs sampler written in C. Utility functions for reading/writing data typically used in topic models, as well as tools for examining posterior distributions are also included.

Maintained by Santiago Olivella. Last updated 11 months ago.

0.5 match 7.62 score 548 scripts 11 dependents

cran

edcpR:Ecological Data Collection and Processing Package

This is the course package for the exercise portion of the "Ecological Data Collection and Processing" course.

Maintained by Ward Fonteyn. Last updated 3 years ago.

1.5 match 2.48 score

quicklizard99

cheddar:Analysis and Visualisation of Ecological Communities

Provides a flexible, extendable representation of an ecological community and a range of functions for analysis and visualisation, focusing on food web, body mass and numerical abundance data. Allows inter-web comparisons such as examining changes in community structure over environmental, temporal or spatial gradients.

Maintained by Lawrence Hudson. Last updated 8 months ago.

cpp

0.5 match 15 stars 6.86 score 195 scripts

mini-pw

SerolyzeR:Reading, Quality Control and Preprocessing of MBA (Multiplex Bead Assay) Data

Speeds up the process of loading raw data from MBA (Multiplex Bead Assay) examinations, performs quality control checks, and automatically normalises the data, preparing it for more advanced, downstream tasks. The main objective of the package is to create a simple environment for a user who does not necessarily have experience with the R language. The package is developed within the project 'PvSTATEM', which is an international project aiming for malaria elimination.

Maintained by Tymoteusz Kwiecinski. Last updated 18 days ago.

0.5 match 4 stars 6.68 score

mini-pw

PvSTATEM:Reading, Quality Control and Preprocessing of MBA (Multiplex Bead Assay) Data

Speeds up the process of loading raw data from MBA (Multiplex Bead Assay) examinations, performs quality control checks, and automatically normalises the data, preparing it for more advanced, downstream tasks. The main objective of the package is to create a simple environment for a user who does not necessarily have experience with the R language. The package is developed within the project of the same name, 'PvSTATEM', which is an international project aiming for malaria elimination.

Maintained by Tymoteusz Kwiecinski. Last updated 19 days ago.

0.5 match 3 stars 6.56 score 7 scripts

ksawicka

spup:Spatial Uncertainty Propagation Analysis

Uncertainty propagation analysis in spatial environmental modelling following methodology described in Heuvelink et al. (2007) <doi:10.1080/13658810601063951> and Brown and Heuvelink (2007) <doi:10.1016/j.cageo.2006.06.015>. The package provides functions for examining the uncertainty propagation starting from input data and model parameters, via the environmental model onto model outputs. The functions include uncertainty model specification, stochastic simulation and propagation of uncertainty using Monte Carlo (MC) techniques. Uncertain variables are described by probability distributions. Both numerical and categorical data types are handled. Spatial auto-correlation within an attribute and cross-correlation between attributes are accommodated. The MC realizations may be used as input to the environmental models called from R, or externally.

Maintained by Kasia Sawicka. Last updated 1 year ago.

monte-carlo spatial uncertainty-analysis uncertainty-propagation

0.5 match 9 stars 6.31 score 57 scripts

iheid-library

iheiddown:For Writing Geneva Graduate Institute Documents

A set of tools for writing documents according to Geneva Graduate Institute conventions and regulations. The most common use is for writing and compiling theses or thesis chapters, as drafts or for examination with correct preamble formatting. However, the package also allows users to create HTML presentation slides with 'xaringan', complete problem sets, format posters, and, for course instructors, prepare a syllabus. The package includes additional functions for institutional color palettes, an institutional 'ggplot' theme, a function for counting manuscript words, and a bibliographical analysis toolkit.

Maintained by James Hollway. Last updated 2 years ago.

rmarkdown thesis university

0.5 match 11 stars 6.14 score 5 scripts

shgrky

MultiFit:Multiscale Fisher's Independence Test for Multivariate Dependence

Test for independence of two random vectors, learn and report the dependency structure. For more information, see Gorsky, Shai and Li Ma, Multiscale Fisher's Independence Test for Multivariate Dependence, Biometrika, accepted, January 2022.

Maintained by S. Gorsky. Last updated 3 years ago.

cpp

1.3 match 2.48 score 7 scripts

tim-tu

weibulltools:Statistical Methods for Life Data Analysis

Provides statistical methods and visualizations that are often used in reliability engineering. Comprises a compact and easily accessible set of methods and visualization tools that make the examination and adjustment as well as the analysis and interpretation of field data (and bench tests) as simple as possible. Non-parametric estimators like Median Ranks, Kaplan-Meier (Abernethy, 2006, <ISBN:978-0-9653062-3-2>), Johnson (Johnson, 1964, <ISBN:978-0444403223>), and Nelson-Aalen for failure probability estimation within samples that contain failures as well as censored data are included. The package supports methods like Maximum Likelihood and Rank Regression (Genschel and Meeker, 2010, <DOI:10.1080/08982112.2010.503447>) for the estimation of multiple parametric lifetime distributions, as well as the computation of confidence intervals of quantiles and probabilities using the delta method related to Fisher's confidence intervals (Meeker and Escobar, 1998, <ISBN:9780471673279>) and the beta-binomial confidence bounds. If desired, mixture model analysis can be done with segmented regression and the EM algorithm. Besides the well-known Weibull analysis, the package also contains Monte Carlo methods for the correction and completion of imprecisely recorded or unknown lifetime characteristics (Verband der Automobilindustrie e.V. (VDA), 2016, <ISSN:0943-9412>). Plots are created statically ('ggplot2') or interactively ('plotly') and can be customized with functions of the respective visualization package. The graphical technique of probability plotting as well as the addition of regression lines and confidence bounds to existing plots are supported.

Maintained by Tim-Gunnar Hensel. Last updated 2 years ago.

field-data-analysis interactive-visualizations plotly reliability-analysis weibull-analysis weibulltools openblas cpp

0.5 match 13 stars 6.15 score 54 scripts

adam-king

brea:Bayesian Recurrent Events Analysis

Functions to produce MCMC samples for posterior inference in semiparametric Bayesian discrete time competing risks recurrent events models and multistate models.

Maintained by Adam J King. Last updated 8 months ago.

1.3 match 2.30 score 3 scripts

prajwalkpatil

VedicDateTime:Vedic Calendar System

Provides a platform for the Vedic calendar system, with several functionalities to facilitate conversion between the Gregorian and Vedic calendar systems, and is helpful in examining its impact in the time series analysis domain.

Maintained by Neeraj Dhanraj Bokde. Last updated 1 year ago.

calendar panchanga time-series vedic

0.5 match 6 stars 5.84 score 58 scripts

stefan-schroedl

plotluck:'ggplot2' Version of "I'm Feeling Lucky!"

Examines the characteristics of a data frame and a formula to automatically choose the most suitable type of plot out of the following supported options: scatter, violin, box, bar, density, hexagon bin, spine plot, and heat map. The aim of the package is to let the user focus on what to plot, rather than on the "how" during exploratory data analysis. It also automates handling of observation weights, logarithmic axis scaling, reordering of factor levels, and overlaying smoothing curves and median lines. Plots are drawn using 'ggplot2'.

Maintained by Stefan Schroedl. Last updated 2 years ago.

automatic box-plot boxplot cleveland data-visualization density-plot ggplot2 heatmap hexbin plot scatter scatter-plot spine-plot violin-plot visualization

0.5 match 52 stars 5.96 score 35 scripts

mindthegap-erc

StratPal:Stratigraphic Paleobiology Modeling Pipelines

The fossil record is a joint expression of ecological, taphonomic, evolutionary, and stratigraphic processes (Holland and Patzkowsky, 2012, ISBN:978-0226649382). This package allows users to simulate biological processes in the time domain (e.g., trait evolution, fossil abundance), and to examine how their expression in the rock record (stratigraphic domain) is influenced by age-depth models, ecological niche models, and taphonomic effects. Functions simulating common processes used in modeling trait evolution or event type data such as first/last occurrences are provided and can be used standalone or as part of a pipeline. The package comes with example data sets and tutorials in several vignettes, which can be used as a template to set up one's own simulation.

Maintained by Niklas Hohmann. Last updated 26 days ago.

palaeobiology palaeontology paleobiology paleontology stratigraphic-paleobiology stratigraphy

0.5 match 1 star 5.88 score 18 scripts

certara-jcraig

Certara.RsNLME.ModelBuilder:Pharmacometric Model Building Using 'shiny'

Develop Nonlinear Mixed Effects (NLME) models for pharmacometrics using a 'shiny' interface. The Pharmacometric Modeling Language (PML) code updates in real time given changes to user inputs. Models can be executed using the 'Certara.RsNLME' package. Additional support to generate the underlying 'Certara.RsNLME' code to recreate the corresponding model in R is provided in the user interface.

Maintained by James Craig. Last updated 3 months ago.

1.7 match 1.70 score 8 scripts

lleisong

itsdm:Isolation Forest-Based Presence-Only Species Distribution Modeling

Collection of R functions to do purely presence-only species distribution modeling with isolation forest (iForest) and its variations such as Extended isolation forest and SCiForest. See the details of these methods in references: Liu, F.T., Ting, K.M. and Zhou, Z.H. (2008) <doi:10.1109/ICDM.2008.17>, Hariri, S., Kind, M.C. and Brunner, R.J. (2019) <doi:10.1109/TKDE.2019.2947676>, Liu, F.T., Ting, K.M. and Zhou, Z.H. (2010) <doi:10.1007/978-3-642-15883-4_18>, Guha, S., Mishra, N., Roy, G. and Schrijvers, O. (2016) <https://proceedings.mlr.press/v48/guha16.html>, Cortes, D. (2021) <arXiv:2110.13402>. Additionally, Shapley values are used to explain model inputs and outputs. See details in references: Shapley, L.S. (1953) <doi:10.1515/9781400881970-018>, Lundberg, S.M. and Lee, S.I. (2017) <https://dl.acm.org/doi/abs/10.5555/3295222.3295230>, Molnar, C. (2020) <ISBN:978-0-244-76852-2>, Štrumbelj, E. and Kononenko, I. (2014) <doi:10.1007/s10115-013-0679-x>. itsdm also provides functions to diagnose variable response, analyze variable importance, draw spatial dependence of variables and examine variable contribution. As utilities, the package includes a few functions to download bioclimatic variables including 'WorldClim' version 2.0 (see Fick, S.E. and Hijmans, R.J. (2017) <doi:10.1002/joc.5086>) and 'CMCC-BioClimInd' (see Noce, S., Caporaso, L. and Santini, M. (2020) <doi:10.1038/s41597-020-00726-5>).

Maintained by Lei Song. Last updated 2 years ago.

isolation-forest outlier-detection presence-onlymodel shapley-value species-distribution-modelling

0.5 match 4 stars 5.59 score 65 scripts

thlytras

rspiro:Implementation of Spirometry Equations

Implementation of various spirometry equations in R, currently the GLI-2012 (Global Lung Initiative; Quanjer et al. 2012 <doi:10.1183/09031936.00080312>), the race-neutral GLI global 2022 (Global Lung Initiative; Bowerman et al. 2023 <doi:10.1164/rccm.202205-0963OC>), the NHANES3 (National Health and Nutrition Examination Survey; Hankinson et al. 1999 <doi:10.1164/ajrccm.159.1.9712108>) and the JRS 2014 (Japanese Respiratory Society; Kubota et al. 2014 <doi:10.1016/j.resinv.2014.03.003>) equations. The GLI-2017 diffusing capacity equations <doi:10.1183/13993003.00010-2017> are also implemented. Contains user-friendly functions to calculate predicted and LLN (Lower Limit of Normal) values for different spirometric parameters such as FEV1 (Forced Expiratory Volume in 1 second), FVC (Forced Vital Capacity), etc., and to convert absolute spirometry measurements to percent (%) predicted and z-scores.

Maintained by Theodore Lytras. Last updated 1 day ago.

0.5 match 15 stars 5.53 score 28 scripts

bioc

scDotPlot:Cluster a Single-cell RNA-seq Dot Plot

Dot plots of single-cell RNA-seq data allow for an examination of the relationships between cell groupings (e.g. clusters) and marker gene expression. The scDotPlot package offers a unified approach to perform a hierarchical clustering analysis and add annotations to the columns and/or rows of a scRNA-seq dot plot. It works with SingleCellExperiment and Seurat objects as well as data frames.

Maintained by Benjamin I Laufer. Last updated 5 months ago.

software visualization differentialexpression geneexpression transcription rnaseq singlecell sequencing clustering

0.5 match 7 stars 5.39 score 2 scripts

datalorax

esvis:Visualization and Estimation of Effect Sizes

A variety of methods are provided to estimate and visualize distributional differences in terms of effect sizes. Particular emphasis is upon evaluating differences between two or more distributions across the entire scale, rather than at a single point (e.g., differences in means). For example, Probability-Probability (PP) plots display the difference between two or more distributions, matched by their empirical CDFs (see Ho and Reardon, 2012; <doi:10.3102/1076998611411918>), allowing for examinations of where on the scale distributional differences are largest or smallest. The area under the PP curve (AUC) is an effect-size metric, corresponding to the probability that a randomly selected observation from the x-axis distribution will have a higher value than a randomly selected observation from the y-axis distribution. Binned effect size plots are also available, in which the distributions are split into bins (set by the user) and separate effect sizes (Cohen's d) are produced for each bin - again providing a means to evaluate the consistency (or lack thereof) of the difference between two or more distributions at different points on the scale. Evaluation of empirical CDFs is also provided, with built-in arguments for providing annotations to help evaluate distributional differences at specific points (e.g., semi-transparent shading). All functions take a consistent argument structure. Calculation of specific effect sizes is also possible. The following effect sizes are estimable: (a) Cohen's d, (b) Hedges' g, (c) percentage above a cut, (d) transformed (normalized) percentage above a cut, (e) area under the PP curve, and (f) the V statistic (see Ho, 2009; <doi:10.3102/1076998609332755>), which essentially transforms the area under the curve to standard deviation units. By default, effect sizes are calculated for all possible pairwise comparisons, but a reference group (distribution) can be specified.

Maintained by Daniel Anderson. Last updated 5 years ago.

visualization

0.5 match 51 stars 5.43 score 53 scripts
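
Three of the effect sizes named in the esvis entry can be sketched with the standard textbook formulas. This is an illustration only, not esvis code: the function names are hypothetical, ties are counted as half a win (an assumption), and the probability is computed directly from pairwise comparisons, following the entry's reading of the area under the PP curve, rather than by integrating a plotted curve.

```python
import math
from statistics import mean, variance

def cohens_d(x, y):
    """Standardized mean difference using the pooled sample variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)

def hedges_g(x, y):
    """Cohen's d with the usual small-sample correction 1 - 3 / (4 * df - 1)."""
    n = len(x) + len(y)
    # df = n - 2, so 4 * df - 1 simplifies to 4 * n - 9.
    return cohens_d(x, y) * (1.0 - 3.0 / (4.0 * n - 9.0))

def prob_superiority(x, y):
    """Share of (x, y) pairs in which the x value is higher; ties count half."""
    wins = sum((xi > yi) + 0.5 * (xi == yi) for xi in x for yi in y)
    return wins / (len(x) * len(y))

group_a = [2, 4, 6]
group_b = [1, 3, 5]
print(cohens_d(group_a, group_b))
print(hedges_g(group_a, group_b))
print(prob_superiority(group_a, group_b))
```

The pairwise probability depends only on the ordering of the observations, whereas Cohen's d and Hedges' g depend on means and variances, which is one reason a package might report both kinds of measure.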

hurlej2

echo.find:Finding Rhythms Using Extended Circadian Harmonic Oscillators (ECHO)

Provides a function (echo_find()) designed to find rhythms from data using extended harmonic oscillators. For more information, see H. De los Santos et al. (2020) <doi:10.1093/bioinformatics/btz617>.

Maintained by Jennifer Hurley. Last updated 5 years ago.

1.3 match 2.00 score 5 scripts

gfcm

MSSQL:Tools to Work with Microsoft SQL Server Databases via 'RODBC'

Tools that extend the functionality of the 'RODBC' package to work with Microsoft SQL Server databases. Makes it easier to browse the database and examine individual tables and views.

Maintained by Arni Magnusson. Last updated 5 months ago.

0.5 match 2 stars 5.00 score 2 scripts

tesselle

tabula:Analysis and Visualization of Archaeological Count Data

An easy way to examine archaeological count data. This package provides several tests and measures of diversity: heterogeneity and evenness (Brillouin, Shannon, Simpson, etc.), richness and rarefaction (Chao1, Chao2, ACE, ICE, etc.), turnover and similarity (Brainerd-Robinson, etc.). It makes it easy to visualize count data and statistical thresholds: rank vs abundance plots, heatmaps, Ford (1962) and Bertin (1977) diagrams, etc.

Maintained by Nicolas Frerebeau. Last updated 14 days ago.

data-visualization archaeology archaeological-science

0.5 match 5.10 score 38 scripts 1 dependents
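
Two of the heterogeneity measures named in the tabula entry, plus an evenness ratio, can be sketched with their textbook definitions. This is a minimal illustration that does not use tabula; the function names are hypothetical, the counts are made up, and the package may implement different variants (for example bias-corrected estimators).

```python
import math

def shannon(counts):
    """Shannon heterogeneity H = -sum(p * ln p) over the non-empty categories."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def simpson_dominance(counts):
    """Simpson dominance D = sum(p^2); higher values mean less diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

def shannon_evenness(counts):
    """Shannon H divided by its maximum ln(richness); NaN below two categories."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return float("nan")
    return shannon(counts) / math.log(richness)

# Made-up counts of four artefact types in one assemblage.
assemblage = [10, 10, 10, 10]
print(shannon(assemblage), simpson_dominance(assemblage), shannon_evenness(assemblage))
```

A perfectly even assemblage such as this one reaches the maximum Shannon value for its richness, so its evenness is 1 and its Simpson dominance is 1 divided by the number of types.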

dereksonderegger

SiZer:Significant Zero Crossings

Calculates and plots the SiZer map for scatterplot data. A SiZer map is a way of examining when the p-th derivative of a scatterplot-smoother is significantly negative, possibly zero or significantly positive across a range of smoothing bandwidths.

Maintained by Derek Sonderegger. Last updated 3 years ago.

0.5 match 1 stars 5.00 score 33 scripts 2 dependents

ruben-hernan

CatDyn:Fishery Stock Assessment by Catch Dynamics Models

Based on fishery Catch Dynamics instead of fish Population Dynamics (hence CatDyn), the package uses high-frequency or medium-frequency catch in biomass or numbers, nominal fishing effort, and mean fish body weight by time step, from one or two fishing fleets, to estimate stock abundance, natural mortality rate, and fishing operational parameters. It includes methods for data organization, standard exploratory and analytical plots, and predictions, for 100 types of models of increasing complexity and 72 likelihood models for the data.

Maintained by Ruben H. Roa-Ureta. Last updated 6 years ago.

1.8 match 1 stars 1.45 score 28 scripts

mmcarli

groupWQS:Grouped Weighted Quantile Sum Regression

Fits weighted quantile sum (WQS) regressions for one or more chemical groups with continuous or binary outcomes. Wheeler D, Czarnota J. (2016) <doi:10.1289/isee.2016.4698>.

Maintained by Matthew Carli. Last updated 5 years ago.

jags cpp

1.3 match 2.00 score 4 scripts

jli-stat

SparseMDC:Implementation of SparseMDC Algorithm

Implements the algorithm described in Barron, M., and Li, J. (Not yet published). This algorithm clusters samples from multiple ordered populations, links the clusters across the conditions and identifies marker genes for these changes. The package was designed for scRNA-Seq data but is also applicable to many other data types, just replace cells with samples and genes with variables. The package also contains functions for estimating the parameters for SparseMDC as outlined in the paper. We recommend that users further select their marker genes using the magnitude of the cluster centers.

Maintained by Jun Li. Last updated 7 years ago.

1.3 match 2.00 score

meghapsimatrix

wildmeta:Cluster Wild Bootstrapping for Meta-Analysis

Conducts single coefficient tests and multiple-contrast hypothesis tests of meta-regression models using cluster wild bootstrapping, based on methods examined in Joshi, Pustejovsky, and Beretvas (2022) <DOI:10.1002/jrsm.1554>.

Maintained by Megha Joshi. Last updated 2 months ago.

0.5 match 9 stars 4.82 score 14 scripts

lbbe-software

rbioacc:Inference and Prediction of ToxicoKinetic (TK) Models

The MOSAICbioacc application is a turnkey package providing bioaccumulation factors (BCF/BMF/BSAF) from a toxicokinetic (TK) model fitted to accumulation-depuration data. It is designed to fulfil the requirements of regulators when examining applications for market authorization of active substances. See Ratier et al. (2021) <doi:10.1101/2021.09.08.459421>.

Maintained by Virgile Baudrot. Last updated 1 year ago.

cpp

0.5 match 4.78 score 8 scripts