R-universe search: replicability

bschneidr

svrep:Tools for Creating, Updating, and Analyzing Survey Replicate Weights

Provides tools for creating and working with survey replicate weights, extending functionality of the 'survey' package from Lumley (2004) <doi:10.18637/jss.v009.i08>. Implements bootstrap methods for complex surveys, including the generalized survey bootstrap as described by Beaumont and Patak (2012) <doi:10.1111/j.1751-5823.2011.00166.x>. Methods are provided for applying nonresponse adjustments to both full-sample and replicate weights as described by Rust and Rao (1996) <doi:10.1177/096228029600500305>. Implements methods for sample-based calibration described by Opsomer and Erciulescu (2021) <https://www150.statcan.gc.ca/n1/pub/12-001-x/2021002/article/00006-eng.htm>. Diagnostic functions are included to compare weights and weighted estimates from different sets of replicate weights.

Maintained by Ben Schneider. Last updated 7 days ago.

54.6 match 8 stars 8.12 score 54 scripts 3 dependents

mightymetrika

npboottprm:Nonparametric Bootstrap Test with Pooled Resampling

Addressing crucial research questions often necessitates a small sample size due to factors such as distinctive target populations, rarity of the event under study, time and cost constraints, ethical concerns, or group-level unit of analysis. Many readily available analytic methods, however, do not accommodate small sample sizes, and the choice of the best method can be unclear. The 'npboottprm' package enables the execution of nonparametric bootstrap tests with pooled resampling to help fill this gap. Grounded in the statistical methods for small sample size studies detailed in Dwivedi, Mallawaarachchi, and Alvarado (2017) <doi:10.1002/sim.7263>, the package facilitates a range of statistical tests, encompassing independent t-tests, paired t-tests, and one-way Analysis of Variance (ANOVA) F-tests. The nonparboot() function undertakes essential computations, yielding detailed outputs which include test statistics, effect sizes, confidence intervals, and bootstrap distributions. Further, 'npboottprm' incorporates an interactive 'shiny' web application, nonparboot_app(), offering intuitive, user-friendly data exploration.

Maintained by Mackson Ncube. Last updated 6 months ago.

datascience nonparametric statistics

93.2 match 1 star 4.32 score 5 scripts 2 dependents

r-forge

survey:Analysis of Complex Survey Samples

Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.

Maintained by Thomas Lumley. Last updated 6 months ago.

cpp

15.6 match 1 star 13.93 score 13k scripts 235 dependents

helmut01

replicateBE:Average Bioequivalence with Expanding Limits (ABEL)

Performs comparative bioavailability calculations for Average Bioequivalence with Expanding Limits (ABEL). Implemented are 'Method A' / 'Method B' and the detection of outliers. If the design allows, the empiric Type I Error can be assessed and alpha iteratively adjusted to control the consumer risk. Average Bioequivalence - optionally with a tighter (narrow therapeutic index drugs) or wider acceptance range (South Africa: Cmax) - is implemented as well.

Maintained by Helmut Schütz. Last updated 3 years ago.

bioequivalence biostatistics

32.6 match 9 stars 4.65 score 10 scripts

crsuzh

ReplicationSuccess:Design and Analysis of Replication Studies

Provides utilities for the design and analysis of replication studies. Features both traditional methods based on statistical significance and more recent methods such as the sceptical p-value; Held L. (2020) <doi:10.1111/rssa.12493>, Held et al. (2022) <doi:10.1214/21-AOAS1502>, Micheloud et al. (2023) <doi:10.1111/stan.12312>. Also provides related methods including the harmonic mean chi-squared test; Held, L. (2020) <doi:10.1111/rssc.12410>, and intrinsic credibility; Held, L. (2019) <doi:10.1098/rsos.181534>. Contains datasets from five large-scale replication projects.

Maintained by Samuel Pawel. Last updated 5 months ago.

30.0 match 1 star 5.02 score 35 scripts

bioc

ChIPpeakAnno:Batch annotation of the peaks identified from either ChIP-seq, ChIP-chip experiments, or any experiments that result in large number of genomic interval data

The package encompasses a range of functions for identifying the closest gene, exon, miRNA, or custom features—such as highly conserved elements and user-supplied transcription factor binding sites. Additionally, users can retrieve sequences around the peaks and obtain enriched Gene Ontology (GO) or Pathway terms. In version 2.0.5 and beyond, new functionalities have been introduced. These include features for identifying peaks associated with bi-directional promoters along with summary statistics (peaksNearBDP), summarizing motif occurrences in peaks (summarizePatternInPeaks), and associating additional identifiers with annotated peaks or enrichedGO (addGeneIDs). The package integrates with various other packages such as biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest, and stat to enhance its analytical capabilities.

Maintained by Jianhong Ou. Last updated 2 months ago.

annotation chipseq chipchip

15.1 match 8.75 score 584 scripts 6 dependents

hadley

plyr:Tools for Splitting, Applying and Combining Data

A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.

Maintained by Hadley Wickham. Last updated 4 months ago.

cpp

7.2 match 500 stars 18.16 score 83k scripts 3.3k dependents

samch93

BayesRepDesign:Bayesian Design of Replication Studies

Provides functionality for determining the sample size of replication studies using Bayesian design approaches in the normal-normal hierarchical model (Pawel et al., 2023) <doi:10.1037/met0000604>.

Maintained by Samuel Pawel. Last updated 1 year ago.

38.7 match 3 stars 3.18 score 4 scripts

bioc

MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

8.0 match 71 stars 14.95 score 670 scripts 127 dependents

spatstat

spatstat.data:Datasets for 'spatstat' Family

Contains all the datasets for the 'spatstat' family of packages.

Maintained by Adrian Baddeley. Last updated 4 hours ago.

kernel-density point-process spatial-analysis spatial-data spatial-data-analysis spatstat statistical-analysis statistical-methods statistical-tests statistics

10.1 match 6 stars 11.07 score 186 scripts 228 dependents

calvagone

campsismod:Generic Implementation of a PK/PD Model

A generic, easy-to-use and expandable implementation of a pharmacokinetic (PK) / pharmacodynamic (PD) model based on the S4 class system. This package allows the user to read/write a pharmacometric model from/to files and adapt it further on the fly in the R environment. For this purpose, this package provides an intuitive API to add, modify or delete equations, ordinary differential equations (ODEs), model parameters or compartment properties (like infusion duration or rate, bioavailability and initial values). Finally, this package also provides a useful export of the model for use with simulation packages 'rxode2' and 'mrgsolve'. This package is designed and intended to be used with package 'campsis', a PK/PD simulation platform built on top of 'rxode2' and 'mrgsolve'.

Maintained by Nicolas Luyckx. Last updated 1 month ago.

16.4 match 5 stars 6.64 score 42 scripts 1 dependent

dgbonett

vcmeta:Varying Coefficient Meta-Analysis

Implements functions for varying coefficient meta-analysis methods. These methods do not assume effect size homogeneity. Subgroup effect size comparisons, general linear effect size contrasts, and linear models of effect sizes based on varying coefficient methods can be used to describe effect size heterogeneity. Varying coefficient meta-analysis methods do not require the unrealistic assumptions of the traditional fixed-effect and random-effects meta-analysis methods. For details see: Statistical Methods for Psychologists, Volume 5, <https://dgbonett.sites.ucsc.edu/>.

Maintained by Douglas G. Bonett. Last updated 8 months ago.

34.8 match 1 star 3.00 score 8 scripts

replicable

htm2txt:Convert Html into Text

Converts an HTML document to plain text by stripping all HTML tags.

Maintained by Sangchul Park. Last updated 4 months ago.

20.0 match 3 stars 5.05 score 55 scripts 2 dependents

detlew

PowerTOST:Power and Sample Size for (Bio)Equivalence Studies

Contains functions to calculate power and sample size for various study designs used in bioequivalence studies. Use known.designs() to see the designs supported. Power and sample size can be obtained based on different methods, amongst them prominently the TOST procedure (two one-sided t-tests). See README and NEWS for further information.

Maintained by Detlew Labes. Last updated 12 months ago.

10.3 match 20 stars 9.61 score 112 scripts 4 dependents

pcruniversum

chipPCR:Toolkit of Helper Functions to Pre-Process Amplification Data

A collection of functions to pre-process amplification curve data from polymerase chain reaction (PCR) or isothermal amplification reactions. Contains functions to normalize and baseline amplification curves, to detect both the start and end of an amplification reaction, several smoothers (e.g., LOWESS, moving average, cubic splines, Savitzky-Golay), a function to detect false positive amplification reactions and a function to determine the amplification efficiency. Quantification point (Cq) methods include the first (FDM) and second approximate derivative maximum (SDM) methods (calculated by a 5-point-stencil) and the cycle threshold method. Data sets of experimental nucleic acid amplification systems ('VideoScan HCU', capillary convective PCR (ccPCR)) and commercial systems are included. Amplification curves were generated by helicase dependent amplification (HDA), ccPCR or PCR. As detection system intercalating dyes (EvaGreen, SYBR Green) and hydrolysis probes (TaqMan) were used. For more information see: Roediger et al. (2015) <doi:10.1093/bioinformatics/btv205>.

Maintained by Stefan Roediger. Last updated 4 years ago.

14.4 match 8 stars 6.84 score 97 scripts 1 dependent

eddelbuettel

nanotime:Nanosecond-Resolution Time Support for R

Full 64-bit resolution date and time functionality with nanosecond granularity is provided, with easy transition to and from the standard 'POSIXct' type. Three additional classes offer interval, period and duration functionality for nanosecond-resolution timestamps.

Maintained by Dirk Eddelbuettel. Last updated 1 month ago.

datetime datetimes nanosecond-resolution nanoseconds cpp

9.0 match 53 stars 10.91 score 134 scripts 17 dependents

didiermurillof

FielDHub:A Shiny App for Design of Experiments in Life Sciences

A shiny design of experiments (DOE) app that aids in the creation of traditional, un-replicated, augmented and partially-replicated designs applied to agriculture, plant breeding, forestry, animal and biological sciences.

Maintained by Didier Murillo. Last updated 8 months ago.

agricultural breeding design doe experimental plantbreeding shiny

10.8 match 48 stars 9.10 score 70 scripts 1 dependent

wraff

wrMisc:Analyze Experimental High-Throughput (Omics) Data

This collection of diverse functions facilitates the efficient treatment and convenient analysis of experimental high-throughput (omics) data. Several functions address advanced object conversions, such as manipulating lists of lists or lists of arrays, reorganizing lists into arrays or separate vectors, and merging multiple entries. Another set of functions provides speed-optimized calculation of the standard deviation (sd), coefficient of variation (CV), or standard error of the mean (SEM) for data in matrices, or of means per line with respect to additional grouping (e.g., n groups of replicates). A group of functions facilitates dealing with non-redundant information by indexing unique entries, adding counters to redundant entries, or eliminating lines with respect to redundancy in a given reference column. Help is provided to identify very closely matching numeric values to generate (partial) distance matrices for very big data in a memory-efficient manner or to reduce the complexity of large datasets by combining very close values. Other functions help align a matrix or data.frame to a reference using partial matching, or mine an experimental setup to extract patterns of replicate samples. Large experimental datasets often need additional filtering, and adequate functions are provided. Convenient data normalization is supported in various modes; parameter estimation via permutations or bootstrap, as well as flexible testing of multiple pairwise combinations using the framework of 'limma', is also provided. Batch reading (or writing) of sets of files and combining data into arrays is supported as well.

Maintained by Wolfgang Raffelsberger. Last updated 7 months ago.

21.8 match 4.44 score 33 scripts 4 dependents

tidyverse

modelr:Modelling Functions that Work with the Pipe

Functions for modelling that help you seamlessly integrate modelling into a pipeline of data manipulation and visualisation.

Maintained by Hadley Wickham. Last updated 1 year ago.

modelling

5.8 match 401 stars 16.44 score 6.9k scripts 1.0k dependents

integrated-inferences

CausalQueries:Make, Update, and Query Binary Causal Models

Users can declare causal models over binary nodes, update beliefs about causal types given data, and calculate arbitrary queries. Updating is implemented in 'stan'. See Humphreys and Jacobs (2023), Integrated Inferences <doi:10.1017/9781316718636>, and Pearl (2009), Causality <doi:10.1017/CBO9780511803161>.

Maintained by Till Tietz. Last updated 23 days ago.

bayes causal dags mixedmethods stan cpp

10.5 match 27 stars 9.03 score 54 scripts

dzmitrygb

Repliscope:Replication Timing Profiling using DNA Copy Number

Create, plot and compare replication timing profiles. The method is described in Muller et al. (2014) <doi:10.1093/nar/gkt878>.

Maintained by Dzmitry G Batrakou. Last updated 3 years ago.

30.2 match 3.13 score 27 scripts

ijaljuli

metarep:Replicability-Analysis Tools for Meta-Analysis

User-friendly package for reporting replicability-analysis methods alongside a meta-analysis summary. The replicability-analysis output provides an assessment of the investigated intervention, quantifying effect replicability and assessing the consistency of findings. Replicability analysis for fixed-effects and random-effects meta-analysis provides:
- the r(u)-value;
- lower bounds on the number of studies with replicated positive and/or negative effects;
- detection of inconsistent signals;
- forest plots with a summary of the replicability-analysis results;
- replicability analysis with or without the common-effect assumption.

Maintained by Iman Jaljuli. Last updated 1 year ago.

21.0 match 5 stars 4.40 score 4 scripts

mayamathur

Replicate:Statistical Metrics for Multisite Replication Studies

For a multisite replication project, computes the consistency metric P_orig, which is the probability that the original study would observe an estimated effect size as extreme or more extreme than it actually did, if in fact the original study were statistically consistent with the replications. Other recommended metrics are: (1) the probability of a true effect of scientifically meaningful size in the same direction as the estimate of the original study; and (2) the probability of a true effect of meaningful size in the direction opposite the original study's estimate. These two can be computed using 'MetaUtility::prop_stronger()'. Additionally computes older metrics used in replication projects (namely expected agreement in "statistical significance" between an original study and replication studies as well as prediction intervals for the replication estimates). See Mathur and VanderWeele (under review; <https://osf.io/apnjk/>) for details.

Maintained by Maya B. Mathur. Last updated 5 years ago.

59.3 match 1.53 score 17 scripts

bioc

BindingSiteFinder:Binding site definition based on iCLIP data

Precise knowledge of the binding sites of an RNA-binding protein (RBP) is key to understanding (post-)transcriptional regulatory processes. Here we present a workflow that describes how exact binding sites can be defined from iCLIP data. The package provides functions for binding site definition and result visualization. For details please see the vignette.

Maintained by Mirko Brüggemann. Last updated 13 hours ago.

sequencing geneexpression generegulation functionalgenomics coverage dataimport binding-site-classification binding-sites bioconductor-package iclip rna-binding-proteins

15.6 match 6 stars 5.73 score 3 scripts

bioc

baySeq:Empirical Bayesian analysis of patterns of differential expression in count data

This package identifies differential expression in high-throughput 'count' data, such as that derived from next-generation sequencing machines, calculating estimated posterior likelihoods of differential expression (or more complex hypotheses) via empirical Bayesian methods.

Maintained by Samuel Granjeaud. Last updated 5 months ago.

sequencing differentialexpression multiplecomparison sage bayesian coverage

11.0 match 7.75 score 79 scripts 3 dependents

rsetienne

DAISIE:Dynamical Assembly of Islands by Speciation, Immigration and Extinction

Simulates and computes the (maximum) likelihood of a dynamical model of island biota assembly through speciation, immigration and extinction. See Valente et al. (2015) <doi:10.1111/ele.12461>.

Maintained by Rampal S. Etienne. Last updated 1 month ago.

fortran cpp

9.8 match 9 stars 8.59 score 55 scripts 1 dependent

dmmelamed

rioplot:Turn a Regression Model Inside Out

Turns regression models inside out. Functions decompose variances and coefficients for various regression model types. Functions also visualize regression model objects using techniques developed in Schoon, Melamed, and Breiger (2024) <doi:10.1017/9781108887205>.

Maintained by David Melamed. Last updated 4 months ago.

20.5 match 4.08 score 9 scripts

pythonhealthdatascience

treat.sim:Nelson's Treatment Centre Simulation in Simmer

A discrete-event simulation of a simple urgent care treatment centre from Nelson (2013), implemented in R Simmer. The model is packaged to allow for easy experimentation, summary of results, and implementation in other software such as a Shiny interface.

Maintained by Thomas Monks. Last updated 8 months ago.

computer-simulation discrete-event-simulation health open-modelling open-science open-source r-language reproducible-research simmer

18.3 match 2 stars 4.48 score 5 scripts

bioc

idr2d:Irreproducible Discovery Rate for Genomic Interactions Data

A tool to measure reproducibility between genomic experiments that produce two-dimensional peaks (interactions between peaks), such as ChIA-PET, HiChIP, and HiC. idr2d is an extension of the original idr package, which is intended for (one-dimensional) ChIP-seq peaks.

Maintained by Konstantin Krismer. Last updated 5 months ago.

dna3dstructure generegulation peakdetection epigenetics functionalgenomics classification hic

19.0 match 4.30 score 6 scripts

joachim-gassen

ExPanDaR:Explore Your Data Interactively

Provides a shiny-based front end (the 'ExPanD' app) and a set of functions for exploratory data analysis. Run as a web-based app, 'ExPanD' enables users to assess the robustness of empirical evidence without providing them access to the underlying data. You can export a notebook containing the analysis of 'ExPanD' and/or use the functions of the package to support your exploratory data analysis workflow. Refer to the vignettes of the package for more information on how to use 'ExPanD' and/or the functions of this package.

Maintained by Joachim Gassen. Last updated 4 years ago.

accounting eda exploratory-data-analysis finance open-science replication shiny shiny-apps

10.0 match 156 stars 7.80 score 203 scripts

calvagone

campsis:Generic PK/PD Simulation Platform CAMPSIS

A generic, easy-to-use and intuitive pharmacokinetic/pharmacodynamic (PK/PD) simulation platform based on R packages 'rxode2' and 'mrgsolve'. CAMPSIS provides an abstraction layer over the underlying processes of writing a PK/PD model, assembling a custom dataset and running a simulation. CAMPSIS has a strong dependency on the R package 'campsismod', which allows the user to read/write a model from/to files and adapt it further on the fly in the R environment. Package 'campsis' allows the user to assemble a dataset in an intuitive manner. Once the user's dataset is ready, the package is in charge of preparing the simulation, calling 'rxode2' or 'mrgsolve' (at the user's choice) and returning the results, for the given model, dataset and desired simulation settings.

Maintained by Nicolas Luyckx. Last updated 1 month ago.

10.0 match 8 stars 7.52 score 93 scripts

becarioprecario

DCluster:Functions for the Detection of Spatial Clusters of Diseases

A set of functions for the detection of spatial clusters of disease using count data. Bootstrap is used to estimate sampling distributions of statistics.

Maintained by Virgilio Gómez-Rubio. Last updated 1 year ago.

15.9 match 4.47 score 99 scripts 1 dependent

rcalinjageman

esci:Estimation Statistics with Confidence Intervals

A collection of functions and 'jamovi' module for the estimation approach to inferential statistics, the approach which emphasizes effect sizes, interval estimates, and meta-analysis. Nearly all functions are based on 'statpsych' and 'metafor'. This package is still under active development, and breaking changes are likely, especially with the plot and hypothesis test functions. Data sets are included for all examples from Cumming & Calin-Jageman (2024) <ISBN:9780367531508>.

Maintained by Robert Calin-Jageman. Last updated 23 days ago.

jamovi jasp science statistics visualization

13.1 match 22 stars 5.42 score 12 scripts

weirichs

eatRep:Educational Assessment Tools for Replication Methods

Replication methods to compute basic statistical operations (means, standard deviations, frequency tables, percentiles, mean comparisons using weighted effect coding, generalized linear models, and linear multilevel models) in complex survey designs comprising multiply imputed or nested imputed variables and/or a clustered sampling structure, both of which require special procedures, at least for estimating standard errors. See the package documentation for a more detailed description along with references.

Maintained by Sebastian Weirich. Last updated 18 days ago.

13.5 match 1 star 5.16 score 13 scripts

bioc

DESeq2:Differential gene expression analysis based on the negative binomial distribution

Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.

Maintained by Michael Love. Last updated 12 days ago.

sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp

4.3 match 375 stars 16.11 score 17k scripts 115 dependents

henrikbengtsson

R.utils:Various Programming Utilities

Utility functions useful when programming and developing R packages.

Maintained by Henrik Bengtsson. Last updated 1 year ago.

4.9 match 63 stars 13.74 score 5.7k scripts 814 dependents

mboeck11

BGVAR:Bayesian Global Vector Autoregressions

Estimation of Bayesian Global Vector Autoregressions (BGVAR) with different prior setups and the possibility to introduce stochastic volatility. Built-in priors include the Minnesota, the stochastic search variable selection and Normal-Gamma (NG) prior. For a reference see also Crespo Cuaresma, J., Feldkircher, M. and F. Huber (2016) "Forecasting with Global Vector Autoregressive Models: a Bayesian Approach", Journal of Applied Econometrics, Vol. 31(7), pp. 1371-1391 <doi:10.1002/jae.2504>. Post-processing functions allow for doing predictions, structurally identify the model with short-run or sign-restrictions and compute impulse response functions, historical decompositions and forecast error variance decompositions. Plotting functions are also available. The package has a companion paper: Boeck, M., Feldkircher, M. and F. Huber (2022) "BGVAR: Bayesian Global Vector Autoregressions with Shrinkage Priors in R", Journal of Statistical Software, Vol. 104(9), pp. 1-28 <doi:10.18637/jss.v104.i09>.

Maintained by Maximilian Boeck. Last updated 3 months ago.

openblas cpp

8.6 match 27 stars 7.58 score 156 scripts

singmann

afex:Analysis of Factorial Experiments

Convenience functions for analyzing factorial experiments using ANOVA or mixed models. aov_ez(), aov_car(), and aov_4() allow specification of between, within (i.e., repeated-measures), or mixed (i.e., split-plot) ANOVAs for data in long format (i.e., one observation per row), automatically aggregating multiple observations per individual and cell of the design. mixed() fits mixed models using lme4::lmer() and computes p-values for all fixed effects using either Kenward-Roger or Satterthwaite approximation for degrees of freedom (LMM only), parametric bootstrap (LMMs and GLMMs), or likelihood ratio tests (LMMs and GLMMs). afex_plot() provides a high-level interface for interaction or one-way plots using ggplot2, combining raw data and model estimates. afex uses type 3 sums of squares as default (imitating commercial statistical software).

Maintained by Henrik Singmann. Last updated 7 months ago.

4.5 match 123 stars 14.50 score 1.4k scripts 15 dependents

bioc

fishpond:Fishpond: downstream methods and tools for expression data

Fishpond contains methods for differential transcript and gene expression analysis of RNA-seq data using inferential replicates for uncertainty of abundance quantification, as generated by Gibbs sampling or bootstrap sampling. The package also contains a number of utilities for working with Salmon and Alevin quantification files.

Maintained by Michael Love. Last updated 5 months ago.

sequencing rnaseq geneexpression transcription normalization regression multiplecomparison batcheffect visualization differentialexpression differentialsplicing alternativesplicing singlecell bioconductor gene-expression genomics salmon scrnaseq statistics transcriptomics

7.7 match 28 stars 7.83 score 150 scripts

briencj

dae:Functions Useful in the Design and ANOVA of Experiments

The content falls into the following groupings: (i) Data, (ii) Factor manipulation functions, (iii) Design functions, (iv) ANOVA functions, (v) Matrix functions, (vi) Projector and canonical efficiency functions, and (vii) Miscellaneous functions. A vignette called 'DesignNotes' describes how to use the design functions for randomizing and assessing designs. The ANOVA functions facilitate the extraction of information when the 'Error' function has been used in the call to 'aov'. The package 'dae' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 4 months ago.

7.0 match 1 star 8.62 score 356 scripts 7 dependents

fbartos

zcurve:An Implementation of Z-Curves

An implementation of z-curves - a method for estimating expected discovery and replicability rates on the basis of test statistics of published studies. The package provides functions for fitting the new density and EM version (Bartoš & Schimmack, 2020, <doi:10.31234/osf.io/urgtn>), censored observations, as well as the original density z-curve (Brunner & Schimmack, 2020, <doi:10.15626/MP.2018.874>). Furthermore, the package provides summarizing and plotting functions for the fitted z-curve objects. See the aforementioned articles for more information about the z-curves, expected discovery and replicability rates, validation studies, and limitations.

Maintained by František Bartoš. Last updated 10 months ago.

edr err replicability z-cruve cpp

10.8 match 12 stars 5.48 score 21 scripts 1 dependent

statistikat

surveysd:Survey Standard Error Estimation for Cumulated Estimates and their Differences in Complex Panel Designs

Calculate point estimates and their standard errors in complex household surveys using bootstrap replicates. Bootstrapping considers survey design with a rotating panel. A comprehensive description of the methodology can be found under <https://statistikat.github.io/surveysd/articles/methodology.html>.

Maintained by Johannes Gussenbauer. Last updated 4 months ago.

bootstrap error-estimation survey cpp

8.4 match 9 stars 6.86 score 67 scripts

spatstat

spatstat:Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests

Comprehensive open-source toolbox for analysing Spatial Point Patterns. Focused mainly on two-dimensional point patterns, including multitype/marked points, in any spatial region. Also supports three-dimensional point patterns, space-time point patterns in any number of dimensions, point patterns on a linear network, and patterns of other geometrical objects. Supports spatial covariate data such as pixel images. Contains over 3000 functions for plotting spatial data, exploratory data analysis, model-fitting, simulation, spatial sampling, model diagnostics, and formal inference. Data types include point patterns, line segment patterns, spatial windows, pixel images, tessellations, and linear networks. Exploratory methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the functions ppm(), kppm(), slrm(), dppm() similar to glm(). Types of models include Poisson, Gibbs and Cox point processes, Neyman-Scott cluster processes, and determinantal point processes. Models may involve dependence on covariates, inter-point interaction, cluster formation and dependence on marks. Models are fitted by maximum likelihood, logistic regression, minimum contrast, and composite likelihood methods. A model can be fitted to a list of point patterns (replicated point pattern data) using the function mppm(). 
The model can include random effects and fixed effects depending on the experimental design, in addition to all the features listed above. Fitted point process models can be simulated, automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr). Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots.

Maintained by Adrian Baddeley. Last updated 2 months ago.

cluster-process cox-point-process gibbs-process kernel-density network-analysis point-process poisson-process spatial-analysis spatial-data spatial-data-analysis spatial-statistics spatstat statistical-methods statistical-models statistical-tests statistics

3.5 match 200 stars 16.32 score 5.5k scripts 41 dependents

bioc

compcodeR:RNAseq data simulation, differential expression analysis and performance comparison of differential expression methods

This package provides extensive functionality for comparing results obtained by different methods for differential expression analysis of RNAseq data. It also contains functions for simulating count data. Finally, it provides convenient interfaces to several packages for performing the differential expression analysis. These can also be used as templates for setting up and running a user-defined differential analysis workflow within the framework of the package.

Maintained by Charlotte Soneson. Last updated 3 months ago.

immunooncology rnaseq differentialexpression

6.8 match 11 stars 8.06 score 26 scripts

tmatta

lsasim:Functions to Facilitate the Simulation of Large Scale Assessment Data

Provides functions to simulate data from large-scale educational assessments, including background questionnaire data and cognitive item responses that adhere to a multiple-matrix sampled design. The theoretical foundation can be found in Matta, T.H., Rutkowski, L., Rutkowski, D. et al. (2018) <doi:10.1186/s40536-018-0068-8>.

Maintained by Waldir Leoncio. Last updated 2 months ago.

8.4 match 6 stars 6.41 score 18 scripts

prpatil

scifigure:Visualize 'Reproducibility' and 'Replicability' in a Comparison of Scientific Studies

Users may specify what fundamental qualities of a new study have or have not changed in an attempt to reproduce or replicate an original study. A comparison of the differences is visualized. Visualization approach follows 'Patil', 'Peng', and 'Leek' (2016) <doi:10.1101/066803>.

Maintained by Prasad Patil. Last updated 5 years ago.

10.1 match 5 stars 5.30 score 16 scripts

bioc

limma:Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 6 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

3.9 match 13.81 score 16k scripts 585 dependents

insightsengineering

chevron:Standard TLGs for Clinical Trials Reporting

Provides standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. It also provides comprehensive data checks and script generation functionality.

Maintained by Joe Zhu. Last updated 25 days ago.

clinical-trials graphs listings nest reporting tables

6.2 match 12 stars 8.24 score 12 scripts

bioc

SparseSignatures:SparseSignatures

Point mutations occurring in a genome can be divided into 96 categories based on the base being mutated, the base it is mutated into and its two flanking bases. Therefore, for any patient, it is possible to represent all the point mutations occurring in that patient's tumor as a vector of length 96, where each element represents the count of mutations for a given category in the patient. A mutational signature represents the pattern of mutations produced by a mutagen or mutagenic process inside the cell. Each signature can also be represented by a vector of length 96, where each element represents the probability that this particular mutagenic process generates a mutation of the corresponding category among the 96 mentioned above. In this R package, we provide a set of functions to extract and visualize the mutational signatures that best explain the mutation counts of a large number of patients.
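The count of 96 categories can be reproduced with a short enumeration. The sketch below is illustrative only and is not taken from SparseSignatures: it assumes the common convention of reporting each substitution with a pyrimidine (C or T) as the mutated base, so that both DNA strands collapse to 6 substitution types, each combined with 4 x 4 flanking bases. The label format `A[C>T]G` and the helper names are hypothetical.

```python
from itertools import product

BASES = "ACGT"

# Assumption: substitutions are written with a pyrimidine (C or T) as the
# mutated base, giving 2 reference bases x 3 alternatives = 6 types.
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]


def mutation_categories():
    """Return the 96 context labels, e.g. 'A[C>T]G' (hypothetical format)."""
    return [
        f"{left}[{ref}>{alt}]{right}"
        for (ref, alt), left, right in product(SUBSTITUTIONS, BASES, BASES)
    ]


def count_vector(observed):
    """Turn a list of category labels into a length-96 count vector."""
    index = {label: i for i, label in enumerate(mutation_categories())}
    counts = [0] * len(index)
    for label in observed:
        counts[index[label]] += 1
    return counts


categories = mutation_categories()
# Made-up mutations for one patient, purely for illustration.
patient = count_vector(["A[C>T]G", "A[C>T]G", "T[T>A]C"])
```

Here `patient` plays the role of the per-patient count vector described above; a signature would be a vector of the same length holding probabilities instead of counts.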

Maintained by Luca De Sano. Last updated 5 months ago.

biomedicalinformatics somaticmutation

8.0 match 11 stars 6.42 score 4 scripts

bioc

r3Cseq:Analysis of Chromosome Conformation Capture and Next-generation Sequencing (3C-seq)

This package is used for the analysis of long-range chromatin interactions from 3C-seq assays.

Maintained by Supat Thongjuea. Last updated 5 months ago.

preprocessing sequencing

10.3 match 3 stars 4.85 score 17 scripts

bioc

edgeR:Empirical Analysis of Digital Gene Expression Data in R

Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.

Maintained by Yunshun Chen. Last updated 6 days ago.

alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas

3.7 match 13.40 score 17k scripts 255 dependents

conjugateprior

cbn:Tools and replication materials for Caliskan, Bryson, and Narayanan (2017)

This package allows users to replicate the analysis in the paper and also provides general purpose tools for working with a large word vector file and comparing groups of words with permutation statistics from the original paper. Alternative bootstrapped versions with confidence intervals are also available.

Maintained by Will Lowe. Last updated 6 years ago.

cpp

13.9 match 2 stars 3.48 score 6 scripts

dnychka

fields:Tools for Spatial Data

For curve, surface and function fitting with an emphasis on splines, spatial data, geostatistics, and spatial statistics. The major methods include cubic and thin plate splines, Kriging, and compactly supported covariance functions for large data sets. The splines and Kriging methods are supported by functions that can determine the smoothing parameter (nugget and sill variance) and other covariance function parameters by cross validation and also by restricted maximum likelihood. For Kriging there is an easy to use function that also estimates the correlation scale (range parameter). A major feature is that any covariance function implemented in R and following a simple format can be used for spatial prediction. There are also many useful functions for plotting and working with spatial data as images. This package also contains an implementation of sparse matrix methods for large spatial data sets and currently requires the sparse matrix (spam) package. Use help(fields) to get started and for an overview. The fields source code is deliberately commented and provides useful explanations of numerical details as a companion to the manual pages. The commented source code can be viewed by expanding the source code version and looking in the R subdirectory. The reference for fields can be generated by the citation function in R and has DOI <doi:10.5065/D6W957CT>. Development of this package was supported in part by the National Science Foundation Grant 1417857, the National Center for Atmospheric Research, and Colorado School of Mines. See the Fields URL for a vignette on using this package and some background on spatial statistics.

Maintained by Douglas Nychka. Last updated 9 months ago.

fortran

3.8 match 15 stars 12.60 score 7.7k scripts 295 dependents

lrberge

fixest:Fast Fixed-Effects Estimations

Fast and user-friendly estimation of econometric models with multiple fixed-effects. Includes ordinary least squares (OLS), generalized linear models (GLM) and the negative binomial. The core of the package is based on optimized parallel C++ code, scaling especially well for large data sets. The method to obtain the fixed-effects coefficients is based on Berge (2018) <https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf>. Further provides tools to export and view the results of several estimations with intuitive design to cluster the standard-errors.

Maintained by Laurent Berge. Last updated 7 months ago.

cpp openmp

3.3 match 387 stars 14.69 score 3.8k scripts 25 dependents

wjbraun

DAAG:Data Analysis and Graphics Data and Functions

Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.

Maintained by W. John Braun. Last updated 11 months ago.

5.6 match 8.25 score 1.2k scripts 1 dependent

bayesiandemography

bage:Bayesian Estimation and Forecasting of Age-Specific Rates

Fast Bayesian estimation and forecasting of age-specific rates, probabilities, and means, based on 'Template Model Builder'.

Maintained by John Bryant. Last updated 2 months ago.

cpp

6.3 match 3 stars 7.30 score 39 scripts

pdhoff

amen:Additive and Multiplicative Effects Models for Networks and Relational Data

Analysis of dyadic network and relational data using additive and multiplicative effects (AME) models. The basic model includes regression terms, the covariance structure of the social relations model (Warner, Kenny and Stoto (1979) <DOI:10.1037/0022-3514.37.10.1742>, Wong (1982) <DOI:10.2307/2287296>), and multiplicative factor models (Hoff (2009) <DOI:10.1007/s10588-008-9040-4>). Several different link functions accommodate different relational data structures, including binary/network data, normal relational data, zero-inflated positive outcomes using a tobit model, ordinal relational data and data from fixed-rank nomination schemes. Several of these link functions are discussed in Hoff, Fosdick, Volfovsky and Stovel (2013) <DOI:10.1017/nws.2013.17>. Development of this software was supported in part by NIH grant R01HD067509.

Maintained by Peter Hoff. Last updated 4 years ago.

6.7 match 28 stars 6.81 score 153 scripts

yufree

enviGCMS:GC/LC-MS Data Analysis for Environmental Science

Gas/Liquid Chromatography-Mass Spectrometer (GC/LC-MS) Data Analysis for Environmental Science. This package covers topics such as molecular isotope ratios, matrix effects, and Short-Chain Chlorinated Paraffins analysis in environmental analysis.

Maintained by Miao YU. Last updated 2 months ago.

environment mass-spectrometry metabolomics

7.1 match 17 stars 6.49 score 30 scripts 1 dependent

declaredesign

estimatr:Fast Estimators for Design-Based Inference

Fast procedures for a small set of commonly-used, design-appropriate estimators with robust standard errors and confidence intervals. Includes estimators for linear regression, instrumental variables regression, difference-in-means, Horvitz-Thompson estimation, and regression improving precision of experimental estimates by interacting treatment with centered pre-treatment covariates introduced by Lin (2013) <doi:10.1214/12-AOAS583>.

Maintained by Graeme Blair. Last updated 1 month ago.

cpp

3.8 match 133 stars 11.58 score 1.7k scripts 11 dependents

bioc

TOAST:Tools for the analysis of heterogeneous tissues

This package is devoted to analyzing high-throughput data (e.g. gene expression microarray, DNA methylation microarray, RNA-seq) from complex tissues. Current functionalities include: (1) detection of cell-type-specific or cross-cell-type differential signals; (2) tree-based differential analysis; (3) improved variable selection in reference-free deconvolution; (4) partial reference-free deconvolution with prior knowledge.

Maintained by Ziyi Li. Last updated 5 months ago.

dnamethylation geneexpression differentialexpression differentialmethylation microarray genetarget epigenetics methylationarray

5.4 match 11 stars 8.01 score 104 scripts 3 dependents

bioc

rifi:'rifi' analyses data from rifampicin time series created by microarray or RNAseq

'rifi' analyses data from rifampicin time series created by microarray or RNAseq. 'rifi' is a transcriptome data analysis tool for the holistic identification of transcription and decay associated processes. The decay constants and the delay of the onset of decay are fitted for each probe/bin. Subsequently, probes/bins of equal properties are combined into segments by dynamic programming, independent of an existing genome annotation. This allows the detection of transcript segments of different stability or transcriptional events within one annotated gene. In addition to the classic decay constant/half-life analysis, 'rifi' detects processing sites, transcription pausing sites, internal transcription start sites in operons, sites of partial transcription termination in operons, identifies areas of likely transcriptional interference by the collision mechanism and gives an estimate of the transcription velocity. All data are integrated to give an estimate of continuous transcriptional units, i.e. operons. Comprehensive output tables and visualizations of the full genome result and the individual fits for all probes/bins are produced.

Maintained by Jens Georg. Last updated 5 months ago.

rnaseq differentialexpression generegulation transcriptomics regression microarray software

9.4 match 4.60 score 1 script

bioc

HTSFilter:Filter replicated high-throughput transcriptome sequencing data

This package implements a filtering procedure for replicated transcriptome sequencing data based on a global Jaccard similarity index in order to identify genes with low, constant levels of expression across one or more experimental conditions.
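As a rough illustration of the similarity measure named above, the sketch below computes a plain Jaccard index between two replicates after thresholding their counts. It is not HTSFilter's implementation: the description does not spell out how the global index is assembled, and the counts, the threshold values, and the function names here are all made up for the example.

```python
def binarize(counts, threshold):
    """Mark each gene as expressed (1) if its count exceeds the threshold."""
    return [1 if c > threshold else 0 for c in counts]


def jaccard(a, b):
    """Jaccard index of two equal-length 0/1 vectors: |a AND b| / |a OR b|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two all-zero vectors are treated as identical.
    return both / either if either else 1.0


# Hypothetical read counts for six genes in two replicates of one condition.
rep1 = [0, 3, 120, 45, 1, 0]
rep2 = [1, 0, 98, 60, 2, 0]

loose = jaccard(binarize(rep1, 0), binarize(rep2, 0))
strict = jaccard(binarize(rep1, 10), binarize(rep2, 10))
```

In this toy data the replicates disagree on which low-count genes are expressed at the loose threshold but agree fully at the stricter one, which is the kind of contrast a similarity-based filter can exploit.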

Maintained by Andrea Rau. Last updated 5 months ago.

sequencing rnaseq preprocessing differentialexpression geneexpression normalization immunooncology

6.8 match 6.24 score 58 scripts 1 dependent

nandp1

gpbStat:Comprehensive Statistical Analysis of Plant Breeding Experiments

Performs statistical data analysis of various Plant Breeding experiments. Contains functions for Line by Tester analysis as per Arunachalam, V.(1974) <http://repository.ias.ac.in/89299/> and Diallel analysis as per Griffing, B. (1956) <https://www.publish.csiro.au/bi/pdf/BI9560463>.

Maintained by Nandan Patil. Last updated 4 months ago.

biometrics genetics plantbreeding

6.9 match 3 stars 6.08 score 27 scripts

bioc

mobileRNA:mobileRNA: Investigate the RNA mobilome & population-scale changes

Genomic analysis can be utilised to identify differences between RNA populations in two conditions, both in production and abundance. This includes the identification of RNAs produced by multiple genomes within a biological system. For example, RNA produced by pathogens within a host or mobile RNAs in plant graft systems. The mobileRNA package provides methods to pre-process, analyse and visualise the sRNA and mRNA populations based on the premise of mapping reads to all genotypes at the same time.

Maintained by Katie Jeynes-Cupper. Last updated 5 months ago.

visualization rnaseq sequencing smallrna genomeassembly clustering experimentaldesign qualitycontrol workflowstep alignment preprocessing bioinformatics plant-science

8.4 match 4 stars 5.00 score 2 scripts

rubensmoura87

MultiATSM:Multicountry Term Structure of Interest Rates Models

Estimation routines for several classes of affine term structure of interest rates models. All the models are based on the single-country unspanned macroeconomic risk framework from Joslin, Priebsch, and Singleton (2014, JF) <doi:10.1111/jofi.12131>. Multicountry extensions such as the ones of Jotikasthira, Le, and Lundblad (2015, JFE) <doi:10.1016/j.jfineco.2014.09.004>, Candelon and Moura (2023, EM) <doi:10.1016/j.econmod.2023.106453>, and Candelon and Moura (Forthcoming, JFEC) <doi:10.1093/jjfinec/nbae008> are also available.

Maintained by Rubens Moura. Last updated 6 days ago.

10.7 match 3.90 score 8 scripts

metrumresearchgroup

mrgsolve:Simulate from ODE-Based Models

Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.

Maintained by Kyle T Baron. Last updated 1 month ago.

mrgsolve ode openblas cpp

3.8 match 138 stars 10.90 score 1.2k scripts 3 dependents

bioc

puma:Propagating Uncertainty in Microarray Analysis (including Affymetrix traditional 3' arrays and exon arrays and Human Transcriptome Array 2.0)

Most analyses of Affymetrix GeneChip data (including traditional 3' arrays and exon arrays and Human Transcriptome Array 2.0) are based on point estimates of expression levels and ignore the uncertainty of such estimates. By propagating uncertainty to downstream analyses we can improve results from microarray analyses. For the first time, the puma package makes a suite of uncertainty propagation methods available to a general audience. In addition to calculating gene expression from Affymetrix 3' arrays, puma also provides methods to process exon arrays and produces gene and isoform expression for alternative splicing study. puma also offers improvements in terms of scope and speed of execution over previously available uncertainty propagation methods. Included are summarisation, differential expression detection, clustering and PCA methods, together with useful plotting functions.

Maintained by Xuejun Liu. Last updated 5 months ago.

microarray onechannel preprocessing differentialexpression clustering exonarray geneexpression mrnamicroarray chiponchip alternativesplicing differentialsplicing bayesian twochannel dataimport hta2.0

9.1 match 4.53 score 17 scripts

bioc

MetaNeighbor:Single cell replicability analysis

MetaNeighbor allows users to quantify cell type replicability across datasets using neighbor voting.

Maintained by Stephan Fischer. Last updated 5 months ago.

immunooncology geneexpression go multiplecomparison singlecell transcriptomics

7.0 match 5.89 score 78 scripts

arcaldwell49

SimplyAgree:Flexible and Robust Agreement and Reliability Analyses

Reliability and agreement analyses often have limited software support. Therefore, this package was created to make agreement and reliability analyses easier for the average researcher. The functions within this package include simple tests of agreement, agreement analysis for nested and replicate data, and provide robust analyses of reliability. In addition, this package contains a set of functions to help when planning studies looking to assess measurement agreement.

Maintained by Aaron Caldwell. Last updated 20 days ago.

6.2 match 10 stars 6.61 score 41 scripts

ropensci

tarchetypes:Archetypes for Targets

Function-oriented Make-like declarative pipelines for Statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'targets' R package by Will Landau (2018) <doi:10.21105/joss.00550>.

Maintained by William Michael Landau. Last updated 21 days ago.

data-science high-performance-computing peer-reviewed pipeline r-targetopia reproducibility targets workflow

3.6 match 141 stars 11.43 score 1.7k scripts 10 dependents

iqss

WhatIf:Software for Evaluating Counterfactuals

Inferences about counterfactuals are essential for prediction, answering what if questions, and estimating causal effects. However, when the counterfactuals posed are too far from the data at hand, conclusions drawn from well-specified statistical analyses become based largely on speculation hidden in convenient modeling assumptions that few would be willing to defend. Unfortunately, standard statistical approaches assume the veracity of the model rather than revealing the degree of model-dependence, which makes this problem hard to detect. WhatIf offers easy-to-apply methods to evaluate counterfactuals that do not require sensitivity testing over specified classes of models. If an analysis fails the tests offered here, then we know that substantive inferences will be sensitive to at least some modeling choices that are not based on empirical evidence, no matter what method of inference one chooses to use. WhatIf implements the methods for evaluating counterfactuals discussed in Gary King and Langche Zeng, 2006, "The Dangers of Extreme Counterfactuals," Political Analysis 14 (2) <DOI:10.1093/pan/mpj004>; and Gary King and Langche Zeng, 2007, "When Can History Be Our Guide? The Pitfalls of Counterfactual Inference," International Studies Quarterly 51 (March) <DOI:10.1111/j.1468-2478.2007.00445.x>.

Maintained by Soubhik Barari. Last updated 2 years ago.

7.0 match 18 stars 5.80 score 14 scripts

stan-dev

rstanarm:Bayesian Applied Regression Modeling via Stan

Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.

Maintained by Ben Goodrich. Last updated 9 months ago.

bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics multilevel-models rstan rstanarm stan statistical-modeling cpp

2.6 match 393 stars 15.68 score 5.0k scripts 13 dependents

bioc

musicatk:Mutational Signature Comprehensive Analysis Toolkit

Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic features associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.

Maintained by Joshua D. Campbell. Last updated 5 months ago.

software biologicalquestion somaticmutation variantannotation

5.8 match 13 stars 7.02 score 20 scripts

r-simmer

simmer:Discrete-Event Simulation for R

A process-oriented and trajectory-based Discrete-Event Simulation (DES) package for R. It is designed as a generic yet powerful framework. The architecture encloses a robust and fast simulation core written in 'C++' with automatic monitoring capabilities. It provides a rich and flexible R API that revolves around the concept of trajectory, a common path in the simulation model for entities of the same type. Documentation about 'simmer' is provided by several vignettes included in this package, via the paper by Ucar, Smeets & Azcorra (2019, <doi:10.18637/jss.v090.i02>), and the paper by Ucar, Hernández, Serrano & Azcorra (2018, <doi:10.1109/MCOM.2018.1700960>); see 'citation("simmer")' for details.

Maintained by Iñaki Ucar. Last updated 6 months ago.

discrete-event simulation cpp

3.5 match 223 stars 11.47 score 440 scripts 6 dependents

aiparragirre

svyVarSel:Variable Selection for Complex Survey Data

Fit design-based linear and logistic elastic nets with complex survey data considering the sampling design when defining training and test sets using replicate weights. Methods implemented in this package are described in: A. Iparragirre, T. Lumley, I. Barrio, I. Arostegui (2024) <doi:10.1002/sta4.578>.

Maintained by Amaia Iparragirre. Last updated 5 months ago.

complex-survey-data elastic-nets lasso replicate-weights variable-selection

12.5 match 3.18 score 1 dependent

rspatial

terra:Spatial Data Analysis

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

Maintained by Robert J. Hijmans. Last updated 4 hours ago.

geospatial raster spatial vector onetbb proj gdal geos cpp

2.3 match 559 stars 17.64 score 17k scripts 851 dependents

kwstat

agridat:Agricultural Datasets

Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.

Maintained by Kevin Wright. Last updated 28 days ago.

data

3.6 match 125 stars 11.02 score 1.7k scripts 2 dependents

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 5 hours ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

1.9 match 582 stars 21.11 score 31k scripts 1.9k dependents

barakbri

repfdr:Replicability Analysis for Multiple Studies of High Dimension

Estimation of Bayes and local Bayes false discovery rates for replicability analysis (Heller & Yekutieli, 2014 <doi:10.1214/13-AOAS697>; Heller et al., 2015 <doi:10.1093/bioinformatics/btu434>).

Maintained by Ruth Heller. Last updated 7 years ago.

cpp

7.9 match 3 stars 4.98 score 16 scripts

tidymodels

broom:Convert Statistical Objects into Tidy Tibbles

Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once. Broom provides three verbs that each provide different types of information about a model. tidy() summarizes information about model components such as coefficients of a regression. glance() reports information about an entire model, such as goodness of fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.

Maintained by Simon Couch. Last updated 4 months ago.

modeling tidy-data

1.8 match 1.5k stars 21.56 score 37k scripts 1.4k dependents

johanngb

ruv:Detect and Remove Unwanted Variation using Negative Controls

Implements the 'RUV' (Remove Unwanted Variation) algorithms. These algorithms attempt to adjust for systematic errors of unknown origin in high-dimensional data. The algorithms were originally developed for use with genomic data, especially microarray data, but may be useful with other types of high-dimensional data as well. These algorithms were proposed in Gagnon-Bartsch and Speed (2012) <doi:10.1093/nar/gkz433>, Gagnon-Bartsch, Jacob and Speed (2013), and Molania et al. (2019) <doi:10.1093/nar/gkz433>. The algorithms require the user to specify a set of negative control variables, as described in the references. The algorithms included in this package are 'RUV-2', 'RUV-4', 'RUV-inv', 'RUV-rinv', 'RUV-I', and 'RUV-III', along with various supporting algorithms.

Maintained by Johann Gagnon-Bartsch. Last updated 6 years ago.

8.8 match 2 stars 4.36 score 94 scripts 7 dependents

oobianom

quickcode:Quick and Essential 'R' Tricks for Better Scripts

The NOT functions, 'R' tricks and a compilation of some simple quick plus often used 'R' codes to improve your scripts. Improve the quality and reproducibility of 'R' scripts.

Maintained by Obinna Obianom. Last updated 14 days ago.

colors data distributions images

4.8 match 5 stars 7.76 score 7 scripts 6 dependents

bioc

RESOLVE:RESOLVE: An R package for the efficient analysis of mutational signatures from cancer genomes

Cancer is a genetic disease caused by somatic mutations in genes controlling key biological functions such as cellular growth and division. Such mutations may arise both through cell-intrinsic and exogenous processes, generating characteristic mutational patterns over the genome named mutational signatures. The study of mutational signatures has become a standard component of modern genomics studies, since it can reveal which (environmental and endogenous) mutagenic processes are active in a tumor, and may highlight markers for therapeutic response. The computational analysis of mutational signatures presents many pitfalls. First, the task of determining the number of signatures is very complex and depends on heuristics. Second, several signatures have no clear etiology, raising the possibility that they are computational artifacts rather than the result of mutagenic processes. Last, approaches for signatures assignment are greatly influenced by the set of signatures used for the analysis. To overcome these limitations, we developed RESOLVE (Robust EStimation Of mutationaL signatures Via rEgularization), a framework that allows the efficient extraction and assignment of mutational signatures. RESOLVE implements a novel algorithm that enables (i) the efficient extraction, (ii) exposure estimation, and (iii) confidence assessment during the computational inference of mutational signatures.

Maintained by Luca De Sano. Last updated 5 months ago.

biomedicalinformatics somaticmutation

8.0 match 1 stars 4.60 score 3 scripts

bioc

scMerge:scMerge: Merging multiple batches of scRNA-seq data

Like all gene expression data, single-cell data suffers from batch effects and other unwanted variation that makes accurate biological interpretation difficult. The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-) replicates to remove unwanted variation and merge multiple single-cell datasets. This package contains all the necessary functions in the scMerge pipeline, including the identification of SEGs, replication-identification methods, and merging of single-cell data.

Maintained by Yingxin Lin. Last updated 5 months ago.

batcheffect geneexpression normalization rnaseq sequencing singlecell software transcriptomics bioinformatics single-cell

3.8 match 67 stars 9.52 score 137 scripts 1 dependents

bioc

synergyfinder:Calculate and Visualize Synergy Scores for Drug Combinations

Efficient implementations for analyzing pre-clinical multiple drug combination datasets. It provides efficient implementations for 1. the popular synergy scoring models, including HSA, Loewe, Bliss, and ZIP to quantify the degree of drug combination synergy; 2. higher order drug combination data analysis and synergy landscape visualization for an unlimited number of drugs in a combination; 3. statistical analysis of drug combination synergy and sensitivity with confidence intervals and p-values; 4. synergy barometer for harmonizing multiple synergy scoring methods to provide a consensus metric of synergy; 5. evaluation of synergy and sensitivity simultaneously to provide an unbiased interpretation of the clinical potential of the drug combinations. Based on this package, we also provide a web application (http://www.synergyfinder.org) for users who prefer a graphical user interface.

Maintained by Shuyu Zheng. Last updated 5 months ago.

software statisticalmethod

6.5 match 5.42 score 44 scripts

amrei-stammann

alpaca:Fit GLM's with High-Dimensional k-Way Fixed Effects

Provides a routine to partial out factors with many levels during the optimization of the log-likelihood function of the corresponding generalized linear model (glm). The package is based on the algorithm described in Stammann (2018) <arXiv:1707.01815> and is restricted to glm's that are nonlinear and based on maximum likelihood estimation. It also offers an efficient algorithm to recover estimates of the fixed effects in a post-estimation routine and includes robust and multi-way clustered standard errors. Further, the package provides analytical bias corrections for binary choice models derived by Fernandez-Val and Weidner (2016) <doi:10.1016/j.jeconom.2015.12.014> and Hinz, Stammann, and Wanner (2020) <arXiv:2004.12655>.

Maintained by Amrei Stammann. Last updated 6 months ago.

openblas cpp

5.0 match 45 stars 7.01 score 105 scripts

bioc

tximport:Import and summarize transcript-level estimates for transcript- and gene-level analysis

Imports transcript-level abundance, estimated counts and transcript lengths, and summarizes into matrices for use with downstream gene-level analysis packages. Average transcript length, weighted by sample-specific transcript abundance estimates, is provided as a matrix which can be used as an offset for differential expression of gene-level counts.

Maintained by Michael Love. Last updated 5 months ago.

dataimport preprocessing rnaseq transcriptomics transcription geneexpression immunooncology bioconductor deseq2

2.7 match 137 stars 12.95 score 2.6k scripts 11 dependents

bioc

CexoR:An R package to uncover high-resolution protein-DNA interactions in ChIP-exo replicates

Strand-specific peak-pair calling in ChIP-exo replicates. The cumulative Skellam distribution function is used to detect significant normalised count differences of opposed sign at each DNA strand (peak-pairs). Then, the irreproducible discovery rate for overlapping peak-pairs across biological replicates is computed.

Maintained by Pedro Madrigal. Last updated 5 months ago.

functionalgenomics sequencing coverage chipseq peakdetection

8.6 match 4.00 score 1 scripts

mirzaghaderi

rtpcr:qPCR Data Analysis

Various methods are employed for statistical analysis and graphical presentation of real-time PCR (quantitative PCR or qPCR) data. 'rtpcr' handles amplification efficiency calculation, statistical analysis and graphical representation of real-time PCR data based on up to two reference genes. By accounting for amplification efficiency values, 'rtpcr' was developed using a general calculation method described by Ganger et al. (2017) <doi:10.1186/s12859-017-1949-5> and Taylor et al. (2019) <doi:10.1016/j.tibtech.2018.12.002>, covering both the Livak and Pfaffl methods. Based on the experimental conditions, the functions of the 'rtpcr' package use t-test (for experiments with a two-level factor), analysis of variance (ANOVA), analysis of covariance (ANCOVA) or analysis of repeated measure data to calculate the fold change (FC, Delta Delta Ct method) or relative expression (RE, Delta Ct method). The functions further provide standard errors and confidence intervals for means, apply statistical mean comparisons and present significance. To facilitate function application, different data sets were used as examples and the outputs were explained. The 'rtpcr' package also provides bar plots using various controlling arguments. The 'rtpcr' package is user-friendly and easy to work with and provides an applicable resource for analyzing real-time PCR data.

Maintained by Ghader Mirzaghaderi. Last updated 26 days ago.

data-analysis qpcr

7.0 match 1 stars 4.88 score 3 scripts

samch93

BayesRep:Bayesian Analysis of Replication Studies

Provides tools for the analysis of replication studies using Bayes factors (Pawel and Held, 2022) <doi:10.1111/rssb.12491>.

Maintained by Samuel Pawel. Last updated 1 year ago.

12.6 match 2.70 score 5 scripts

rmi-pacta

pacta.loanbook:Easily Install and Load PACTA for Banks Packages

PACTA (Paris Agreement Capital Transition Assessment) for Banks is a tool that allows banks to calculate the climate alignment of their corporate lending portfolios. This package is designed to make it easy to install and load multiple PACTA for Banks packages in a single step. It also provides thorough documentation - the PACTA for Banks cookbook at <https://rmi-pacta.github.io/pacta.loanbook/articles/cookbook_overview.html> - on how to run a PACTA for Banks analysis. This covers prerequisites for the analysis, the separate steps of running the analysis, the interpretation of PACTA for Banks results, and advanced use cases.

Maintained by Jacob Kastl. Last updated 3 days ago.

7.2 match 1 stars 4.68 score 12 scripts

bnowok

synthpop:Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control

A tool for producing synthetic versions of microdata containing confidential information so that they are safe to be released to users for exploratory analysis. The key objective of generating synthetic data is to replace sensitive original values with synthetic ones causing minimal distortion of the statistical information contained in the data set. Variables, which can be categorical or continuous, are synthesised one-by-one using sequential modelling. Replacements are generated by drawing from conditional distributions fitted to the original data using parametric or classification and regression trees models. Data are synthesised via the function syn() which can be largely automated, if default settings are used, or with methods defined by the user. Optional parameters can be used to influence the disclosure risk and the analytical quality of the synthesised data. For a description of the implemented method see Nowok, Raab and Dibben (2016) <doi:10.18637/jss.v074.i11>.

Maintained by Beata Nowok. Last updated 3 years ago.

4.3 match 44 stars 7.85 score 536 scripts

tdjorgensen

simsem:SIMulated Structural Equation Modeling

Provides an easy framework for Monte Carlo simulation in structural equation modeling, which can be used for various purposes, such as model fit evaluation, power analysis, or missing data handling and planning.

Maintained by Terrence D. Jorgensen. Last updated 4 years ago.

9.7 match 3.40 score 276 scripts

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data is stored in csv format. The package also allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 3 days ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

2.5 match 109 stars 13.20 score 342 scripts 3 dependents

r-forge

Matrix:Sparse and Dense Matrix Classes and Methods

A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.

Maintained by Martin Maechler. Last updated 7 days ago.

openblas

1.9 match 1 stars 17.23 score 33k scripts 12k dependents

sooahnshin

aihuman:Experimental Evaluation of Algorithm-Assisted Human Decision-Making

Provides statistical methods for analyzing experimental evaluation of the causal impacts of algorithmic recommendations on human decisions developed by Imai, Jiang, Greiner, Halen, and Shin (2023) <doi:10.1093/jrsssa/qnad010> and Ben-Michael, Greiner, Huang, Imai, Jiang, and Shin (2024) <doi:10.48550/arXiv.2403.12108>. The data used for this paper, and made available here, are interim, based on only half of the observations in the study and (for those observations) only half of the study follow-up period. We use them only to illustrate methods, not to draw substantive conclusions.

Maintained by Sooahn Shin. Last updated 3 months ago.

openblas cpp openmp

7.0 match 2 stars 4.60 score 8 scripts

bioc

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

4.4 match 7.27 score 251 scripts 1 dependents

bioc

NOISeq:Exploratory analysis and differential expression for RNA-seq data

Analysis of RNA-seq expression data or other similar kinds of data. Exploratory plots to evaluate saturation, count distribution, expression per chromosome, type of detected features, features length, etc. Differential expression between two experimental conditions with no parametric assumptions.

Maintained by Sonia Tarazona. Last updated 5 months ago.

immunooncology rnaseq differentialexpression visualization sequencing

4.8 match 6.70 score 207 scripts 4 dependents

martin3141

spant:MR Spectroscopy Analysis Tools

Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.

Maintained by Martin Wilson. Last updated 1 month ago.

brain mri mrs mrshub spectroscopy fortran

3.8 match 25 stars 8.52 score 81 scripts

piklprado

Rsampling:Ports the Workflow of "Resampling Stats" Add-in to R

Resampling Stats (http://www.resample.com) is an add-in for running randomization tests in Excel worksheets. The workflow is (1) to define a statistic of interest that can be calculated from a data table, (2) to randomize rows and/or columns of a data table to simulate a null hypothesis and (3) to score the value of the statistic from many randomizations. The relative frequency distribution of the statistic in the simulations is then used to infer the probability of the observed value being generated by the null process (probability of Type I error). This package intends to translate this logic to R for teaching purposes. Keeping the original workflow is favored over performance.

Maintained by Paulo Prado. Last updated 9 years ago.

6.1 match 5.11 score 16 scripts

herulor

DFIT:Differential Functioning of Items and Tests

A set of functions to perform Raju, van der Linden and Fleer's (1995, <doi:10.1177/014662169501900405>) Differential Functioning of Items and Tests (DFIT) analyses. It includes functions to use the Monte Carlo Item Parameter Replication approach (Oshima, Raju, & Nanda, 2006, <doi:10.1111/j.1745-3984.2006.00001.x>) for obtaining the associated statistical significance tests cut-off points. They may also be used for a priori and post-hoc power calculations (Cervantes, 2017, <doi:10.18637/jss.v076.i05>).

Maintained by Victor H. Cervantes. Last updated 9 months ago.

13.6 match 2.30 score 20 scripts

graemeblair

rdss:Companion Datasets and Functions for Research Design in the Social Sciences

Helper functions to accompany Blair, Coppock, and Humphreys (2022) "Research Design in the Social Sciences: Declaration, Diagnosis, and Redesign" <https://book.declaredesign.org>. 'rdss' includes datasets, helper functions, and plotting components to enable use and replication of the book.

Maintained by Graeme Blair. Last updated 2 months ago.

11.6 match 2.64 score 29 scripts

kosukeimai

RCT2:Designing and Analyzing Two-Stage Randomized Experiments

Provides various statistical methods for designing and analyzing two-stage randomized controlled trials using the methods developed by Imai, Jiang, and Malani (2021) <doi:10.1080/01621459.2020.1775612> and (2022+) <doi:10.48550/arXiv.2011.07677>. The package enables the estimation of direct and spillover effects, hypothesis testing, and sample size calculation for two-stage randomized controlled trials.

Maintained by Kosuke Imai. Last updated 2 years ago.

6.5 match 5 stars 4.70 score 4 scripts

bioc

esATAC:An Easy-to-use Systematic pipeline for ATACseq data analysis

This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq reads. It covers raw sequencing reads preprocessing (FASTQ files), reads alignment (Rbowtie2), aligned reads file operations (SAM, BAM, and BED files), peak calling (F-seq), genome annotations (Motif, GO, SNP analysis) and quality control report. The package is managed by a dataflow graph, which makes it easy for users to pass variables seamlessly between processes and understand the workflow. Users can process FASTQ files through the end-to-end preset pipeline, which produces an HTML report for quality control and preliminary statistical results, or customize the workflow starting from any intermediate stage with esATAC functions easily and flexibly.

Maintained by Zheng Wei. Last updated 5 months ago.

immunooncology sequencing dnaseq qualitycontrol alignment preprocessing coverage atacseq dnaseseq atac-seq bioconductor pipeline cpp openjdk

5.0 match 23 stars 6.11 score 3 scripts

asgr

imager:Image Processing Library Based on 'CImg'

Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.

Maintained by Aaron Robotham. Last updated 27 days ago.

libx11 fftw3 tiff cpp openmp

2.3 match 17 stars 13.62 score 2.4k scripts 45 dependents

bioc

GBScleanR:Error correction tool for noisy genotyping by sequencing (GBS) data

GBScleanR is a package for quality check, filtering, and error correction of genotype data derived from next generation sequencer (NGS) based genotyping platforms. GBScleanR takes a Variant Call Format (VCF) file as input. The main function of this package is `estGeno()`, which estimates the true genotypes of samples from given read counts for genotype markers using a hidden Markov model that incorporates the uneven observation ratio of allelic reads. This implementation gives robust genotype estimation even in the noisy genotype data usually observed in Genotyping-By-Sequencing (GBS) and similar methods, e.g. RADseq. The current implementation accepts genotype data of a diploid population at any generation of a multi-parental cross, e.g. biparental F2 from inbred parents, biparental F2 from outbred parents, and 8-way recombinant inbred lines (8-way RILs), which can be referred to as a MAGIC population.

Maintained by Tomoyuki Furuta. Last updated 3 days ago.

geneticvariability snp genetics hiddenmarkovmodel sequencing qualitycontrol cpp

5.1 match 4 stars 5.90 score 6 scripts

lebebr01

simglm:Simulate Models Based on the Generalized Linear Model

Simulates regression models, including both simple regression and generalized linear mixed models with up to three levels of nesting. Flexible power simulations that allow the specification of missing data, unbalanced designs, and different random error distributions are built into the package.

Maintained by Brandon LeBeau. Last updated 10 months ago.

power simulation

3.8 match 43 stars 7.87 score 87 scripts

a-dudek-ue

clusterSim:Searching for Optimal Clustering Procedure for a Data Set

Distance measures (GDM1, GDM2, Sokal-Michener, Bray-Curtis, for symbolic interval-valued data), cluster quality indices (Calinski-Harabasz, Baker-Hubert, Hubert-Levine, Silhouette, Krzanowski-Lai, Hartigan, Gap, Davies-Bouldin), data normalization formulas (metric data, interval-valued symbolic data), data generation (typical and non-typical data), HINoV method, replication analysis, linear ordering methods, spectral clustering, agreement indices between two partitions, plot functions (for categorical and symbolic interval-valued data). (MILLIGAN, G.W., COOPER, M.C. (1985) <doi:10.1007/BF02294245>, HUBERT, L., ARABIE, P. (1985) <doi:10.1007/BF01908075>, RAND, W.M. (1971) <doi:10.1080/01621459.1971.10482356>, JAJUGA, K., WALESIAK, M. (2000) <doi:10.1007/978-3-642-57280-7_11>, MILLIGAN, G.W., COOPER, M.C. (1988) <doi:10.1007/BF01897163>, JAJUGA, K., WALESIAK, M., BAK, A. (2003) <doi:10.1007/978-3-642-55721-7_12>, DAVIES, D.L., BOULDIN, D.W. (1979) <doi:10.1109/TPAMI.1979.4766909>, CALINSKI, T., HARABASZ, J. (1974) <doi:10.1080/03610927408827101>, HUBERT, L. (1974) <doi:10.1080/01621459.1974.10480191>, TIBSHIRANI, R., WALTHER, G., HASTIE, T. (2001) <doi:10.1111/1467-9868.00293>, BRECKENRIDGE, J.N. (2000) <doi:10.1207/S15327906MBR3502_5>, WALESIAK, M., DUDEK, A. (2008) <doi:10.1007/978-3-540-78246-9_11>).

Maintained by Andrzej Dudek. Last updated 6 months ago.

cpp

4.6 match 2 stars 6.35 score 512 scripts 9 dependents

r-lib

bit:Classes and Methods for Fast Memory-Efficient Boolean Selections

Provided are classes for boolean and skewed boolean vectors, fast boolean methods, fast unique and non-unique integer sorting, fast set operations on sorted and unsorted sets of integers, and foundations for ff (range index, compression, chunked processing).

Maintained by Michael Chirico. Last updated 6 days ago.

1.9 match 12 stars 15.15 score 131 scripts 3.2k dependents

syedhaider5

chicane:Capture Hi-C Analysis Engine

Toolkit for processing and calling interactions in capture Hi-C data. Converts BAM files into counts of reads linking restriction fragments, and identifies pairs of fragments that interact more than expected by chance. Significant interactions are identified by comparing the observed read count to the expected background rate from a count regression model.

Maintained by Syed Haider. Last updated 3 years ago.

10.3 match 2.75 score 28 scripts

bioc

Dune:Improving replicability in single-cell RNA-Seq cell type discovery

Given a set of clustering labels, Dune merges pairs of clusters to increase mean ARI between labels, improving replicability.

Maintained by Hector Roux de Bezieux. Last updated 5 months ago.

clustering geneexpression rnaseq software singlecell transcriptomics visualization

6.1 match 4.61 score 41 scripts

bioc

cummeRbund:Analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data.

Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.

Maintained by Loyal A. Goff. Last updated 5 months ago.

highthroughputsequencing highthroughputsequencingdata rnaseq rnaseqdata geneexpression differentialexpression infrastructure dataimport datarepresentation visualization bioinformatics clustering multiplecomparisons qualitycontrol

4.8 match 5.92 score 209 scripts

cloudyr

aws.s3:'AWS S3' Client Package

A simple client package for the Amazon Web Services ('AWS') Simple Storage Service ('S3') 'REST' 'API' <https://aws.amazon.com/s3/>.

Maintained by Simon Urbanek. Last updated 5 years ago.

amazon aws aws-s3 cloudyr s3 s3-storage

2.3 match 383 stars 12.47 score 1.4k scripts 17 dependents

svkucheryavski

mdatools:Multivariate Data Analysis for Chemometrics

Projection-based methods for preprocessing, exploration, and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.

Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.

3.8 match 35 stars 7.37 score 220 scripts 1 dependents

r-lib

bit64:A S3 Class for Vectors of 64bit Integers

Package 'bit64' provides serializable S3 atomic 64bit (signed) integers. These are useful for handling database keys and exact counting in +-2^63. WARNING: do not use them as replacement for 32bit integers, integer64 are not supported for subscripting by R-core and they have different semantics when combined with double, e.g. integer64 + double => integer64. Class integer64 can be used in vectors, matrices, arrays and data.frames. Methods are available for coercion from and to logicals, integers, doubles, characters and factors as well as many elementwise and summary functions. Many fast algorithmic operations such as 'match' and 'order' support interactive data exploration and manipulation and optionally leverage caching.

Maintained by Michael Chirico. Last updated 4 days ago.

1.9 match 35 stars 14.91 score 1.5k scripts 3.2k dependents

bioc

RUVSeq:Remove Unwanted Variation from RNA-Seq Data

This package implements the remove unwanted variation (RUV) methods of Risso et al. (2014) for the normalization of RNA-Seq read counts between samples.

Maintained by Davide Risso. Last updated 5 months ago.

immunooncology differentialexpression preprocessing rnaseq software

2.8 match 13 stars 9.90 score 482 scripts 5 dependents

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 year ago.

2.3 match 29 stars 12.34 score 6.6k scripts 931 dependents

jranke

mkin:Kinetic Evaluation of Chemical Degradation Data

Calculation routines based on the FOCUS Kinetics Report (2006, 2014). Includes a function for conveniently defining differential equation models, model solution based on eigenvalues if possible or using numerical solvers. If a C compiler (on Windows: 'Rtools') is installed, differential equation models are solved using automatically generated C functions. Non-constant errors can be taken into account using variance by variable or two-component error models <doi:10.3390/environments6120124>. Hierarchical degradation models can be fitted using nonlinear mixed-effects model packages as a back end <doi:10.3390/environments8080071>. Please note that no warranty is implied for correctness of results or fitness for a particular purpose.

Maintained by Johannes Ranke. Last updated 1 month ago.

degradation focus-kinetics kinetic-models kinetics ode ode-model

3.4 match 11 stars 8.18 score 78 scripts 1 dependents

kollerma

robustlmm:Robust Linear Mixed Effects Models

Implements the Robust Scoring Equations estimator to fit linear mixed effects models robustly. Robustness is achieved by modification of the scoring equations combined with the Design Adaptive Scale approach.

Maintained by Manuel Koller. Last updated 1 year ago.

openblas cpp

3.1 match 28 stars 8.79 score 138 scripts

bioc

RnBeads:RnBeads

RnBeads facilitates comprehensive analysis of various types of DNA methylation data at the genome scale.

Maintained by Fabian Mueller. Last updated 1 month ago.

dnamethylation methylationarray methylseq epigenetics qualitycontrol preprocessing batcheffect differentialmethylation sequencing cpgisland immunooncology twochannel dataimport

4.0 match 6.85 score 169 scripts 1 dependents

sritchie73

NetRep:Permutation Testing Network Module Preservation Across Datasets

Functions for assessing the replication/preservation of a network module's topology across datasets through permutation testing; Ritchie et al. (2015) <doi:10.1016/j.cels.2016.06.012>.

Maintained by Scott Ritchie. Last updated 4 years ago.

openblas cpp

4.0 match 12 stars 6.84 score 16 scripts 3 dependents

spatstat

spatstat.geom:Geometrical Functionality of the 'spatstat' Family

Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)

Maintained by Adrian Baddeley. Last updated 0 hours ago.

classes-and-objects distance-calculation geometry geometry-processing images mensuration plotting point-patterns spatial-data spatial-data-analysis

2.3 match 7 stars 12.11 score 241 scripts 227 dependents

truecluster

ff:Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff has native C-support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays). There is also a ffdf class not unlike data.frames and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating/removing 'temporary' ff files in a way that is completely transparent to the user. On certain OS/Filesystem combinations, creating the ff files works without notable delay thanks to using sparse file allocation.
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This allows working interactively with selections of large datasets and quickly modifying selection criteria. Further high-performance enhancements can be made available upon request.

Maintained by Jens Oehlschlägel. Last updated 2 months ago.

cpp

2.3 match 27 stars 12.01 score 764 scripts 71 dependents

critical-infrastructure-systems-lab

ldsr:Linear Dynamical System Reconstruction

Streamflow (and climate) reconstruction using Linear Dynamical Systems. The advantage of this method is the additional state trajectory which can reveal more information about the catchment or climate system. For details of the method please refer to Nguyen and Galelli (2018) <doi:10.1002/2017WR022114>.

Maintained by Hung Nguyen. Last updated 5 years ago.

expectation-maximization-algorithm hydrology kalman-smoother linear-dynamical-systems paleoclimate openblas cpp openmp

5.5 match 8 stars 4.86 score 18 scripts

bioc

pcaMethods:A collection of PCA methods

Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.

Maintained by Henning Redestig. Last updated 5 months ago.

bayesian cpp

2.0 match 49 stars 13.10 score 538 scripts 73 dependents

liamrevell

phytools:Phylogenetic Tools for Comparative Biology (and Other Things)

A wide range of methods for phylogenetic analysis - concentrated in phylogenetic comparative biology, but also including numerous techniques for visualizing, analyzing, manipulating, reading or writing, and even inferring phylogenetic trees. Included among the functions in phylogenetic comparative biology are various for ancestral state reconstruction, model-fitting, and simulation of phylogenies and trait data. A broad range of plotting methods for phylogenies and comparative data include (but are not restricted to) methods for mapping trait evolution on trees, for projecting trees into phenotype space or onto a geographic map, and for visualizing correlated speciation between trees. Lastly, numerous functions are designed for reading, writing, analyzing, inferring, simulating, and manipulating phylogenetic trees and comparative data. For instance, there are functions for computing consensus phylogenies from a set, for simulating phylogenetic trees and data under a range of models, for randomly or non-randomly attaching species or clades to a tree, as well as for a wide range of other manipulations and analyses that phylogenetic biologists might find useful in their research.

Maintained by Liam J. Revell. Last updated 28 days ago.

1.9 match 218 stars 13.85 score 4.8k scripts 76 dependents

mbinois

hetGP:Heteroskedastic Gaussian Process Modeling and Design under Replication

Performs Gaussian process regression with heteroskedastic noise following the model by Binois, M., Gramacy, R., Ludkovski, M. (2016) <doi:10.48550/arXiv.1611.05902>, with implementation details in Binois, M. & Gramacy, R. B. (2021) <doi:10.18637/jss.v098.i13>. The input dependent noise is modeled as another Gaussian process. Replicated observations are encouraged as they yield computational savings. Sequential design procedures based on the integrated mean square prediction error and lookahead heuristics are provided, and notably fast update functions when adding new observations.

Maintained by Mickael Binois. Last updated 6 months ago.

cpp

5.3 match 5 stars 4.89 score 260 scripts 2 dependents

wernerstahel

relevance:Calculate Relevance and Significance Measures

Calculates relevance and significance values for simple models and for many types of regression models. These are introduced in 'Stahel, Werner A.' (2021) "Measuring Significance and Relevance instead of p-values." <https://stat.ethz.ch/~stahel/relevance/stahel-relevance2103.pdf>. These notions are also applied to replication studies, as described in the manuscript 'Stahel, Werner A.' (2022) "'Replicability': Terminology, Measuring Success, and Strategy" available in the documentation.

Maintained by Werner A. Stahel. Last updated 1 year ago.

13.0 match 2.00 score 3 scripts

mjlajeunesse

metagear:Comprehensive Research Synthesis Tools for Systematic Reviews and Meta-Analysis

Functionalities for facilitating systematic reviews, data extractions, and meta-analyses. It includes a GUI (graphical user interface) to help screen the abstracts and titles of bibliographic data; tools to assign screening effort across multiple collaborators/reviewers and to assess inter-reviewer reliability; tools to help automate the download and retrieval of journal PDF articles from online databases; figure and image extractions from PDFs; web scraping of citations; automated and manual data extraction from scatter-plot and bar-plot images; PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagrams; simple imputation tools to fill gaps in incomplete or missing study parameters; generation of random effect sizes for Hedges' d, log response ratio, odds ratio, and correlation coefficients for Monte Carlo experiments; covariance equations for modelling dependencies among multiple effect sizes (e.g., effect sizes with a common control); and finally summaries that replicate analyses and outputs from widely used but no longer updated meta-analysis software (i.e., metawin). Funding for this package was supported by National Science Foundation (NSF) grants DBI-1262545 and DEB-1451031. CITE: Lajeunesse, M.J. (2016) Facilitating systematic reviews, data extraction and meta-analysis with the metagear package for R. Methods in Ecology and Evolution 7, 323-330 <doi:10.1111/2041-210X.12472>.

Maintained by Marc J. Lajeunesse. Last updated 4 years ago.

3.9 match 14 stars 6.71 score 91 scripts

easystats

datawizard:Easy Data Wrangling and Statistical Transformations

A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.

Maintained by Etienne Bacher. Last updated 10 days ago.

data dplyr hacktoberfest janitor manipulation reshape tidyr wrangling

1.8 match 222 stars 14.71 score 436 scripts 119 dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure bioconductor-package core-package

1.8 match 12 stars 14.22 score 612 scripts 2.2k dependents

kmkuesters

pooledpeaks:Genetic Analysis of Pooled Samples

Analyzing genetic data obtained from pooled samples. This package can read in Fragment Analysis output files, process the data, and score peaks, as well as facilitate various analyses, including cluster analysis, calculation of genetic distances and diversity indices, as well as bootstrap resampling for statistical inference. Specifically tailored to handle genetic data efficiently, researchers can explore population structure, genetic differentiation, and genetic relatedness among samples. We updated some functions from Covarrubias-Pazaran et al. (2016) <doi:10.1186/s12863-016-0365-6> to allow for the use of new file formats and referenced the following to write our genetic analysis functions: Long et al. (2022) <doi:10.1038/s41598-022-04776-0>, Jost (2008) <doi:10.1111/j.1365-294x.2008.03887.x>, Nei (1973) <doi:10.1073/pnas.70.12.3321>, Foulley et al. (2006) <doi:10.1016/j.livprodsci.2005.10.021>, Chao et al. (2008) <doi:10.1111/j.1541-0420.2008.01010.x>.

Maintained by Kathleen Kuesters. Last updated 2 days ago.

5.3 match 1 star 4.85 score 3 scripts

lindanab

mecor:Measurement Error Correction in Linear Models with a Continuous Outcome

Covariate measurement error correction is implemented by means of regression calibration by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331), efficient regression calibration by Spiegelman D, Carroll RJ & Kipnis V (2001) <doi:10.1002/1097-0258(20010115)20:1%3C139::AID-SIM644%3E3.0.CO;2-K> and maximum likelihood estimation by Bartlett JW, Stavola DBL & Frost C (2009) <doi:10.1002/sim.3713>. Outcome measurement error correction is implemented by means of the method of moments by Buonaccorsi JP (2010, ISBN:1420066560) and efficient method of moments by Keogh RH, Carroll RJ, Tooze JA, Kirkpatrick SI & Freedman LS (2014) <doi:10.1002/sim.7011>. Standard error estimation of the corrected estimators is implemented by means of the Delta method by Rosner B, Spiegelman D & Willett WC (1990) <doi:10.1093/oxfordjournals.aje.a115715> and Rosner B, Spiegelman D & Willett WC (1992) <doi:10.1093/oxfordjournals.aje.a116453>, the Fieller method described by Buonaccorsi JP (2010, ISBN:1420066560), and the Bootstrap by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331).

Maintained by Linda Nab. Last updated 3 years ago.

linear-models measurement-error statistics

5.0 match 6 stars 5.07 score 13 scripts

bioc

TPP:Analyze thermal proteome profiling (TPP) experiments

Analyze thermal proteome profiling (TPP) experiments with varying temperatures (TR) or compound concentrations (CCR).

Maintained by Dorothee Childs. Last updated 5 months ago.

immunooncology proteomics massspectrometry

5.0 match 4.98 score 16 scripts

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 18 days ago.

openblas cpp openmp

2.0 match 147 stars 12.54 score 1.2k scripts 166 dependents

bioc

microbiome:Microbiome Analytics

Utilities for microbiome analysis.

Maintained by Leo Lahti. Last updated 5 months ago.

metagenomics microbiome sequencing systemsbiology hitchip hitchip-atlas human-microbiome microbiology microbiome-analysis phyloseq population-study

2.0 match 290 stars 12.50 score 2.0k scripts 5 dependents

llrs

experDesign:Design Experiments for Batches

Distributes samples in batches while making batches homogeneous according to their description. Allows for an arbitrary number of variables, both numeric and categorical. For quality control it provides functions to subset a representative sample.

Maintained by Lluís Revilla Sancho. Last updated 3 months ago.

batch experiment-design

4.5 match 10 stars 5.54 score 1 script

cran

metRology:Support for Metrological Applications

Provides classes and calculation and plotting functions for metrology applications, including measurement uncertainty estimation and inter-laboratory metrology comparison studies.

Maintained by Stephen L R Ellison. Last updated 2 months ago.

5.2 match 5 stars 4.77 score 223 scripts 7 dependents

mlr-org

mlr3pipelines:Preprocessing Operators and Pipelines for 'mlr3'

Dataflow programming toolkit that enriches 'mlr3' with a diverse set of pipelining operators ('PipeOps') that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can therefore be resampled, benchmarked, and tuned.

Maintained by Martin Binder. Last updated 9 days ago.

bagging data-science dataflow-programming ensemble-learning machine-learning mlr3 pipelines preprocessing stacking

2.0 match 141 stars 12.36 score 448 scripts 7 dependents

pachadotdev

cpp11armadillo:An 'Armadillo' Interface

Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.

Maintained by Mauricio Vargas Sepulveda. Last updated 26 days ago.

armadillo cpp cpp11 hacktoberfest linear-algebra

2.7 match 9 stars 9.14 score 1 script 16 dependents

tushiqi

MAnorm2:Tools for Normalizing and Comparing ChIP-seq Samples

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the premier technology for profiling genome-wide localization of chromatin-binding proteins, including transcription factors and histones with various modifications. This package provides a robust method for normalizing ChIP-seq signals across individual samples or groups of samples. It also designs a self-contained system of statistical models for calling differential ChIP-seq signals between two or more biological conditions as well as for calling hypervariable ChIP-seq signals across samples. Refer to Tu et al. (2021) <doi:10.1101/gr.262675.120> and Chen et al. (2022) <doi:10.1186/s13059-022-02627-9> for associated statistical details.

Maintained by Shiqi Tu. Last updated 2 years ago.

chip-seq differential-analysis empirical-bayes winsorize-values

4.5 match 32 stars 5.48 score 19 scripts

jlp-bioinf

rnaCrosslinkOO:Analysis of RNA Crosslinking Data

Analysis of RNA crosslinking data for RNA structure prediction. The package is suitable for the analysis of RNA structure cross-linking data and chemical probing data.

Maintained by Jonathan Price. Last updated 2 months ago.

comrades psoralen rna-crosslinking rna-structure rna-structure-prediction

4.7 match 1 star 5.22 score 3 scripts

christophergandrud

mcreplicate:Multi-Core Replicate

Multi-core replication function to make it easier to do fast Monte Carlo simulation. Based on the mcreplicate() function from the 'rethinking' package. The 'rethinking' package requires installing 'rstan', which is onerous to install, while also not adding capabilities to this function.

Maintained by Christopher Gandrud. Last updated 4 years ago.

parallel-computing simulation

5.9 match 5 stars 4.16 score 29 scripts

ekstroem

MethComp:Analysis of Agreement in Method Comparison Studies

Methods (standard and advanced) for analysis of agreement between measurement methods. These cover Bland-Altman plots, Deming regression, Lin's Total deviation index, and difference-on-average regression. See Carstensen B. (2010) "Comparing Clinical Measurement Methods: A Practical Guide (Statistics in Practice)" <doi:10.1002/9780470683019> for more information.

Maintained by Claus Thorn Ekstrøm. Last updated 5 months ago.

5.2 match 1 star 4.63 score 86 scripts

jandraor

readsdr:Translate Models from System Dynamics Software into 'R'

The goal of 'readsdr' is to bridge the design capabilities from specialised System Dynamics software with the powerful numerical tools offered by 'R' libraries. The package accomplishes this goal by parsing 'XMILE' files ('Vensim' and 'Stella' models) into 'R' objects to construct networks (graph theory); 'ODE' functions for 'Stan'; and inputs to simulate via 'deSolve' as described in Duggan (2016) <doi:10.1007/978-3-319-34043-2>.

Maintained by Jair Andrade. Last updated 10 months ago.

stan system-dynamics

3.6 match 19 stars 6.62 score 62 scripts

gergness

srvyr:'dplyr'-Like Syntax for Summary Statistics of Survey Data

Use piping, verbs like 'group_by' and 'summarize', and other 'dplyr' inspired syntactic style when calculating summary statistics on survey data using functions from the 'survey' package.

Maintained by Greg Freedman Ellis. Last updated 1 month ago.

survey

1.7 match 215 stars 13.88 score 1.8k scripts 15 dependents

willemsleegers

tidystats:Save Output of Statistical Tests

Save the output of statistical tests in an organized file that can be shared with others or used to report statistics in scientific papers.

Maintained by Willem Sleegers. Last updated 8 months ago.

3.5 match 19 stars 6.80 score 83 scripts

bioc

spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.

Maintained by Jianhai Zhang. Last updated 4 months ago.

spatial visualization microarray sequencing geneexpression datarepresentation network clustering graphandnetwork cellbasedassays atacseq dnaseq tissuemicroarray singlecell cellbiology genetarget

3.8 match 5 stars 6.26 score 12 scripts

stamats

MKmisc:Miscellaneous Functions from M. Kohl

Contains several functions for statistical data analysis; e.g. for sample size and power calculations, computation of confidence intervals and tests, and generation of similarity matrices.

Maintained by Matthias Kohl. Last updated 2 years ago.

3.2 match 11 stars 7.40 score 129 scripts 1 dependent

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Tables, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

1.9 match 79 stars 12.62 score 186 scripts 9 dependents

bioc

LPE:Methods for analyzing microarray data using Local Pooled Error (LPE) method

This LPE library is used to do significance analysis of microarray data with a small number of replicates. It uses resampling-based FDR adjustment, and gives less conservative results than traditional 'BH' or 'BY' procedures. Data accepted is raw data in txt format from MAS4, MAS5 or dChip. Data can also be supplied after normalization. The LPE library is primarily used for analyzing data between two conditions. To use it for paired data, see the LPEP library. For using LPE in multiple conditions, use the HEM library.

Maintained by Nitin Jain. Last updated 5 months ago.

microarray differentialexpression

5.2 match 4.58 score 21 scripts 1 dependent

revelle

psych:Procedures for Psychological, Psychometric, and Personality Research

A general purpose toolbox developed originally for personality, psychometric theory and experimental psychology. Functions are primarily for multivariate analysis and scale construction using factor analysis, principal component analysis, cluster analysis and reliability analysis, although others provide basic descriptive statistics. Item Response Theory is done using factor analysis of tetrachoric and polychoric correlations. Functions for analyzing data at multiple levels include within and between group statistics, including correlations and factor analysis. Validation and cross validation of scales developed using basic machine learning algorithms are provided, as are functions for simulating and testing particular item and test structures. Several functions serve as a useful front end for structural equation modeling. Graphical displays of path diagrams, including mediation models, factor analysis and structural equation models are created using basic graphics. Some of the functions are written to support a book on psychometric theory as well as publications in personality research. For more information, see the <https://personality-project.org/r/> web page.

Maintained by William Revelle. Last updated 3 months ago.

1.7 match 52 stars 13.94 score 29k scripts 317 dependents

bioc

IsoformSwitchAnalyzeR:Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data

Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains) from quantification of all types of RNASeq by tools such as Kallisto, Salmon, StringTie, Cufflinks/Cuffdiff etc.

Maintained by Kristoffer Vitting-Seerup. Last updated 5 months ago.

geneexpression transcription alternativesplicing differentialexpression differentialsplicing visualization statisticalmethod transcriptomevariant biomedicalinformatics functionalgenomics systemsbiology transcriptomics rnaseq annotation functionalprediction geneprediction dataimport multiplecomparison batcheffect immunooncology

2.5 match 108 stars 9.26 score 125 scripts

rapidsurveys

bbw:Blocked Weighted Bootstrap

The blocked weighted bootstrap (BBW) is an estimation technique for use with data from two-stage cluster sampled surveys in which either prior weighting (e.g. population-proportional sampling or PPS as used in Standardized Monitoring and Assessment of Relief and Transitions or SMART surveys) or posterior weighting (e.g. as used in rapid assessment method or RAM and simple spatial sampling method or S3M surveys) is implemented. See Cameron et al (2008) <doi:10.1162/rest.90.3.414> for application of bootstrap to cluster samples. See Aaron et al (2016) <doi:10.1371/journal.pone.0163176> and Aaron et al (2016) <doi:10.1371/journal.pone.0162462> for application of the blocked weighted bootstrap to estimate indicators from two-stage cluster sampled surveys.

Maintained by Ernest Guevarra. Last updated 2 months ago.

bootstrapping-statistics ram surveys

4.1 match 3 stars 5.61 score 9 scripts 1 dependent

bioc

TADCompare:TADCompare: Identification and characterization of differential TADs

TADCompare is an R package designed to identify and characterize differential Topologically Associated Domains (TADs) between multiple Hi-C contact matrices. It contains functions for finding differential TADs between two datasets, finding differential TADs over time and identifying consensus TADs across multiple matrices. It takes all of the main types of HiC input and returns simple, comprehensive, easy to analyze results.

Maintained by Mikhail Dozmorov. Last updated 5 months ago.

software hic sequencing featureextraction clustering

3.3 match 23 stars 7.04 score 10 scripts

r-forge

Sleuth3:Data Sets from Ramsey and Schafer's "Statistical Sleuth (3rd Ed)"

Data sets from Ramsey, F.L. and Schafer, D.W. (2013), "The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)", Cengage Learning.

Maintained by Berwin A Turlach. Last updated 1 year ago.

3.6 match 6.38 score 522 scripts

bioc

ISoLDE:Integrative Statistics of alleLe Dependent Expression

This package provides ISoLDE, a new method for identifying imprinted genes. This method is dedicated to data arising from RNA sequencing technologies. The ISoLDE package implements original statistical methodology described in the publication below.

Maintained by Christelle Reynès. Last updated 5 months ago.

immunooncology geneexpression transcription genesetenrichment genetics sequencing rnaseq multiplecomparison snp geneticvariability epigenetics mathematicalbiology generegulation openmp

10.0 match 2.30 score 2 scripts

biooss

sensitivity:Global Sensitivity Analysis of Model Outputs and Importance Measures

A collection of functions for sensitivity analysis of model outputs (factor screening, global sensitivity analysis and robustness analysis), for variable importance measures of data, as well as for interpretability of machine learning models. Most of the functions have to be applied on scalar output, but several functions support multi-dimensional outputs.

Maintained by Bertrand Iooss. Last updated 7 months ago.

cpp

3.4 match 17 stars 6.74 score 472 scripts 8 dependents

murrayefford

secr:Spatially Explicit Capture-Recapture

Functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.

Maintained by Murray Efford. Last updated 5 hours ago.

cpp

2.3 match 3 stars 10.16 score 410 scripts 5 dependents

jranke

chemCal:Calibration Functions for Analytical Chemistry

Simple functions for plotting linear calibration functions and estimating standard errors for measurements according to the Handbook of Chemometrics and Qualimetrics: Part A by Massart et al. (1997). There are also functions estimating the limit of detection (LOD) and limit of quantification (LOQ). The functions work on model objects from - optionally weighted - linear regression (lm) or robust linear regression ('rlm' from the 'MASS' package).

Maintained by Johannes Ranke. Last updated 2 months ago.

3.5 match 6 stars 6.52 score 55 scripts

simecek

additivityTests:Additivity Tests in the Two Way Anova with Single Sub-class Numbers

Implementation of the Tukey, Mandel, Johnson-Graybill, LBI, Tusell and modified Tukey non-additivity tests.

Maintained by Petr Simecek. Last updated 10 years ago.

4.1 match 1 star 5.57 score 10 scripts 17 dependents

bioc

autonomics:Unified Statistical Modeling of Omics Data

This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.

Maintained by Aditya Bhagwat. Last updated 2 months ago.

software dataimport preprocessing dimensionreduction principalcomponent regression differentialexpression genesetenrichment transcriptomics transcription geneexpression rnaseq microarray proteomics metabolomics massspectrometry

3.8 match 5.95 score 5 scripts

dmmelamed

catregs:Post-Estimation Functions for Generalized Linear Mixed Models

Several functions for working with mixed effects regression models for limited dependent variables. The functions facilitate post-estimation of model predictions or margins, and comparisons between model predictions for assessing or probing moderation. Additional helper functions facilitate model comparisons and implement simulation-based inference for model predictions of alternative-specific outcome models. See also Melamed and Doan (2024, ISBN: 978-1032509518).

Maintained by David Melamed. Last updated 8 months ago.

6.6 match 3.40 score 28 scripts

nspyrison

spinifex:Manual Tours, Manual Control of Dynamic Projections of Numeric Multivariate Data

Data visualization tours animate linear projections of multivariate data as the basis (i.e. orientation) changes. The 'spinifex' package generates paths for manual tours by manipulating the contribution of a single variable at a time, following Cook & Buja (1997) <doi:10.1080/10618600.1997.10474754>. Other types of tours, such as grand (random walk) and guided (optimizing some objective function), are available in the 'tourr' package, Wickham et al. <doi:10.18637/jss.v040.i02>. 'spinifex' builds on 'tourr' and can render tours with 'gganimate' and 'plotly' graphics, and allows for exporting as a .gif and as an .html widget, respectively. This work is fully discussed in Spyrison & Cook (2020) <doi:10.32614/RJ-2020-027>.

Maintained by Nicholas Spyrison. Last updated 2 months ago.

dimension reduction tours visualization

3.6 match 3 stars 6.28 score 105 scripts 1 dependent

bcastanho

SCtools:Extensions for Synthetic Controls Analysis

Extends the functionality of the package 'Synth' as detailed in Abadie, Diamond, and Hainmueller (2011) <doi:10.18637/jss.v042.i13>. Includes generating and plotting placebos, post/pre-MSPE (Mean Squared Prediction Error) significance tests and plots, and calculating average treatment effects for multiple treated units.

Maintained by Bruno Castanho Silva. Last updated 11 months ago.

3.3 match 13 stars 6.74 score 105 scripts

philchalmers

SimDesign:Structure for Organizing Monte Carlo Simulation Designs

Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.

Maintained by Phil Chalmers. Last updated 5 hours ago.

monte-carlo-simulation simulation simulation-framework

1.7 match 62 stars 13.38 score 253 scripts 46 dependents

matthieu-bruneaux

isotracer:Isotopic Tracer Analysis Using MCMC

Implements Bayesian models to analyze data from tracer addition experiments. The implemented method was originally described in the article "A New Method to Reconstruct Quantitative Food Webs and Nutrient Flows from Isotope Tracer Addition Experiments" by López-Sepulcre et al. (2020) <doi:10.1086/708546>.

Maintained by Matthieu Bruneaux. Last updated 4 months ago.

cpp

3.8 match 5.92 score 60 scripts

discoleo

Rpdb:Read, Write, Visualize and Manipulate PDB Files

Provides tools to read, write, visualize Protein Data Bank (PDB) files and perform some structural manipulations.

Maintained by Leonard Mada. Last updated 26 days ago.

5.0 match 4.43 score 68 scripts

jhstaudacher

EvolutionaryGames:Important Concepts of Evolutionary Game Theory

Evolutionary game theory applies game theory to evolving populations in biology, see e.g. one of the books by Weibull (1994, ISBN:978-0262731218) or by Sandholm (2010, ISBN:978-0262195874) for more details. A comprehensive set of tools to illustrate the core concepts of evolutionary game theory, such as evolutionary stability or various evolutionary dynamics, for teaching and academic research is provided.

Maintained by Jochen Staudacher. Last updated 3 years ago.

7.1 match 2 stars 3.11 score 32 scripts

bioc

SWATH2stats:Transform and Filter SWATH Data for Statistical Packages

This package is intended to transform SWATH data from the OpenSWATH software into a format readable by other statistics packages while performing filtering, annotation and FDR estimation.

Maintained by Peter Blattmann. Last updated 5 months ago.

proteomics annotation experimentaldesign preprocessing massspectrometry immunooncology

3.5 match 1 star 6.30 score 22 scripts

kbhoehn

dowser:B Cell Receptor Phylogenetics Toolkit

Provides a set of functions for inferring, visualizing, and analyzing B cell phylogenetic trees. Provides methods to 1) reconstruct unmutated ancestral sequences, 2) build B cell phylogenetic trees using multiple methods, 3) visualize trees with metadata at the tips, 4) reconstruct intermediate sequences, 5) detect biased ancestor-descendant relationships among metadata types. Workflow examples available at documentation site (see URL). Citations: Hoehn et al (2022) <doi:10.1371/journal.pcbi.1009885>, Hoehn et al (2021) <doi:10.1101/2021.01.06.425648>.

Maintained by Kenneth Hoehn. Last updated 2 months ago.

3.2 match 6.81 score 84 scripts

jamespeapen

ceas:Cellular Energetics Analysis Software

Measuring cellular energetics is essential to understanding a matrix’s (e.g. cell, tissue or biofluid) metabolic state. The Agilent Seahorse machine is a common method to measure real-time cellular energetics, but existing analysis tools are highly manual or lack functionality. The Cellular Energetics Analysis Software (ceas) R package fills this analytical gap by providing modular and automated Seahorse data analysis and visualization using the methods described by Mookerjee et al. (2017) <doi:10.1074/jbc.m116.774471>.

Maintained by Rachel House. Last updated 3 months ago.

4.3 match 1 stars 5.08 score 3 scripts

bioc

S4Arrays:Foundation of array-like containers in Bioconductor

The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides: (1) low-level functionality meant to help developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).

Maintained by Hervé Pagès. Last updated 1 months ago.

infrastructure datarepresentation bioconductor-package core-package

2.0 match 5 stars 10.99 score 8 scripts 1.2k dependents

cran

samplingVarEst:Sampling Variance Estimation

Functions to calculate some point estimators and estimate their variance under unequal probability sampling without replacement. Single and two-stage sampling designs are considered. Some approximations for the second-order inclusion probabilities (joint inclusion probabilities) are available (sample and population based). A variety of Jackknife variance estimators are implemented. Almost every function is written in C (compiled) code for faster results. The functions incorporate some performance improvements for faster results with large datasets.

Maintained by Emilio Lopez Escobar. Last updated 2 years ago.

9.7 match 1 stars 2.27 score 62 scripts 1 dependents

bioc

gemini:GEMINI: Variational inference approach to infer genetic interactions from pairwise CRISPR screens

GEMINI uses log-fold changes to model sample-dependent and independent effects, and uses a variational Bayes approach to infer these effects. The inferred effects are used to score and identify genetic interactions, such as lethality and recovery. More details can be found in Zamanighomi et al. 2019 (in press).

Maintained by Sidharth Jain. Last updated 5 months ago.

software crispr bayesian dataimport computational-biology genetic-interactions

3.6 match 15 stars 6.02 score 9 scripts

fredhutch

gimap:Calculate Genetic Interactions for Paired CRISPR Targets

Helps find meaningful patterns in complex genetic experiments. First, 'gimap' takes data from paired CRISPR (Clustered regularly interspaced short palindromic repeats) screens that have been pre-processed into a counts table of paired gRNA (guide Ribonucleic Acid) reads. The input data will have cell counts for how well cells grow (or don't grow) when different genes or pairs of genes are disabled. The output of the 'gimap' package is genetic interaction scores, which are the distance between the observed CRISPR score and the expected CRISPR score. The expected CRISPR scores are what we expect the CRISPR values to be for two unrelated genes. The further away an observed CRISPR score is from its expected score, the more we suspect genetic interaction. The work in this package is based on original research from the Alice Berger lab at Fred Hutchinson Cancer Center (2021) <doi:10.1016/j.celrep.2021.109597>.

Maintained by Candace Savonen. Last updated 4 days ago.

3.4 match 6.43 score 7 scripts

jamesramsay5

fda:Functional Data Analysis

These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files that work through many examples, including all but one of the 76 figures in the latter book. Matlab versions are available by ftp from <https://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/>.

Maintained by James Ramsay. Last updated 4 months ago.

1.8 match 3 stars 12.29 score 2.0k scripts 143 dependents

adaemmerp

lpirfs:Local Projections Impulse Response Functions

Provides functions to estimate and visualize linear as well as nonlinear impulse responses based on local projections by Jordà (2005) <doi:10.1257/0002828053828518>. The methods and the package are explained in detail in Adämmer (2019) <doi:10.32614/RJ-2019-052>.

Maintained by Philipp Adämmer. Last updated 20 hours ago.

openblas cpp

3.3 match 44 stars 6.38 score 108 scripts

cmollica

PLMIX:Bayesian Analysis of Finite Mixture of Plackett-Luce Models

Fit finite mixtures of Plackett-Luce models for partial top rankings/orderings within the Bayesian framework. It provides MAP point estimates via the EM algorithm and posterior MCMC simulations via Gibbs sampling. It also fits MLE as a special case of the noninformative Bayesian analysis with vague priors. In addition to inferential techniques, the package assists other fundamental phases of a model-based analysis for partial rankings/orderings, by including functions for data manipulation, simulation, descriptive summary, model selection and goodness-of-fit evaluation. Main references on the methods are Mollica and Tardella (2017) <doi:10.1007/s11336-016-9530-0> and Mollica and Tardella (2014) <doi:10.1002/sim.6224>.

Maintained by Cristina Mollica. Last updated 4 years ago.

cpp

6.7 match 3.15 score 28 scripts

davidbolin

MetricGraph:Random Fields on Metric Graphs

Facilitates creation and manipulation of metric graphs, such as street or river networks. Further facilitates operations and visualizations of data on metric graphs, and the creation of a large class of random fields and stochastic partial differential equations on such spaces. These random fields can be used for simulation, prediction and inference. In particular, linear mixed effects models including random field components can be fitted to data based on computationally efficient sparse matrix representations. Interfaces to the R packages 'INLA' and 'inlabru' are also provided, which facilitate working with Bayesian statistical models on metric graphs. The main references for the methods are Bolin, Simas and Wallin (2024) <doi:10.3150/23-BEJ1647>, Bolin, Kovacs, Kumar and Simas (2023) <doi:10.1090/mcom/3929> and Bolin, Simas and Wallin (2023) <doi:10.48550/arXiv.2304.03190> and <doi:10.48550/arXiv.2304.10372>.

Maintained by David Bolin. Last updated 7 days ago.

cpp

3.4 match 14 stars 6.06 score 275 scripts

timbeechey

opa:An Implementation of Ordinal Pattern Analysis

Quantifies hypothesis-to-data fit for repeated measures and longitudinal data, as described by Thorngate (1987) <doi:10.1016/S0166-4115(08)60083-7> and Grice et al. (2015) <doi:10.1177/2158244015604192>. Hypothesis and data are encoded as pairwise relative orderings, which are then compared to determine the percentage of orderings in the data that are matched by the hypothesis.

Maintained by Timothy Beechey. Last updated 1 years ago.

data-analysis hypothesis-testing longitudinal ordinal rcpp repeated-measures statistics cpp

5.6 match 1 stars 3.70 score 2 scripts

config-i1

greybox:Toolbox for Model Building and Forecasting

Implements functions and instruments for regression model building and its application to forecasting. The main scope of the package is variable selection and model specification for time series data. This includes promotional modelling, selection between different dynamic regressions with non-standard distributions of errors, selection based on cross validation, solutions to the fat regression model problem and more. Models developed in the package are tailored specifically for forecasting purposes. As a result, there are several methods for producing forecasts from these models and visualising them.

Maintained by Ivan Svetunkov. Last updated 3 days ago.

forecasting model-selection model-selection-and-evaluation regression regression-models statistics cpp

1.9 match 30 stars 11.03 score 97 scripts 34 dependents

jmslab

eventstudyr:Estimation and Visualization of Linear Panel Event Studies

Estimates linear panel event study models. Plots coefficients following the recommendations in Freyaldenhoven et al. (2021) <doi:10.3386/w29170>. Includes sup-t bands, tests of key hypotheses, and the least wiggly path through the Wald region. Allows instrumental variables estimation following Freyaldenhoven et al. (2019) <doi:10.1257/aer.20180609>.

Maintained by Santiago Hermo. Last updated 1 months ago.

3.3 match 24 stars 6.20 score 19 scripts

r-forge

Sleuth2:Data Sets from Ramsey and Schafer's "Statistical Sleuth (2nd Ed)"

Data sets from Ramsey, F.L. and Schafer, D.W. (2002), "The Statistical Sleuth: A Course in Methods of Data Analysis (2nd ed)", Duxbury.

Maintained by Berwin A Turlach. Last updated 1 years ago.

3.6 match 5.70 score 191 scripts

bioc

MAST:Model-based Analysis of Single Cell Transcriptomics

Methods and models for handling zero-inflated single cell assay data.

Maintained by Andrew McDavid. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment rnaseq transcriptomics singlecell

1.6 match 230 stars 12.75 score 1.8k scripts 5 dependents

duolajiang

RCTrep:Validation of Estimates of Treatment Effects in Observational Data

Validates estimates of (conditional) average treatment effects obtained using observational data by a) making it easy to obtain and visualize estimates derived using a large variety of methods (G-computation, inverse propensity score weighting, etc.), and b) ensuring that estimates are easily compared to a gold standard (i.e., estimates derived from randomized controlled trials). 'RCTrep' offers a generic protocol for treatment effect validation based on four simple steps, namely, set-selection, estimation, diagnosis, and validation. 'RCTrep' provides a simple dashboard to review the obtained results. The validation approach is introduced by Shen, L., Geleijnse, G. and Kaptein, M. (2023) <doi:10.21203/rs.3.rs-2559287/v1>.

Maintained by Lingjie Shen. Last updated 2 years ago.

4.3 match 8 stars 4.68 score 12 scripts

fbartos

BayesTools:Tools for Bayesian Analyses

Provides tools for conducting Bayesian analyses and Bayesian model averaging (Kass and Raftery, 1995, <doi:10.1080/01621459.1995.10476572>, Hoeting et al., 1999, <doi:10.1214/ss/1009212519>). The package contains functions for creating a wide range of prior distribution objects, mixing posterior samples from 'JAGS' and 'Stan' models, plotting posterior distributions, and more. The tools for working with prior distributions range from visualization and generating 'JAGS' and 'bridgesampling' syntax to basic functions such as rng, quantile, and distribution functions.

Maintained by František Bartoš. Last updated 2 months ago.

bayesian model-averaging

3.3 match 7 stars 6.06 score 17 scripts 3 dependents

bioc

multiHiCcompare:Normalize and detect differences between Hi-C datasets when replicates of each experimental condition are available

multiHiCcompare provides functions for joint normalization and difference detection in multiple Hi-C datasets. This extension of the original HiCcompare package now allows for Hi-C experiments with more than 2 groups and multiple samples per group. multiHiCcompare operates on processed Hi-C data in the form of sparse upper triangular matrices. It accepts four-column (chromosome, region1, region2, IF) tab-separated text files storing chromatin interaction matrices. multiHiCcompare provides cyclic loess and fast loess (fastlo) methods adapted to jointly normalizing Hi-C data. Additionally, it provides a general linear model (GLM) framework adapting the edgeR package to detect differences in Hi-C data in a distance-dependent manner.

Maintained by Mikhail Dozmorov. Last updated 5 months ago.

software hic sequencing normalization

2.8 match 9 stars 7.30 score 37 scripts 2 dependents

mmi-codex

Xcertainty:Estimating Lengths and Uncertainty from Photogrammetric Imagery

Implementation of Bayesian models for estimating object lengths and morphological relationships between object lengths using photographic data collected from drones. The Bayesian model is described in "Bayesian approach for predicting photogrammetric uncertainty in morphometric measurements derived from drones" (Bierlich et al., 2021, <doi:10.3354/meps13814>).

Maintained by K.C. Bierlich. Last updated 5 months ago.

3.4 match 3 stars 5.95 score 10 scripts

vwendy

pompom:Person-Oriented Method and Perturbation on the Model

Implements a hybrid of the person-oriented method and perturbation on the model; the name 'pompom' is formed from the initials of the two methods. The hybrid method provides a multivariate intraindividual variability metric (iRAM). The person-oriented method used in this package refers to uSEM (unified structural equation modeling, see Kim et al., 2007, Gates et al., 2010 and Gates et al., 2012 for details). Perturbation on the model was conducted according to impulse response analysis introduced in Lutkepohl (2007). Kim, J., Zhu, W., Chang, L., Bentler, P. M., & Ernst, T. (2007) <doi:10.1002/hbm.20259>. Gates, K. M., Molenaar, P. C. M., Hillary, F. G., Ram, N., & Rovine, M. J. (2010) <doi:10.1016/j.neuroimage.2009.12.117>. Gates, K. M., & Molenaar, P. C. M. (2012) <doi:10.1016/j.neuroimage.2012.06.026>. Lutkepohl, H. (2007, ISBN:3540262393).

Maintained by Xiao Yang. Last updated 4 years ago.

6.5 match 3.08 score 24 scripts

bioc

methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results

methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.

Maintained by Altuna Akalin. Last updated 17 days ago.

dnamethylation sequencing methylseq genome-biology methylation statistical-analysis visualization curl bzip2 xz-utils zlib cpp

1.7 match 220 stars 11.80 score 578 scripts 3 dependents

martynplummer

coda:Output Analysis and Diagnostics for MCMC

Provides functions for summarizing and plotting the output from Markov Chain Monte Carlo (MCMC) simulations, as well as diagnostic tests of convergence to the equilibrium distribution of the Markov chain.

Maintained by Martyn Plummer. Last updated 1 years ago.

1.8 match 6 stars 11.33 score 8.3k scripts 1.1k dependents

kkawato

rdlearn:Safe Policy Learning under Regression Discontinuity Design with Multiple Cutoffs

Implements safe policy learning under regression discontinuity designs with multiple cutoffs, based on Zhang et al. (2022) <doi:10.48550/arXiv.2208.13323>. The learned cutoffs are guaranteed to perform no worse than the existing cutoffs in terms of overall outcomes. The 'rdlearn' package also includes features for visualizing the learned cutoffs relative to the baseline and conducting sensitivity analyses.

Maintained by Kentaro Kawato. Last updated 24 days ago.

3.8 match 1 stars 5.26 score 4 scripts

bioc

NADfinder:Call wide peaks for sequencing data

The nucleolus is an important structure inside the nucleus of eukaryotic cells. It is the site for transcribing rDNA into rRNA and for assembling ribosomes, aka ribosome biogenesis. In addition, nucleoli are dynamic hubs through which numerous proteins shuttle and contact specific non-rDNA genomic loci. Deep sequencing analyses of DNA associated with isolated nucleoli (NAD-seq) have shown that specific loci, termed nucleolus-associated domains (NADs), form frequent three-dimensional associations with nucleoli. NAD-seq has been used to study the biological functions of NADs and the dynamics of NAD distribution during embryonic stem cell (ESC) differentiation. Here, we developed a Bioconductor package, NADfinder, for bioinformatic analysis of NAD-seq data, including baseline correction, smoothing, normalization, peak calling, and annotation.

Maintained by Jianhong Ou. Last updated 2 months ago.

sequencing dnaseq generegulation peakdetection

4.7 match 4.18 score 1 scripts

bioc

RNAmodR:Detection of post-transcriptional modifications in high throughput sequencing data

RNAmodR provides classes and workflows for loading and aggregating data from high-throughput sequencing aimed at detecting post-transcriptional modifications through analysis of specific patterns. In addition, utilities are provided to validate and visualize the results. The RNAmodR package provides core functionality from which specific analysis strategies can be easily implemented as a separate package.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

software infrastructure workflowstep visualization sequencing alkanilineseq bioconductor modifications ribomethseq rna rnamodr

3.0 match 3 stars 6.51 score 9 scripts 3 dependents

bioc

maSigPro:Significant Gene Expression Profile Differences in Time Course Gene Expression Data

maSigPro is a regression-based approach to find genes for which there are significant gene expression profile differences between experimental groups in time course microarray and RNA-Seq experiments.

Maintained by Maria Jose Nueda. Last updated 5 months ago.

microarray rna-seq differential expression timecourse

3.8 match 5.18 score 76 scripts

mightymetrika

scdtb:Single Case Design Tools

In some situations where researchers would like to demonstrate causal effects, it is hard to obtain a sample size that would allow for a well-powered randomized controlled trial. Single case designs are experimental designs that can be used to demonstrate causal effects with only one participant or with only a few participants. The 'scdtb' package provides a suite of tools for analyzing data from studies that use single case designs. The nap() function can be used to compute the nonoverlap of all pairs as outlined by the What Works Clearinghouse (2022) <https://ies.ed.gov/ncee/wwc/Handbooks>. The package also offers the mixed_model_analysis() and cross_lagged() functions which implement mixed effects models and cross lagged analyses as described in Maric & van der Werff (2020) <doi:10.4324/9780429273872-9>. The randomization_test() function implements randomization tests based on methods presented in Onghena (2020) <doi:10.4324/9780429273872-8>. The scdtb() 'shiny' application can be used to upload single case design data and access various 'scdtb' tools for plotting and analysis.

Maintained by Mackson Ncube. Last updated 6 months ago.

data math science statistics

5.2 match 3.74 score 5 scripts

adeverse

adespatial:Multivariate Multiscale Spatial Analysis

Tools for the multiscale spatial analysis of multivariate data. Several methods are based on the use of a spatial weighting matrix and its eigenvector decomposition (Moran's Eigenvector Maps, MEM). Several approaches are described in the review by Dray et al. (2012) <doi:10.1890/11-1183.1>.

Maintained by Aurélie Siberchicot. Last updated 13 days ago.

openblas

1.8 match 36 stars 11.06 score 398 scripts 2 dependents

cran

ILSAstats:Statistics for International Large-Scale Assessments (ILSA)

Calculates point estimates and standard errors using replicate weights and plausible values for International Large-Scale Assessments (ILSA), including means, proportions, quantiles, correlations, single-level regressions, and multilevel regressions.

Maintained by Andrés Christiansen. Last updated 24 days ago.

19.3 match 1.00 score

pbs-assess

sdmTMB:Spatial and Spatiotemporal SPDE-Based GLMMs with 'TMB'

Implements spatial and spatiotemporal GLMMs (Generalized Linear Mixed Effect Models) using 'TMB', 'fmesher', and the SPDE (Stochastic Partial Differential Equation) Gaussian Markov random field approximation to Gaussian random fields. One common application is for spatially explicit species distribution models (SDMs). See Anderson et al. (2024) <doi:10.1101/2022.03.24.485545>.

Maintained by Sean C. Anderson. Last updated 2 days ago.

ecology glmm spatial-analysis species-distribution-modelling tmb cpp

1.8 match 203 stars 10.71 score 848 scripts 1 dependents