Showing 45 of 45 results
business-science
anomalize:Tidy Anomaly Detection
The 'anomalize' package enables a "tidy" workflow for detecting anomalies in data. The main functions are time_decompose(), anomalize(), and time_recompose(). When combined, they make it simple to decompose time series, detect anomalies, and create bands separating the "normal" data from the anomalous data at scale (i.e. for multiple time series). Time series decomposition is used to remove trend and seasonal components via the time_decompose() function; methods include seasonal decomposition of time series by Loess ("stl") and seasonal decomposition by piecewise medians ("twitter"). The anomalize() function implements two methods for anomaly detection of residuals: the interquartile range ("iqr") and generalized extreme studentized deviation ("gesd"). These methods are based on those used in the 'forecast' package and the Twitter 'AnomalyDetection' package. Refer to the associated functions for specific references for these methods.
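As a rough illustration of the "iqr" idea mentioned in this description, the sketch below flags residuals that fall outside interquartile-range fences. It is written in Python with the standard library only and is not the package's implementation: the function name `iqr_outliers`, the multiplier `k = 1.5`, and the sample residuals are assumptions chosen for the example.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    The multiplier k=1.5 is the conventional Tukey choice and is an
    assumption here, not a setting taken from any package.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flags = [v < lower or v > upper for v in values]
    return lower, upper, flags


# Hypothetical residuals left over after removing trend and seasonality.
residuals = [0.1, -0.2, 0.0, 0.3, -0.1, 5.0, 0.2, -0.3]
lower, upper, flags = iqr_outliers(residuals)
anomalies = [v for v, is_out in zip(residuals, flags) if is_out]
```

The fences play the role of the bands separating "normal" from anomalous points; a larger `k` makes the rule more conservative.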
Maintained by Matt Dancho. Last updated 1 year ago.
anomaly anomaly-detection decomposition detect-anomalies iqr time-series
17.5 match 339 stars 9.56 score 332 scripts
projectmosaic
mosaic:Project MOSAIC Statistics and Mathematics Teaching Utilities
Data sets and utilities from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.
Maintained by Randall Pruim. Last updated 1 year ago.
6.6 match 93 stars 13.32 score 7.2k scripts 7 dependents
rapporter
rapportools:Miscellaneous (Stats) Helper Functions with Sane Defaults for Reporting
Helper functions that act as wrappers to more advanced statistical methods with the advantage of having sane defaults for quick reporting.
Maintained by Gergely Daróczi. Last updated 16 days ago.
8.9 match 8 stars 7.50 score 186 scripts 11 dependents
henrikbengtsson
matrixStats:Functions that Apply to Rows and Columns of Matrices (and to Vectors)
High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().
Maintained by Henrik Bengtsson. Last updated 2 months ago.
3.3 match 208 stars 18.09 score 20k scripts 2.3k dependents
mikejareds
hermiter:Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)
Facilitates estimation of full univariate and bivariate probability density functions and cumulative distribution functions along with full quantile functions (univariate) and nonparametric correlation (bivariate) using Hermite series based estimators. These estimators are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. Based on: Stephanou, Michael, Varughese, Melvin and Macdonald, Iain. "Sequential quantiles via Hermite series density estimation." Electronic Journal of Statistics 11.1 (2017): 570-607 <doi:10.1214/17-EJS1245>, Stephanou, Michael and Varughese, Melvin. "On the properties of Hermite series based distribution function estimators." Metrika (2020) <doi:10.1007/s00184-020-00785-z> and Stephanou, Michael and Varughese, Melvin. "Sequential estimation of Spearman rank correlation using Hermite series estimators." Journal of Multivariate Analysis (2021) <doi:10.1016/j.jmva.2021.104783>.
Maintained by Michael Stephanou. Last updated 7 months ago.
cumulative-distribution-function kendall-correlation-coefficient online-algorithms probability-density-function quantile spearman-correlation-coefficient statistics streaming-algorithms streaming-data cpp
8.9 match 15 stars 5.58 score 17 scripts
bioc
IRanges:Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructure datarepresentation bioconductor-package core-package
3.3 match 22 stars 15.09 score 2.1k scripts 1.8k dependents
bioc
BiocGenerics:S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructure bioconductor-package core-package
3.3 match 12 stars 14.22 score 612 scripts 2.2k dependents
olink-proteomics
OlinkAnalyze:Facilitate Analysis of Proteomic Data from Olink
A collection of functions to facilitate analysis of proteomic data from Olink, primarily NPX data that has been exported from Olink Software. The functions also work on QUANT data from Olink by log-transforming the QUANT data. The functions are focused on reading data, facilitating data wrangling and quality control analysis, performing statistical analysis and generating figures to visualize the results of the statistical analysis. The goal of this package is to help users extract biological insights from proteomic data run on the Olink platform.
Maintained by Kathleen Nevola. Last updated 20 days ago.
olink proteomics proteomics-data-analysis
4.6 match 104 stars 9.72 score 61 scripts
mayoverse
arsenal:An Arsenal of 'R' Functions for Large-Scale Statistical Summaries
An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.
Maintained by Ethan Heinzen. Last updated 7 months ago.
baseline-characteristics descriptive-statistics modeling paired-comparisons reporting statistics tableone
3.3 match 225 stars 13.45 score 1.2k scripts 16 dependents
dcousin3
superb:Summary Plots with Adjusted Error Bars
Computes standard errors and confidence intervals of various descriptive statistics under various designs and sampling schemes. The main function, superb(), returns a plot. It can also be used to obtain a dataframe with the statistics and their precision intervals so that other plotting environments (e.g., Excel) can be used. See Cousineau and colleagues (2021) <doi:10.1177/25152459211035109> or Cousineau (2017) <doi:10.5709/acp-0214-z> for a review as well as Cousineau (2005) <doi:10.20982/tqmp.01.1.p042>, Morey (2008) <doi:10.20982/tqmp.04.2.p061>, Baguley (2012) <doi:10.3758/s13428-011-0123-7>, Cousineau & Laurencelle (2016) <doi:10.1037/met0000055>, Cousineau & O'Brien (2014) <doi:10.3758/s13428-013-0441-z>, Calderini & Harding <doi:10.20982/tqmp.15.1.p001> for specific references.
Maintained by Denis Cousineau. Last updated 2 months ago.
error-bars plotting statistics summary-plots summary-statistics visualization
4.5 match 19 stars 9.55 score 155 scripts 2 dependents
alexkowa
EnvStats:Package for Environmental Statistics, Including US EPA Guidance
Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).
Maintained by Alexander Kowarik. Last updated 17 days ago.
3.3 match 26 stars 12.80 score 2.4k scripts 46 dependents
handcock
reldist:Relative Distribution Methods
Tools for the comparison of distributions. This includes nonparametric estimation of the relative distribution PDF and CDF and numerical summaries as described in "Relative Distribution Methods in the Social Sciences" by Mark S. Handcock and Martina Morris, Springer-Verlag, 1999, ISBN 0387987789.
Maintained by Mark S. Handcock. Last updated 2 years ago.
5.6 match 1 star 6.70 score 344 scripts 7 dependents
jmcurran
Bolstad:Functions for Elementary Bayesian Inference
A set of R functions and data sets for the book Introduction to Bayesian Statistics, Bolstad, W.M. (2017), John Wiley & Sons ISBN 978-1-118-09156-2.
Maintained by James Curran. Last updated 5 months ago.
7.8 match 4.08 score 93 scripts
mayer79
confintr:Confidence Intervals
Calculates classic and/or bootstrap confidence intervals for many parameters such as the population mean, variance, interquartile range (IQR), median absolute deviation (MAD), skewness, kurtosis, Cramer's V, odds ratio, R-squared, quantiles (incl. median), proportions, different types of correlation measures, difference in means, quantiles and medians. Many of the classic confidence intervals are described in Smithson, M. (2003, ISBN: 978-0761924999). Bootstrap confidence intervals are calculated with the R package 'boot'. Both one- and two-sided intervals are supported.
Maintained by Michael Mayer. Last updated 8 months ago.
bootstrap confidence-intervals statistical-inference statistics
2.8 match 15 stars 8.50 score 104 scripts 16 dependents
bioc
LPE:Methods for analyzing microarray data using Local Pooled Error (LPE) method
This LPE library is used to do significance analysis of microarray data with a small number of replicates. It uses resampling-based FDR adjustment and gives less conservative results than the traditional 'BH' or 'BY' procedures. Accepted input is raw data in txt format from MAS4, MAS5 or dChip; data can also be supplied after normalization. The LPE library is primarily used for analyzing data between two conditions. To use it for paired data, see the LPEP library. For using LPE in multiple conditions, use the HEM library.
Maintained by Nitin Jain. Last updated 5 months ago.
microarray differentialexpression
5.0 match 4.58 score 21 scripts 1 dependent
desctable
desctable:Produce Descriptive and Comparative Tables Easily
Easily create descriptive and comparative tables. It makes use of, and integrates directly with, the tidyverse family of packages and pipes. Tables are produced as (nested) dataframes for easy manipulation.
Maintained by Maxime Wack. Last updated 3 years ago.
3.3 match 52 stars 6.85 score 45 scripts
r-forge
distrEx:Extensions of Package 'distr'
Extends package 'distr' by functionals, distances, and conditional distributions.
Maintained by Matthias Kohl. Last updated 2 months ago.
3.3 match 6.68 score 107 scripts 17 dependents
bioc
BumpyMatrix:Bumpy Matrix of Non-Scalar Objects
Implements the BumpyMatrix class and several subclasses for holding non-scalar objects in each entry of the matrix. This is akin to a ragged array but the raggedness is in the third dimension, much like a bumpy surface - hence the name. Of particular interest is the BumpyDataFrameMatrix, where each entry is a Bioconductor data frame. This allows us to naturally represent multivariate data in a format that is compatible with two-dimensional containers like the SummarizedExperiment and MultiAssayExperiment objects.
Maintained by Aaron Lun. Last updated 3 months ago.
software infrastructure datarepresentation
3.3 match 1 star 6.59 score 39 scripts 11 dependents
tobiasschoch
robsurvey:Robust Survey Statistics Estimation
Robust (outlier-resistant) estimators of finite population characteristics such as means, totals, ratios, regression, etc. Available methods are M- and GM-estimators of regression, weight reduction, trimming, and winsorization. The package extends the 'survey' <https://CRAN.R-project.org/package=survey> package.
Maintained by Tobias Schoch. Last updated 3 months ago.
3.0 match 9 stars 6.16 score 5 scripts
irinagain
iglu:Interpreting Glucose Data from Continuous Glucose Monitors
Implements a wide range of metrics for measuring glucose control and glucose variability based on continuous glucose monitoring data. The list of implemented metrics is summarized in Rodbard (2009) <doi:10.1089/dia.2009.0015>. Additional visualization tools include time-series plots, lasagna plots and ambulatory glucose profile report.
Maintained by Irina Gaynanova. Last updated 10 days ago.
1.9 match 26 stars 9.00 score 39 scripts
stamats
MKdescr:Descriptive Statistics
Computation of standardized interquartile range (IQR), Huber-type skipped mean (Hampel (1985), <doi:10.2307/1268758>), robust coefficient of variation (CV) (Arachchige et al. (2019), <arXiv:1907.01110>), robust signal to noise ratio (SNR), z-score, standardized mean difference (SMD), as well as functions that support graphical visualization such as boxplots based on quartiles (not hinges), negative logarithms and generalized logarithms for 'ggplot2' (Wickham (2016), ISBN:978-3-319-24277-4).
Maintained by Matthias Kohl. Last updated 1 year ago.
2.7 match 3 stars 6.02 score 47 scripts 5 dependents
stamats
MKmisc:Miscellaneous Functions from M. Kohl
Contains several functions for statistical data analysis; e.g. for sample size and power calculations, computation of confidence intervals and tests, and generation of similarity matrices.
Maintained by Matthias Kohl. Last updated 2 years ago.
2.2 match 11 stars 7.40 score 129 scripts 1 dependent
jwiley
JWileymisc:Miscellaneous Utilities and Functions
Miscellaneous tools and functions, including: generate descriptive statistics tables, format output, visualize relations among variables or check distributions, and generic functions for residual and model diagnostics.
Maintained by Joshua F. Wiley. Last updated 2 months ago.
1.8 match 6 stars 7.42 score 241 scripts 4 dependents
kriper0217
valmetrics:Metrics and Plots for Model Evaluation
Functions for metrics and plots for model evaluation. Based on vectors of observed and predicted values. Method: Kristin Piikki, Johanna Wetterlind, Mats Soderstrom and Bo Stenberg (2021). <doi:10.1111/SUM.12694>.
Maintained by Kristin Piikki. Last updated 4 years ago.
6.6 match 2.00 score 2 scripts
dzmitrygb
Repliscope:Replication Timing Profiling using DNA Copy Number
Create, Plot and Compare Replication Timing Profiles. The method is described in Muller et al. (2014) <doi:10.1093/nar/gkt878>.
Maintained by Dzmitry G Batrakou. Last updated 3 years ago.
3.9 match 3.13 score 27 scripts
r-forge
RobExtremes:Optimally Robust Estimation for Extreme Value Distributions
Optimally robust estimation for extreme value distributions using S4 classes and methods (based on packages 'distr', 'distrEx', 'distrMod', 'RobAStBase', and 'ROptEst'); the underlying theoretical results can be found in Ruckdeschel and Horbenko (2013 and 2012), <doi:10.1080/02331888.2011.628022> and <doi:10.1007/s00184-011-0366-4>.
Maintained by Peter Ruckdeschel. Last updated 2 months ago.
3.3 match 3.67 score 39 scripts
lawremi
rsolr:R to Solr Interface
A comprehensive R API for querying Apache Solr databases. A Solr core is represented as a data frame or list that supports Solr-side filtering, sorting, transformation and aggregation, all through the familiar base R API. Queries are processed lazily, i.e., a query is only sent to the database when the data are required.
Maintained by Michael Lawrence. Last updated 3 years ago.
3.3 match 9 stars 3.65 score 6 scripts
nutriverse
nipnTK:National Information Platforms for Nutrition Anthropometric Data Toolkit
An implementation of the National Information Platforms for Nutrition or NiPN's analytic methods for assessing quality of anthropometric datasets that include measurements of weight, height or length, middle upper arm circumference, sex and age. The focus is on anthropometric status but many of the presented methods could be applied to other variables.
Maintained by Ernest Guevarra. Last updated 5 months ago.
anthropometry data-quality nipn nutrition
1.9 match 5 stars 5.92 score 28 scripts 1 dependent
agnesevardanega
LabRS:Laboratorio di Ricerca Sociale con R
Library of data, scripts and functions accompanying the book "Ricerca sociale con R. Concetti e funzioni base per la ricerca sociale".
Maintained by Agnese Vardanega. Last updated 6 years ago.
3.3 match 1 star 3.02 score 21 scripts
certe-medical-epidemiology
certestats:A Certe R Package for Statistical Modelling
A Certe R Package for early-warning, applying statistical modelling (such as creating machine learning models), QC rules and distribution analysis. This package is part of the 'certedata' universe.
Maintained by Matthijs S. Berends. Last updated 4 months ago.
3.3 match 3.02 score 1 script 1 dependent
bioc
spqn:Spatial quantile normalization
The spqn package implements spatial quantile normalization (SpQN). This method was developed to remove a mean-correlation relationship in correlation matrices built from gene expression data. It can serve as a pre-processing step prior to a co-expression analysis.
Maintained by Yi Wang. Last updated 5 months ago.
networkinference graphandnetwork normalization
1.8 match 5 stars 5.04 score 22 scripts
abusjahn
wrappedtools:Useful Wrappers Around Commonly Used Functions
The main functionalities of 'wrappedtools' are: adding backticks to variable names; rounding to desired precision with special case for p-values; selecting columns based on pattern and storing their position, name, and backticked name; computing and formatting of descriptive statistics (e.g. mean±SD), comparing groups and creating publication-ready tables with descriptive statistics and p-values; creating specialized plots for correlation matrices. Functions were mainly written for my own daily work or teaching, but may be of use to others as well.
Maintained by Andreas Busjahn. Last updated 5 months ago.
descriptive-statistics test-statistic
1.9 match 2 stars 4.70 score 8 scripts
qtalr
qtkit:Quantitative Text Kit
Support package for the textbook "An Introduction to Quantitative Text Analysis for Linguists: Reproducible Research Using R" (Francom, 2024) <doi:10.4324/9781003393764>. Includes functions to acquire, clean, and analyze text data as well as functions to document and share the results of text analysis. The package is designed to be used in conjunction with the book, but can also be used as a standalone package for text analysis.
Maintained by Jerid Francom. Last updated 2 months ago.
1.8 match 5.03 score 12 scripts
aravind-j
EvaluateCore:Quality Evaluation of Core Collections
Implements various quality evaluation statistics to assess the value of plant germplasm core collections using qualitative and quantitative phenotypic trait data according to Odong et al. (2015) <doi:10.1007/s00122-012-1971-y>.
Maintained by J. Aravind. Last updated 6 days ago.
core-collections core-evaluation genebank germplasm pgr plant-genetic-resources
2.0 match 1 star 3.80 score 21 scripts
thomasgstewart
tangram.pipe:Row-by-Row Table Building
Builds tables with customizable rows. Users can specify the type of data to use for each row, as well as how to handle missing data and the types of comparison tests to run on the table columns.
Maintained by Andrew Guide. Last updated 3 years ago.
1.8 match 1 star 3.60 score 1 script
cran
SSDforR:Functions to Analyze Single System Data
Functions to visually and statistically analyze single system data.
Maintained by Charles Auerbach. Last updated 3 months ago.
4.3 match 1.48 score
bioc
cytoMEM:Marker Enrichment Modeling (MEM)
MEM, Marker Enrichment Modeling, automatically generates and displays quantitative labels for cell populations that have been identified from single-cell data. The input for MEM is a dataset that has pre-clustered or pre-gated populations with cells in rows and features in columns. Labels convey a list of measured features and the features' levels of relative enrichment on each population. MEM can be applied to a wide variety of data types and can compare between MEM labels from flow cytometry, mass cytometry, single cell RNA-seq, and spectral flow cytometry using RMSD.
Maintained by Jonathan Irish. Last updated 5 months ago.
proteomics systemsbiology classification flowcytometry datarepresentation dataimport cellbiology singlecell clustering
1.5 match 4.18 score 15 scripts
cran
qrcm:Quantile Regression Coefficients Modeling
Parametric modeling of quantile regression coefficient functions.
Maintained by Paolo Frumento. Last updated 1 year ago.
3.3 match 1.78 score 2 dependents
cran
ufs:A Collection of Utilities
This is a new version of the 'userfriendlyscience' package, which has grown a bit unwieldy. Therefore, distinct functionalities are being 'consciously uncoupled' into different packages. This package contains the general-purpose tools and utilities (see the 'behaviorchange' package, the 'rosetta' package, and the soon-to-be-released 'scd' package for other functionality), and is the most direct 'successor' of the original 'userfriendlyscience' package. For example, this package contains a number of basic functions to create higher level plots, such as diamond plots, to easily plot sampling distributions, to generate confidence intervals, to plan study sample sizes for confidence intervals, and to do some basic operations such as (dis)attenuate effect size estimates.
Maintained by Gjalt-Jorn Peters. Last updated 1 year ago.
1.8 match 2.95 score 3 dependents
bioc
ffpe:Quality assessment and control for FFPE microarray expression data
Identify low-quality data using metrics developed for expression data derived from Formalin-Fixed, Paraffin-Embedded (FFPE) data. Also a function for making Concordance at the Top plots (CAT-plots).
Maintained by Levi Waldron. Last updated 5 months ago.
microarray geneexpression qualitycontrol
1.7 match 3.03 score 27 scripts
cran
Rrepest:An Analyzer of International Large Scale Assessments in Education
An easy way to analyze international large-scale assessments and surveys in education or any other dataset that includes replicated weights (Balanced Repeated Replication (BRR) weights, Jackknife replicate weights,...) while also allowing for analysis with multiply imputed variables (plausible values). It supports the estimation of univariate statistics (e.g. mean, variance, standard deviation, quantiles), frequencies, correlation, linear regression and any other model already implemented in R that takes a data frame and weights as parameters. It also includes options to prepare the results for publication, following the table formatting standards of the Organization for Economic Cooperation and Development (OECD).
Maintained by Rodolfo Ilizaliturri. Last updated 25 days ago.
2.3 match 1 star 2.00 score 2 scripts
cran
etable:Easy Table
Creates simple to highly customized tables for a wide selection of descriptive statistics, with or without weighting the data.
Maintained by Andreas Schulz. Last updated 4 years ago.
1.9 match 2.00 score
cran
gds:Descriptive Statistics of Grouped Data
Contains a function called gds() which accepts three input parameters: the lower limits, the upper limits, and the frequencies of the corresponding classes. The gds() function calculates and returns the values of the mean ('gmean'), median ('gmedian'), mode ('gmode'), variance ('gvar'), standard deviation ('gstdev'), coefficient of variation ('gcv'), quartiles ('gq1', 'gq2', 'gq3'), inter-quartile range ('gIQR'), skewness ('g1'), and kurtosis ('g2'), which facilitate effective data analysis. Skewness and kurtosis are calculated from moments.
Maintained by Partha Sarathi Bishnu. Last updated 4 years ago.
1.6 match 1.00 score
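To make the grouped-data statistics named in the gds entry more concrete, here is a minimal Python sketch of two of them, using the usual textbook formulas for class-interval data (midpoint-weighted mean and linear interpolation within the median class). It is not the gds() implementation; the function names `grouped_mean` and `grouped_median` and the example class limits and frequencies are assumptions made for illustration.

```python
def grouped_mean(lower, upper, freq):
    """Mean of grouped data: frequency-weighted average of class midpoints."""
    total = sum(freq)
    if total == 0:
        raise ValueError("frequencies must sum to a positive number")
    mids = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    return sum(f * m for f, m in zip(freq, mids)) / total


def grouped_median(lower, upper, freq):
    """Median of grouped data by linear interpolation in the median class."""
    total = sum(freq)
    if total == 0:
        raise ValueError("frequencies must sum to a positive number")
    half = total / 2
    cumulative = 0  # frequency accumulated before the current class
    for lo, hi, f in zip(lower, upper, freq):
        if cumulative + f >= half:
            return lo + (half - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("median class not found")


# Hypothetical class intervals [0,10), [10,20), [20,30), [30,40).
lower = [0, 10, 20, 30]
upper = [10, 20, 30, 40]
freq = [5, 8, 12, 5]

mean_value = grouped_mean(lower, upper, freq)
median_value = grouped_median(lower, upper, freq)
```

The same three inputs (lower limits, upper limits, frequencies) are what the description says gds() accepts; quartiles and the inter-quartile range follow the same interpolation idea with N/4 and 3N/4 in place of N/2.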