Showing 150 of 150 results
nutriverse
zscorer:Child Anthropometry z-Score Calculator
A tool for calculating z-scores and centiles for weight-for-age, length/height-for-age, weight-for-length/height, BMI-for-age, head circumference-for-age, arm circumference-for-age, subscapular skinfold-for-age, triceps skinfold-for-age based on the WHO Child Growth Standards.
Maintained by Ernest Guevarra. Last updated 4 years ago.
anthropometric-indicesanthropometrygrowth-chartsgrowth-standardsheight-for-agenutritionweight-for-ageweight-for-heightz-score
64.9 match 14 stars 7.30 score 47 scripts 1 dependents
d-score
dscore:D-Score for Child Development
The D-score summarizes the child's performance on a set of milestones into a single number. The package implements four Rasch model keys to convert milestone scores into a D-score. It provides tools to calculate the D-score and its precision from the child's milestone scores, to convert the D-score into the Development-for-Age Z-score (DAZ) using age-conditional references, and to map milestone names into a generic 9-position item naming convention.
Maintained by Stef van Buuren. Last updated 7 months ago.
child-developmentd-scoredazdevelopmental-trajectoriesgrowth-chartsrasch-modelcpp
53.4 match 8 stars 6.89 score 40 scripts
fbartos
zcurve:An Implementation of Z-Curves
An implementation of z-curves, a method for estimating expected discovery and replicability rates on the basis of test statistics of published studies. The package provides functions for fitting the new density and EM version (Bartoš & Schimmack, 2020, <doi:10.31234/osf.io/urgtn>), censored observations, as well as the original density z-curve (Brunner & Schimmack, 2020, <doi:10.15626/MP.2018.874>). Furthermore, the package provides summarizing and plotting functions for the fitted z-curve objects. See the aforementioned articles for more information about the z-curves, expected discovery and replicability rates, validation studies, and limitations.
Maintained by František Bartoš. Last updated 10 months ago.
49.2 match 12 stars 5.48 score 21 scripts 1 dependents
wlenhard
cNORM:Continuous Norming
A comprehensive toolkit for generating continuous test norms in psychometrics and biometrics, and analyzing model fit. The package offers both distribution-free modeling using Taylor polynomials and parametric modeling using the beta-binomial distribution. Originally developed for achievement tests, it is applicable to a wide range of mental, physical, or other test scores dependent on continuous or discrete explanatory variables. The package provides several advantages: It minimizes deviations from representativeness in subsamples, interpolates between discrete levels of explanatory variables, and significantly reduces the required sample size compared to conventional norming per age group. cNORM enables graphical and analytical evaluation of model fit, accommodates a wide range of scales including those with negative and descending values, and even supports conventional norming. It generates norm tables including confidence intervals. It also includes methods for addressing representativeness issues through Iterative Proportional Fitting.
Maintained by Wolfgang Lenhard. Last updated 4 months ago.
beta-binomialbiometricscontinuous-norminggrowth-curvenorm-scoresnorm-tablesnormalization-techniquespercentilepsychometricsregression-based-normingtaylor-series
34.1 match 2 stars 5.49 score 75 scripts
bioc
COCOA:Coordinate Covariation Analysis
COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.
Maintained by John Lawson. Last updated 5 months ago.
epigeneticsdnamethylationatacseqdnaseseqmethylseqmethylationarrayprincipalcomponentgenomicvariationgeneregulationgenomeannotationsystemsbiologyfunctionalgenomicschipseqsequencingimmunooncologydna-methylationpca
25.8 match 10 stars 7.02 score 21 scripts
annahutch
corrcoverage:Correcting the Coverage of Credible Sets from Bayesian Genetic Fine Mapping
Using a computationally efficient method, the package can be used to find the corrected coverage estimate of a credible set of putative causal variants from Bayesian genetic fine-mapping. The package can also be used to obtain a corrected credible set if required; that is, the smallest set of variants required such that the corrected coverage estimate of the resultant credible set is within some user defined accuracy of the desired coverage. Maller et al. (2012) <doi:10.1038/ng.2435>, Wakefield (2009) <doi:10.1002/gepi.20359>, Fortune and Wallace (2018) <doi:10.1093/bioinformatics/bty898>.
Maintained by Anna Hutchinson. Last updated 3 years ago.
46.8 match 6 stars 3.68 score 16 scripts
thlytras
rspiro:Implementation of Spirometry Equations
Implementation of various spirometry equations in R, currently the GLI-2012 (Global Lung Initiative; Quanjer et al. 2012 <doi:10.1183/09031936.00080312>), the race-neutral GLI global 2022 (Global Lung Initiative; Bowerman et al. 2023 <doi:10.1164/rccm.202205-0963OC>), the NHANES3 (National Health and Nutrition Examination Survey; Hankinson et al. 1999 <doi:10.1164/ajrccm.159.1.9712108>) and the JRS 2014 (Japanese Respiratory Society; Kubota et al. 2014 <doi:10.1016/j.resinv.2014.03.003>) equations. Also the GLI-2017 diffusing capacity equations <doi:10.1183/13993003.00010-2017> are implemented. Contains user-friendly functions to calculate predicted and LLN (Lower Limit of Normal) values for different spirometric parameters such as FEV1 (Forced Expiratory Volume in 1 second), FVC (Forced Vital Capacity), etc, and to convert absolute spirometry measurements to percent (%) predicted and z-scores.
Maintained by Theodore Lytras. Last updated 9 months ago.
30.7 match 15 stars 5.10 score 28 scripts
bioc
easier:Estimate Systems Immune Response from RNA-seq data
This package provides a workflow for the use of the EaSIeR tool, developed to assess patients' likelihood of responding to ICB therapies using only the patients' RNA-seq data as input. We integrate RNA-seq data with different types of prior knowledge to extract quantitative descriptors of the tumor microenvironment from several points of view, including composition of the immune repertoire, and activity of intra- and extra-cellular communications. Then, we use multi-task machine learning trained on TCGA data to identify how these descriptors can simultaneously predict several state-of-the-art hallmarks of anti-cancer immune response. In this way we derive cancer-specific models and identify cancer-specific systems biomarkers of immune response. These biomarkers have been experimentally validated in the literature and the performance of EaSIeR predictions has been validated using independent datasets from four different cancer types with patients treated with anti-PD1 or anti-PDL1 therapy.
Maintained by Oscar Lapuente-Santana. Last updated 5 months ago.
geneexpressionsoftwaretranscriptionsystemsbiologypathwaysgenesetenrichmentimmunooncologyepigeneticsclassificationbiomedicalinformaticsregressionexperimenthubsoftware
36.1 match 4.20 score 16 scripts
bioc
mdp:Molecular Degree of Perturbation calculates scores for transcriptome data samples based on their perturbation from controls
The Molecular Degree of Perturbation webtool quantifies the heterogeneity of samples. It takes a data.frame of omic data that contains at least two classes (control and test) and assigns a score to all samples based on how perturbed they are compared to the controls. It is based on the Molecular Distance to Health (Pankla et al. 2009), and expands on this algorithm by adding the options to calculate the z-score using the modified z-score (using median absolute deviation), change the z-score zeroing threshold, and look at genes that are most perturbed in the test versus control classes.
Maintained by Helder Nakaya. Last updated 5 months ago.
biomedicalinformaticsqualitycontroltranscriptomicssystemsbiologymicroarray
31.5 match 4.65 score 15 scripts
laresbernardo
lares:Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of function families, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scraper, it helps the analyst or data scientist to get quick and robust results, without the need for repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 24 days ago.
analyticsapiautomationautomldata-sciencedescriptive-statisticsh2omachine-learningmarketingmmmpredictive-modelingpuzzlerlanguagerobynvisualization
12.8 match 233 stars 9.84 score 185 scripts 1 dependents
asa12138
ReporterScore:Generalized Reporter Score-Based Enrichment Analysis for Omics Data
Inspired by the classic 'RSA', we developed the improved 'Generalized Reporter Score-based Analysis (GRSA)' method, implemented in the R package 'ReporterScore', along with comprehensive visualization methods and pathway databases. 'GRSA' is a threshold-free method that works well with all types of biomedical features, such as genes, chemical compounds, and microbial species. Importantly, the 'GRSA' supports multi-group and longitudinal experimental designs, because of the included multi-group-compatible statistical methods.
Maintained by Chen Peng. Last updated 2 months ago.
17.7 match 67 stars 6.79 score 13 scripts
petolau
TSrepr:Time Series Representations
Methods for representations (i.e. dimensionality reduction, preprocessing, feature extraction) of time series to support more accurate and effective time series data mining. Non-data adaptive, data adaptive, model-based and data dictated (clipped) representation methods are implemented. Various normalisation methods (min-max, z-score, Box-Cox, Yeo-Johnson) and forecasting accuracy measures are also implemented.
Maintained by Peter Laurinec. Last updated 5 years ago.
data-analysisdata-miningdata-mining-algorithmsdata-sciencerepresentationtime-seriestime-series-analysistime-series-classificationtime-series-clusteringtime-series-data-miningtime-series-representationscpp
16.5 match 97 stars 7.23 score 117 scripts
gamlss-dev
gamlss:Generalized Additive Models for Location Scale and Shape
Functions for fitting the Generalized Additive Models for Location Scale and Shape introduced by Rigby and Stasinopoulos (2005), <doi:10.1111/j.1467-9876.2005.00510.x>. The models use a distributional regression approach where all the parameters of the conditional distribution of the response variable are modelled using explanatory variables.
Maintained by Mikis Stasinopoulos. Last updated 4 months ago.
10.5 match 16 stars 11.23 score 2.0k scripts 49 dependents
ropensci
gigs:Assess Fetal, Newborn, and Child Growth with International Standards
Convert between anthropometric measures and z-scores/centiles in multiple growth standards, and classify fetal, newborn, and child growth accordingly. With a simple interface to growth standards from the World Health Organisation and International Fetal and Newborn Growth Consortium for the 21st Century, gigs makes growth assessment easy and reproducible for clinicians, researchers and policy-makers.
Maintained by Simon R Parker. Last updated 25 days ago.
anthropometrygrowth-standardsintergrowthwho
26.7 match 4 stars 4.38 score 8 scripts
bioc
bioassayR:Cross-target analysis of small molecule bioactivity
bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.
Maintained by Thomas Girke. Last updated 5 months ago.
immunooncologymicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportbioinformaticsproteomicsmetabolomics
16.3 match 5 stars 6.70 score 46 scripts
bioc
INTACT:Integrate TWAS and Colocalization Analysis for Gene Set Enrichment Analysis
This package integrates colocalization probabilities from colocalization analysis with transcriptome-wide association study (TWAS) scan summary statistics to implicate genes that may be biologically relevant to a complex trait. The probabilistic framework implemented in this package constrains the TWAS scan z-score-based likelihood using a gene-level colocalization probability. Given gene set annotations, this package can estimate gene set enrichment using posterior probabilities from the TWAS-colocalization integration step.
Maintained by Jeffrey Okamoto. Last updated 5 months ago.
19.6 match 15 stars 5.47 score 13 scripts
bioc
PWMEnrich:PWM enrichment analysis
A toolkit of high-level functions for DNA motif scanning and enrichment analysis built upon Biostrings. The main functionality is PWM enrichment analysis of already known PWMs (e.g. from databases such as MotifDb), but the package also implements high-level functions for PWM scanning and visualisation. The package does not perform "de novo" motif discovery, but is instead focused on using motifs that are either experimentally derived or computationally constructed by other tools.
Maintained by Diego Diez. Last updated 5 months ago.
motifannotationsequencematchingsoftware
20.8 match 5.08 score 60 scripts
bioc
CatsCradle:This package provides methods for analysing spatial transcriptomics data and for discovering gene clusters
This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. 'CatsCradle' allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.
Maintained by Michael Shapiro. Last updated 1 month ago.
biologicalquestionstatisticalmethodgeneexpressionsinglecelltranscriptomicsspatial
16.1 match 3 stars 6.50 score
bluefoxr
COINr:Composite Indicator Construction and Analysis
A comprehensive high-level package for composite indicator construction and analysis. It is a "development environment" for composite indicators and scoreboards, which includes utilities for construction (indicator selection, denomination, imputation, data treatment, normalisation, weighting and aggregation) and analysis (multivariate analysis, correlation plotting, short cuts for principal component analysis, global sensitivity analysis, and more). A composite indicator is completely encapsulated inside a single hierarchical list called a "coin". This allows a fast and efficient work flow, as well as making quick copies, testing methodological variations and making comparisons. It also includes many plotting options, both statistical (scatter plots, distribution plots) and for presenting results.
Maintained by William Becker. Last updated 2 months ago.
11.1 match 26 stars 9.07 score 73 scripts 1 dependents
bioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 9 days ago.
networkinferencenetworkgeneregulationgeneexpressiontranscriptionmicroarraygraphandnetworkgene-regulatory-networktranscription-factors
12.6 match 105 stars 7.98 score
biometry
bipartite:Visualising Bipartite Networks and Calculating Some (Ecological) Indices
Functions to visualise webs and calculate a series of indices commonly used to describe patterns in (ecological) webs. It focuses on webs consisting of only two levels (bipartite), e.g. pollination webs or predator-prey webs. Visualisation is important to get an idea of what we are actually looking at, while the indices summarise different aspects of the web's topology.
Maintained by Carsten F. Dormann. Last updated 7 days ago.
9.0 match 37 stars 10.93 score 592 scripts 15 dependents
nutriverse
mwana:An Efficient Workflow for Plausibility Checks and Prevalence Analysis of Wasting in R
A simple and streamlined workflow for plausibility checks and prevalence analysis of wasting based on the Standardized Monitoring and Assessment of Relief and Transition (SMART) Methodology <https://smartmethodology.org/>, with application in R.
Maintained by Tomás Zaba. Last updated 1 month ago.
acute-malnutritionanthropometrymuacnutritionsmartsurveywasting
23.1 match 2 stars 4.23 score 6 scripts
bioc
sparrow:Take command of set enrichment analyses through a unified interface
Provides a unified interface to a variety of GSEA techniques from different bioconductor packages. Results are harmonized into a single object and can be interrogated uniformly for quick exploration and interpretation of results. Interactive exploration of GSEA results is enabled through a shiny app provided by a sparrow.shiny sibling package.
Maintained by Steve Lianoglou. Last updated 3 months ago.
genesetenrichmentpathwaysbioinformaticsgsea
14.7 match 21 stars 6.58 score 13 scripts
anttsou
qmj:Quality Scores for the Russell 3000
Produces quality scores for each of the US companies from the Russell 3000, following the approach described in "Quality Minus Junk" (Asness, Frazzini, & Pedersen, 2013) <http://www.aqr.com/library/working-papers/quality-minus-junk>. The package includes datasets for users who wish to view the most recently uploaded quality scores. It also provides tools to automatically gather relevant financials and stock price information, allowing users to update their data and customize their universe for further analysis.
Maintained by Yanrong Song. Last updated 26 days ago.
23.6 match 9 stars 4.03 score 2 scripts
statist7
sitar:Super Imposition by Translation and Rotation Growth Curve Analysis
Functions for fitting and plotting SITAR (Super Imposition by Translation And Rotation) growth curve models. SITAR is a shape-invariant model with a regression B-spline mean curve and subject-specific random effects on both the measurement and age scales. The model was first described by Lindstrom (1995) <doi:10.1002/sim.4780141807> and developed as the SITAR method by Cole et al (2010) <doi:10.1093/ije/dyq115>.
Maintained by Tim Cole. Last updated 2 months ago.
10.7 match 13 stars 8.69 score 58 scripts 3 dependents
mlcollyer
RRPP:Linear Model Evaluation with Randomized Residuals in a Permutation Procedure
Linear model calculations are made for many random versions of data. Using residual randomization in a permutation procedure, sums of squares are calculated over many permutations to generate empirical probability distributions for evaluating model effects. Additionally, coefficients, statistics, fitted values, and residuals generated over many permutations can be used for various procedures including pairwise tests, prediction, classification, and model comparison. This package should provide most tools one could need for the analysis of high-dimensional data, especially in ecology and evolutionary biology, but certainly other fields, as well.
Maintained by Michael Collyer. Last updated 26 days ago.
9.1 match 4 stars 9.84 score 173 scripts 7 dependents
stefvanbuuren
AGD:Analysis of Growth Data
Tools for the analysis of growth data: to extract an LMS table from a gamlss object, to calculate standard deviation scores and their inverse, and to superpose two wormplots from different models. The package contains several varieties of reference tables, especially for the Netherlands.
Maintained by Stef van Buuren. Last updated 11 months ago.
anthropometrycdcdutchgrowthgrowth-chartslmswhoz-score
19.0 match 1 stars 4.38 score 48 scripts
bioc
regioneR:Association analysis of genomic regions based on permutation tests
regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other genomic features.
Maintained by Bernat Gel. Last updated 5 months ago.
geneticschipseqdnaseqmethylseqcopynumbervariation
8.8 match 9.00 score 2.7k scripts 21 dependents
ropensci
assertr:Assertive Programming for R Analysis Pipelines
Provides functionality to assert conditions that have to be met so that errors in data used in analysis pipelines can fail quickly. Similar to 'stopifnot()' but more powerful, friendly, and easier for use in pipelines.
Maintained by Tony Fischetti. Last updated 11 months ago.
analysis-pipelineassertion-libraryassertion-methodsassertionspeer-reviewedpredicate-functions
6.8 match 478 stars 11.39 score 452 scripts 12 dependents
bioc
singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data
The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.
Maintained by Joshua David Campbell. Last updated 24 days ago.
singlecellgeneexpressiondifferentialexpressionalignmentclusteringimmunooncologybatcheffectnormalizationqualitycontroldataimportgui
7.4 match 181 stars 10.16 score 252 scripts
p-mq
bodycompref:Reference Values for CT-Assessed Body Composition
Get z-scores, percentiles, absolute values, and percent of predicted of a reference cohort. Functionality requires installing the data packages 'adiposerefdata' and 'musclerefdata'. For more information on the underlying research, please visit our website which also includes a graphical interface. The models and underlying data are described in Marquardt JP et al. (planned publication 2025; reserved doi 10.1097/RLI.0000000000001104), "Subcutaneous and Visceral adipose tissue Reference Values from Framingham Heart Study Thoracic and Abdominal CT", *Investigative Radiology* and Tonnesen PE et al. (2023), "Muscle Reference Values from Thoracic and Abdominal CT for Sarcopenia Assessment: The Framingham Heart Study", *Investigative Radiology*, <doi:10.1097/RLI.0000000000001012>.
Maintained by J. Peter Marquardt. Last updated 8 months ago.
14.6 match 4.54 score 2 scripts
carriedaymont
growthcleanr:Data Cleaner for Anthropometric Measurements
Identifies implausible anthropometric (e.g., height, weight) measurements in irregularly spaced longitudinal datasets, such as those from electronic health records.
Maintained by Carrie Daymont. Last updated 16 days ago.
9.8 match 14 stars 6.68 score 41 scripts 1 dependents
bioc
genArise:Microarray Analysis tool
genArise is an easy-to-use tool for dual color microarray data. Its GUI-Tk based environment lets any inexperienced user perform a basic, but not simple, data analysis just by following a wizard. In addition, it provides some tools for the developer.
Maintained by IFC Development Team. Last updated 5 months ago.
microarraytwochannelpreprocessing
14.7 match 4.30 score 1 scripts
ljohansson
NIPTeR:Fast and Accurate Trisomy Prediction in Non-Invasive Prenatal Testing
Fast and Accurate Trisomy Prediction in Non-Invasive Prenatal Testing.
Maintained by Lennart Johansson. Last updated 6 years ago.
14.7 match 4.30 score 4 scripts
atsa-es
MARSS:Multivariate Autoregressive State-Space Modeling
The MARSS package provides maximum-likelihood parameter estimation for constrained and unconstrained linear multivariate autoregressive state-space (MARSS) models, including partially deterministic models. MARSS models are a class of dynamic linear model (DLM) and vector autoregressive (VAR) model. Fitting is available via Expectation-Maximization (EM), BFGS (using optim), and 'TMB' (using the 'marssTMB' companion package). Functions are provided for parametric and innovations bootstrapping, Kalman filtering and smoothing, model selection criteria including bootstrap AICb, confidence intervals via the Hessian approximation or bootstrapping, and all conditional residual types. See the user guide for examples of dynamic factor analysis, dynamic linear models, outlier and shock detection, and multivariate AR-p models. Online workshops (lectures, eBook, and computer labs) at <https://atsa-es.github.io/>.
Maintained by Elizabeth Eli Holmes. Last updated 1 year ago.
multivariate-timeseriesstate-space-modelsstatisticstime-series
6.0 match 52 stars 10.34 score 596 scripts 3 dependents
pboutros
OmicsQC:Nominating Quality Control Outliers in Genomic Profiling Studies
A method that analyzes quality control metrics from multi-sample genomic sequencing studies and nominates poor quality samples for exclusion. Per-sample quality control data are transformed into z-scores and aggregated. The distribution of aggregated z-scores is modelled using parametric distributions. The parameters of the optimal model, selected either by goodness-of-fit statistics or user-designation, are used for outlier nomination. Two implementations of the Cosine Similarity Outlier Detection algorithm are provided with flexible parameters for dataset customization.
Maintained by Paul C. Boutros. Last updated 1 year ago.
30.3 match 2.00 score 2 scripts
nicholasjcooper
NCmisc:Miscellaneous Functions for Creating Adaptive Functions and Scripts
A set of handy functions. Includes a versatile one line progress bar, one line function timer with detailed output, time delay function, text histogram, object preview, CRAN package search, simpler package installer, Linux command install check, a flexible Mode function, top function, simulation of correlated data, and more.
Maintained by Nicholas Cooper. Last updated 2 years ago.
15.3 match 3.86 score 172 scripts 5 dependents
mrcieu
TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database
A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.
Maintained by Gibran Hemani. Last updated 10 days ago.
5.1 match 467 stars 11.23 score 1.7k scripts 1 dependents
pmartr
pmartR:Panomics Marketplace - Quality Control and Statistical Analysis for Panomics Data
Provides functionality for quality control processing and statistical analysis of mass spectrometry (MS) omics data, in particular proteomic (either at the peptide or the protein level), lipidomic, and metabolomic data, as well as RNA-seq based count data and nuclear magnetic resonance (NMR) data. This includes data transformation, specification of groups that are to be compared against each other, filtering of features and/or samples, data normalization, data summarization (correlation, PCA), and statistical comparisons between defined groups. Implements methods described in: Webb-Robertson et al. (2014) <doi:10.1074/mcp.M113.030932>. Webb-Robertson et al. (2011) <doi:10.1002/pmic.201100078>. Matzke et al. (2011) <doi:10.1093/bioinformatics/btr479>. Matzke et al. (2013) <doi:10.1002/pmic.201200269>. Polpitiya et al. (2008) <doi:10.1093/bioinformatics/btn217>. Webb-Robertson et al. (2010) <doi:10.1021/pr1005247>.
Maintained by Lisa Bramer. Last updated 3 days ago.
data-summarizationlipidsmass-spectrometrymetabolitesmetabolomics-datapeptidesproteinsrna-seq-analysisopenblascpp
7.3 match 40 stars 7.69 score 144 scripts
r-forge
corpora:Statistics and Data Sets for Corpus Frequency Data
Utility functions for the statistical analysis of corpus frequency data. This package is a companion to the open-source course "Statistical Inference: A Gentle Introduction for Computational Linguists and Similar Creatures" ('SIGIL').
Maintained by Stephanie Evert. Last updated 1 month ago.
18.8 match 3.01 score 34 scripts
bioc
LEA:LEA: an R package for Landscape and Ecological Association Studies
LEA is an R package dedicated to population genomics, landscape genomics and genotype-environment association tests. LEA can run analyses of population structure and genome-wide tests for local adaptation, and also performs imputation of missing genotypes. The package includes statistical methods for estimating ancestry coefficients from large genotypic matrices and for evaluating the number of ancestral populations (snmf). It performs statistical tests using latent factor mixed models for identifying genetic polymorphisms that exhibit association with environmental gradients or phenotypic traits (lfmm2). In addition, LEA computes values of genetic offset statistics based on new or predicted environments (genetic.gap, genetic.offset). LEA is mainly based on optimized programs that can scale with the dimensions of large data sets.
Maintained by Olivier Francois. Last updated 6 days ago.
softwarestatistical methodclusteringregressionopenblas
8.3 match 6.63 score 534 scripts
stephenslab
susieR:Sum of Single Effects Linear Regression
Implements methods for variable selection in linear regression based on the "Sum of Single Effects" (SuSiE) model, as described in Wang et al (2020) <DOI:10.1101/501114> and Zou et al (2021) <DOI:10.1101/2021.11.03.467167>. These methods provide simple summaries, called "Credible Sets", for accurately quantifying uncertainty in which variables should be selected. The methods are motivated by genetic fine-mapping applications, and are particularly well-suited to settings where variables are highly correlated and detectable effects are sparse. The fitting algorithm, a Bayesian analogue of stepwise selection methods called "Iterative Bayesian Stepwise Selection" (IBSS), is simple and fast, allowing the SuSiE model to be fit to large data sets (thousands of samples and hundreds of thousands of variables).
Maintained by Peter Carbonetto. Last updated 17 days ago.
5.1 match 197 stars 10.38 score 728 scripts 6 dependents
projectmosaic
mosaic:Project MOSAIC Statistics and Mathematics Teaching Utilities
Data sets and utilities from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.
Maintained by Randall Pruim. Last updated 1 year ago.
4.0 match 93 stars 13.32 score 7.2k scripts 7 dependents
cefet-rj-dal
daltoolbox:Leveraging Experiment Lines to Data Analytics
The natural increase in the complexity of current research experiments and data demands better tools to enhance productivity in Data Analytics. The package is a framework designed to address the modern challenges in data analytics workflows. The package is inspired by Experiment Line concepts. It aims to provide seamless support for users in developing their data mining workflows by offering a uniform data model and method API. It enables the integration of various data mining activities, including data preprocessing, classification, regression, clustering, and time series prediction. It also offers options for hyper-parameter tuning and supports integration with existing libraries and languages. Overall, the package provides researchers with a comprehensive set of functionalities for data science, promoting ease of use, extensibility, and integration with various tools and libraries. Information on Experiment Line is based on Ogasawara et al. (2009) <doi:10.1007/978-3-642-02279-1_20>.
Maintained by Eduardo Ogasawara. Last updated 1 month ago.
7.8 match 1 stars 6.65 score 536 scripts 4 dependents
bioc
metapod:Meta-Analyses on P-Values of Differential Analyses
Implements a variety of methods for combining p-values in differential analyses of genome-scale datasets. Functions can combine p-values across different tests in the same analysis (e.g., genomic windows in ChIP-seq, exons in RNA-seq) or for corresponding tests across separate analyses (e.g., replicated comparisons, effect of different treatment conditions). Support is provided for handling log-transformed input p-values, missing values and weighting where appropriate.
Maintained by Aaron Lun. Last updated 3 months ago.
multiplecomparison, differentialpeakcalling, cpp
6.8 match 7.44 score 17 scripts 46 dependents
bioc
dcanr:Differential co-expression/association network analysis
This package implements methods and an evaluation framework to infer differential co-expression/association networks. Various methods are implemented and can be evaluated using simulated datasets. Inference of differential co-expression networks can allow identification of networks that are altered between two conditions (e.g., health and disease).
Maintained by Dharmesh D. Bhuva. Last updated 5 months ago.
networkinference, graphandnetwork, differentialexpression, network
6.5 match 6 stars 7.45 score 26 scripts 5 dependents
stamats
MKdescr:Descriptive Statistics
Computation of standardized interquartile range (IQR), Huber-type skipped mean (Hampel (1985), <doi:10.2307/1268758>), robust coefficient of variation (CV) (Arachchige et al. (2019), <arXiv:1907.01110>), robust signal to noise ratio (SNR), z-score, standardized mean difference (SMD), as well as functions that support graphical visualization such as boxplots based on quartiles (not hinges), negative logarithms and generalized logarithms for 'ggplot2' (Wickham (2016), ISBN:978-3-319-24277-4).
Maintained by Matthias Kohl. Last updated 1 year ago.
8.0 match 3 stars 6.02 score 47 scripts 5 dependents
bioc
edgeR:Empirical Analysis of Digital Gene Expression Data in R
Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.
Maintained by Yunshun Chen. Last updated 5 days ago.
alternativesplicing, batcheffect, bayesian, biomedicalinformatics, cellbiology, chipseq, clustering, coverage, differentialexpression, differentialmethylation, differentialsplicing, dnamethylation, epigenetics, functionalgenomics, geneexpression, genesetenrichment, genetics, immunooncology, multiplecomparison, normalization, pathways, proteomics, qualitycontrol, regression, rnaseq, sage, sequencing, singlecell, systemsbiology, timecourse, transcription, transcriptomics, openblas
3.5 match 13.40 score 17k scripts 255 dependents
bioc
regioneReloaded:RegioneReloaded: Multiple Association for Genomic Region Sets
RegioneReloaded is a package that allows simultaneous analysis of associations between genomic region sets, enabling clustering of data and the creation of ready-to-publish graphs. It takes over and expands on all the features of its predecessor regioneR. It also incorporates a strategy to improve p-value calculations and normalize z-scores coming from multiple analyses to allow for their direct comparison. RegioneReloaded builds upon regioneR by adding new plotting functions for obtaining publication-ready graphs.
Maintained by Roberto Malinverni. Last updated 5 months ago.
genetics, chipseq, dnaseq, methylseq, copynumbervariation, clustering, multiplecomparison
10.9 match 5 stars 4.30 score 2 scripts
muschellij2
neurobase:'Neuroconductor' Base Package with Helper Functions for 'nifti' Objects
Base package for 'Neuroconductor', which includes many helper functions that interact with objects of class 'nifti', implemented by package 'oro.nifti', for reading/writing and also other manipulation functions.
Maintained by John Muschelli. Last updated 1 month ago.
5.5 match 5 stars 8.49 score 486 scripts 7 dependents
bioc
cfdnakit:Fragment-length analysis package from high-throughput sequencing of cell-free DNA (cfDNA)
This package provides basic functions for analyzing shallow whole-genome sequencing (~0.3X or more) of cell-free DNA (cfDNA). The package extracts the length of cfDNA fragments and aids the visualization of fragment-length information. The package also extracts fragment-length information per non-overlapping fixed-size bin and uses it to calculate the ctDNA estimation score (CES).
Maintained by Pitithat Puranachot. Last updated 5 months ago.
copynumbervariation, sequencing, wholegenome
8.9 match 8 stars 5.20 score 8 scripts
bioc
ASSET:An R package for subset-based association analysis of heterogeneous traits and subtypes
An R package for subset-based analysis of heterogeneous traits and disease subtypes. The package allows the user to search through all possible subsets of z-scores to identify the subset of traits giving the best meta-analyzed z-score. Further, it returns a p-value adjusting for the multiple-testing involved in the search. It also allows for searching for the best combination of disease subtypes associated with each variant.
Maintained by Samsiddhi Bhattacharjee. Last updated 5 months ago.
statisticalmethod, snp, genomewideassociation, multiplecomparison
7.8 match 5.71 score 85 scripts 1 dependents
aljensen89
CommKern:Network-Based Communities and Kernel Machine Methods
Analysis of network community objects with applications to neuroimaging data. There are two main components to this package. The first is the hierarchical multimodal spinglass (HMS) algorithm, which is a novel community detection algorithm specifically tailored to the unique issues within brain connectivity. The other is a suite of semiparametric kernel machine methods that allow for statistical inference to be performed to test for potential associations between these community structures and an outcome of interest (binary or continuous).
Maintained by Alexandria Jensen. Last updated 2 years ago.
10.8 match 4.11 score 26 scripts
bioc
MOMA:Multi Omic Master Regulator Analysis
This package implements the Multi Omic Master Regulator Analysis (MOMA) algorithm for inferring candidate master regulator proteins from multi-omics data, as well as ancillary analysis and visualization functions.
Maintained by Sunny Jones. Last updated 5 months ago.
software, networkenrichment, networkinference, network, featureextraction, clustering, functionalgenomics, transcriptomics, systemsbiology
7.2 match 6 stars 6.19 score 13 scripts
cran
fdrtool:Estimation of (Local) False Discovery Rates and Higher Criticism
Estimates both tail area-based false discovery rates (Fdr) as well as local false discovery rates (fdr) for a variety of null models (p-values, z-scores, correlation coefficients, t-scores). The proportion of null values and the parameters of the null distribution are adaptively estimated from the data. In addition, the package contains functions for non-parametric density estimation (Grenander estimator), for monotone regression (isotonic regression and antitonic regression with weights), for computing the greatest convex minorant (GCM) and the least concave majorant (LCM), for the half-normal and correlation distributions, and for computing empirical higher criticism (HC) scores and the corresponding decision threshold.
Maintained by Korbinian Strimmer. Last updated 7 months ago.
5.3 match 3 stars 8.24 score 844 scripts 118 dependents
xfim
ggmcmc:Tools for Analyzing MCMC Simulations from Bayesian Inference
Tools for assessing and diagnosing convergence of Markov Chain Monte Carlo simulations, as well as for graphically displaying results from full MCMC analysis. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables, and functions to work with hierarchical/multilevel batches of parameters (Fernández-i-Marín, 2016 <doi:10.18637/jss.v070.i09>).
Maintained by Xavier Fernández i Marín. Last updated 2 years ago.
bayesian-data-analysis, ggplot2, graphical, jags, mcmc, stan
3.5 match 112 stars 12.02 score 1.6k scripts 8 dependents
bioc
RgnTX:Colocalization analysis of transcriptome elements in the presence of isoform heterogeneity and ambiguity
RgnTX allows the integration of transcriptome annotations so as to model complex alternative splicing patterns. It supports the testing of transcriptome elements without clear isoform association, which is often the real scenario due to technical limitations. It includes functions that perform permutation tests to evaluate associations between features and transcriptome regions.
Maintained by Yue Wang. Last updated 5 months ago.
alternativesplicing, sequencing, rnaseq, methylseq, transcription, splicedalignment
10.2 match 4.00 score 6 scripts
bioc
dominoSignal:Cell Communication Analysis for Single Cell RNA Sequencing
dominoSignal is a package developed to analyze cell signaling through ligand - receptor - transcription factor networks in scRNAseq data. It takes as input transcriptomic data, requiring counts, z-scored counts, and cluster labels, as well as information on transcription factor activation (such as from SCENIC) and a database of ligand and receptor pairings (such as from CellPhoneDB). This package creates an object storing ligand - receptor - transcription factor linkages by cluster and provides several methods for exploring, summarizing, and visualizing the analysis.
Maintained by Jacob T Mitchell. Last updated 5 months ago.
systemsbiology, singlecell, transcriptomics, network
6.2 match 5 stars 6.50 score 5 scripts
jasonmoy28
psycCleaning:Data Cleaning for Psychological Analyses
Useful for preparing and cleaning data. It includes functions to center, reverse code, dummy code, and effect code data, and more.
Maintained by Jason Moy. Last updated 11 months ago.
14.8 match 1 stars 2.70 score 1 scripts
bioc
stJoincount:stJoincount - Join count statistic for quantifying spatial correlation between clusters
stJoincount facilitates the application of join count analysis to spatial transcriptomic data generated from the 10x Genomics Visium platform. This tool first converts a labeled spatial tissue map into a raster object, in which each spatial feature is represented by a pixel coded by label assignment. This process includes automatic calculation of optimal raster resolution and extent for the sample. A neighbors list is then created from the rasterized sample, in which adjacent and diagonal neighbors for each pixel are identified. After adding binary spatial weights to the neighbors list, a multi-categorical join count analysis is performed to tabulate "joins" between all possible combinations of label pairs. The function returns the observed join counts, the expected count under conditions of spatial randomness, and the variance calculated under non-free sampling. The z-score is then calculated as the difference between observed and expected counts, divided by the square root of the variance.
Maintained by Jiarong Song. Last updated 5 months ago.
transcriptomics, clustering, spatial, biocviews, software
8.5 match 4 stars 4.60 score 3 scripts
poissonconsulting
extras:Helper Functions for Bayesian Analyses
Functions to 'numericise' 'R' objects (coerce to numeric objects), summarise 'MCMC' (Markov Chain Monte Carlo) samples and calculate deviance residuals as well as 'R' translations of some 'BUGS' (Bayesian inference Using Gibbs Sampling), 'JAGS' (Just Another Gibbs Sampler), 'STAN' and 'TMB' (Template Model Builder) functions.
Maintained by Nicole Hill. Last updated 2 months ago.
4.5 match 9 stars 8.49 score 15 scripts 16 dependents
cmerow
meteR:Fitting and Plotting Tools for the Maximum Entropy Theory of Ecology (METE)
Fit and plot macroecological patterns predicted by the Maximum Entropy Theory of Ecology (METE).
Maintained by Cory Merow. Last updated 6 years ago.
7.1 match 11 stars 5.35 score 41 scripts
acare
hacksig:A Tidy Framework to Hack Gene Expression Signatures
A collection of cancer transcriptomics gene signatures as well as a simple and tidy interface to compute single sample enrichment scores either with the original procedure or with three alternatives: the "combined z-score" of Lee et al. (2008) <doi:10.1371/journal.pcbi.1000217>, the "single sample GSEA" of Barbie et al. (2009) <doi:10.1038/nature08460> and the "singscore" of Foroutan et al. (2018) <doi:10.1186/s12859-018-2435-4>. The 'get_sig_info()' function can be used to retrieve information about each signature implemented.
Maintained by Andrea Carenzo. Last updated 2 years ago.
gene-expression-signatures, gene-set-enrichment
6.4 match 19 stars 5.71 score 27 scripts
jeffreyevans
spatialEco:Spatial Analysis and Modelling Utilities
Utilities to support spatial data manipulation, query, sampling and modelling in ecological applications. Functions include models for species population density, spatial smoothing, multivariate separability, point process model for creating pseudo-absences and sub-sampling, quadrant-based sampling and analysis, auto-logistic modeling, sampling models, cluster optimization, statistical exploratory tools and raster-based metrics.
Maintained by Jeffrey S. Evans. Last updated 13 days ago.
biodiversity, conservation, ecology, r-spatial, raster, spatial, vector
3.8 match 110 stars 9.55 score 736 scripts 2 dependents
bioc
HERON:Hierarchical Epitope pROtein biNding
HERON is a software package for analyzing peptide binding array data. In addition to identifying significant binding probes, HERON also provides functions for finding epitopes (string of consecutive peptides within a protein). HERON also calculates significance on the probe, epitope, and protein level by employing meta p-value methods. HERON is designed for obtaining calls on the sample level and calculates fractions of hits for different conditions.
Maintained by Sean McIlwain. Last updated 5 months ago.
8.3 match 1 stars 4.18 score 6 scripts
bioc
ScreenR:Package to Perform High Throughput Biological Screening
ScreenR is a package suitable to perform hit identification in loss of function High Throughput Biological Screenings performed using barcoded shRNA-based libraries. ScreenR combines the computing power of software such as edgeR with the simplicity of use of the Tidyverse metapackage. ScreenR executes a pipeline able to find candidate hits from barcode counts, and integrates a wide range of visualization modes for each step of the analysis.
Maintained by Emanuel Michele Soda. Last updated 5 months ago.
software, assaydomain, geneexpression, high-throughput-screening
10.8 match 1 stars 3.11 score 13 scripts
bioc
cn.mops:cn.mops - Mixture of Poissons for CNV detection in NGS data
cn.mops (Copy Number estimation by a Mixture Of PoissonS) is a data processing pipeline for copy number variations and aberrations (CNVs and CNAs) from next generation sequencing (NGS) data. The package supplies functions to convert BAM files into read count matrices or genomic ranges objects, which are the input objects for cn.mops. cn.mops models the depths of coverage across samples at each genomic position. Therefore, it does not suffer from read count biases along chromosomes. Using a Bayesian approach, cn.mops decomposes read variations across samples into integer copy numbers and noise by its mixture components and Poisson distributions, respectively. cn.mops guarantees a low FDR because wrong detections are indicated by high noise and filtered out. cn.mops is very fast and written in C++.
Maintained by Gundula Povysil. Last updated 2 months ago.
sequencing, copynumbervariation, homo_sapiens, cellbiology, hapmap, genetics, cpp
6.2 match 5.35 score 94 scripts 4 dependents
bioc
cmapR:CMap Tools in R
The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.
Maintained by Ted Natoli. Last updated 5 months ago.
dataimport, datarepresentation, geneexpression, bioconductor, bioinformatics, cmap
3.8 match 89 stars 8.85 score 298 scripts
bioc
NADfinder:Call wide peaks for sequencing data
The nucleolus is an important structure inside the nucleus in eukaryotic cells. It is the site for transcribing rDNA into rRNA and for assembling ribosomes, aka ribosome biogenesis. In addition, nucleoli are dynamic hubs through which numerous proteins shuttle and contact specific non-rDNA genomic loci. Deep sequencing analyses of DNA associated with isolated nucleoli (NAD-seq) have shown that specific loci, termed nucleolus-associated domains (NADs), form frequent three-dimensional associations with nucleoli. NAD-seq has been used to study the biological functions of NADs and the dynamics of NAD distribution during embryonic stem cell (ESC) differentiation. Here, we developed a Bioconductor package NADfinder for bioinformatic analysis of the NAD-seq data, including baseline correction, smoothing, normalization, peak calling, and annotation.
Maintained by Jianhong Ou. Last updated 2 months ago.
sequencing, dnaseq, generegulation, peakdetection
7.8 match 4.18 score 1 scripts
bioc
Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data
The understanding of cancer mechanisms requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/Bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classify these into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type. This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.
Maintained by Matteo Tiberti. Last updated 2 months ago.
dnamethylation, differentialmethylation, generegulation, geneexpression, methylationarray, differentialexpression, pathways, network, survival, genesetenrichment, networkenrichment
4.9 match 5 stars 6.59 score 43 scripts
bioc
r3Cseq:Analysis of Chromosome Conformation Capture and Next-generation Sequencing (3C-seq)
This package is used for the analysis of long-range chromatin interactions from 3C-seq assay.
Maintained by Supat Thongjuea. Last updated 5 months ago.
6.5 match 3 stars 4.85 score 17 scripts
bioc
SIM:Integrated Analysis on two human genomic datasets
Finds associations between two human genomic datasets.
Maintained by Renee X. de Menezes. Last updated 5 months ago.
7.3 match 4.30 score 3 scripts
andrewljackson
SIBER:Stable Isotope Bayesian Ellipses in R
Fits bivariate ellipses to stable isotope data using Bayesian inference, with the aim of describing and comparing their isotopic niches.
Maintained by Andrew Jackson. Last updated 10 months ago.
community-ecology, ecology, niche-modelling, stable-isotopes, jags, cpp
3.3 match 36 stars 9.13 score 187 scripts 1 dependents
oobianom
quickcode:Quick and Essential 'R' Tricks for Better Scripts
The NOT functions, 'R' tricks, and a compilation of simple, quick, and frequently used 'R' code snippets to improve the quality and reproducibility of 'R' scripts.
Maintained by Obinna Obianom. Last updated 14 days ago.
3.8 match 5 stars 7.76 score 7 scripts 6 dependents
worldhealthorganization
anthro:Computation of the WHO Child Growth Standards
Provides WHO Child Growth Standards (z-scores) with confidence intervals and standard errors around the prevalence estimates, taking into account complex sample designs. More information on the methods is available online: <https://www.who.int/tools/child-growth-standards>.
Maintained by Dirk Schumacher. Last updated 5 months ago.
4.5 match 32 stars 6.30 score 26 scripts 2 dependents
lkmramba
MUACz:Generate MUAC and BMI z-Scores and Percentiles for Children and Adolescents
Generates mid-upper arm circumference (MUAC)-for-age and body mass index (BMI)-for-age z-scores and percentiles, based on the LMS method, for children and adolescents up to 19 years. These can be used to assess nutritional and health status and to define risk of adverse health events.
Maintained by Lazarus Mramba. Last updated 5 years ago.
28.2 match 1.00 score
bioc
hermes:Preprocessing, analyzing, and reporting of RNA-seq data
Provides classes and functions for quality control, filtering, normalization and differential expression analysis of pre-processed `RNA-seq` data. Data can be imported from `SummarizedExperiment` as well as `matrix` objects and can be annotated from `BioMart`. Filtering for genes without too low expression or containing required annotations, as well as filtering for samples with sufficient correlation to other samples or total number of reads is supported. The standard normalization methods including cpm, rpkm and tpm can be used, and `DESeq2` as well as voom differential expression analyses are available.
Maintained by Daniel Sabanés Bové. Last updated 5 months ago.
rnaseq, differentialexpression, normalization, preprocessing, qualitycontrol, rna-seq, statistical-engineering
3.6 match 11 stars 7.77 score 48 scripts 1 dependents
bioc
xcore:xcore expression regulators inference
xcore is an R package for transcription factor activity modeling based on known molecular signatures and the user's gene expression data. The accompanying xcoredata package provides a collection of molecular signatures, constructed from publicly available ChIP-seq experiments. xcore uses ridge regression to model changes in expression as a linear combination of molecular signatures and find their unknown activities. The obtained estimates can be further tested for significance to select molecular signatures with the highest predicted effect on the observed expression changes.
Maintained by Maciej Migdał. Last updated 5 months ago.
geneexpression, generegulation, epigenetics, regression, sequencing
6.9 match 4.00 score 8 scripts
jpgard
auctestr:Statistical Testing for AUC Data
Performs statistical testing to compare predictive models based on multiple observations of the A' statistic (also known as Area Under the Receiver Operating Characteristic Curve, or AUC). Specifically, it implements a testing method based on the equivalence between the A' statistic and the Wilcoxon statistic. For more information, see Hanley and McNeil (1982) <doi:10.1148/radiology.143.1.7063747>.
Maintained by Josh Gardner. Last updated 7 years ago.
6.8 match 2 stars 4.00 score 8 scripts
spsanderson
healthyR.ai:The Machine Learning and AI Modeling Companion to 'healthyR'
Hospital machine learning and AI data analysis workflow tools, modeling, and automations. This library provides many useful tools to review common administrative hospital data. Some of these include predicting length of stay and readmissions. The aim is to provide a simple and consistent verb framework that takes the guesswork out of everything.
Maintained by Steven Sanderson. Last updated 2 months ago.
ai, artificial-intelligence, healthcareanalytics, healthyr, healthyverse, machine-learning
3.6 match 16 stars 7.37 score 36 scripts 1 dependents
cumulocity-iot
pmml:Generate PMML for Various Models
The Predictive Model Markup Language (PMML) is an XML-based language which provides a way for applications to define machine learning, statistical and data mining models and to share models between PMML compliant applications. More information about the PMML industry standard and the Data Mining Group can be found at <http://dmg.org/>. The generated PMML can be imported into any PMML consuming application, such as Zementis Predictive Analytics products. The package isofor (used for anomaly detection) can be installed with devtools::install_github("gravesee/isofor").
Maintained by Dmitriy Bolotov. Last updated 3 years ago.
3.3 match 20 stars 7.98 score 560 scripts 1 dependents
risktoollib
RTL:Risk Tool Library - Trading, Risk, Analytics for Commodities
A toolkit for Commodities 'analytics', risk management and trading professionals. Includes functions for API calls to <https://commodities.morningstar.com/#/>, <https://developer.genscape.com/>, and <https://www.bankofcanada.ca/valet/docs>.
Maintained by Philippe Cote. Last updated 18 days ago.
analytics, api, commodities, commodities-api, finance, genscape, morningstar, python, risk-management, cpp
3.5 match 30 stars 7.51 score 198 scripts
cwatson
brainGraph:Graph Theory Analysis of Brain MRI Data
A set of tools for performing graph theory analysis of brain MRI data. It works with data from a Freesurfer analysis (cortical thickness, volumes, local gyrification index, surface area), diffusion tensor tractography data (e.g., from FSL) and resting-state fMRI data (e.g., from DPABI). It contains a graphical user interface for graph visualization and data exploration, along with several functions for generating useful figures.
Maintained by Christopher G. Watson. Last updated 1 year ago.
brain-connectivity, brain-imaging, complex-networks, connectome, connectomics, fmri, graph-theory, mri, network-analysis, neuroimaging, neuroscience, statistics, tractography
3.3 match 188 stars 7.86 score 107 scripts 3 dependents
bioc
SpectralTAD:SpectralTAD: Hierarchical TAD detection using spectral clustering
SpectralTAD is an R package designed to identify Topologically Associated Domains (TADs) from Hi-C contact matrices. It uses a modified version of spectral clustering that uses a sliding window to quickly detect TADs. The function works on a range of different formats of contact matrices and returns a bed file of TAD coordinates. The method does not require users to adjust any parameters to work and gives them control over the number of hierarchical levels to be returned.
Maintained by Mikhail Dozmorov. Last updated 5 months ago.
software, hic, sequencing, featureextraction, clustering
4.0 match 8 stars 6.53 score 17 scripts
r-spark
sparklyr.flint:Sparklyr Extension for 'Flint'
This sparklyr extension makes 'Flint' time series library functionalities (<https://github.com/twosigma/flint>) easily accessible through R.
Maintained by Edgar Ruiz. Last updated 3 years ago.
apache-spark, data-analysis, data-mining, data-science, distributed, distributed-computing, flint, remote-clusters, spark, sparklyr, statistical-analysis, statistics, stats, summarization, summary-statistics, time-series, time-series-analysis, twosigma-flint
4.0 match 9 stars 6.46 score 54 scripts
bioc
CoGAPS:Coordinated Gene Activity in Pattern Sets
Coordinated Gene Activity in Pattern Sets (CoGAPS) implements a Bayesian MCMC matrix factorization algorithm, GAPS, and links it to gene set statistic methods to infer biological process activity. It can be used to perform sparse matrix factorization on any data, and when this data represents biomolecules, to do gene set analysis.
Maintained by Elana J. Fertig. Last updated 5 months ago.
geneexpression, transcription, genesetenrichment, differentialexpression, bayesian, clustering, timecourse, rnaseq, microarray, multiplecomparison, dimensionreduction, immunooncology, cpp
3.8 match 6.72 score 104 scripts
r4epi
epikit:Miscellaneous Helper Tools for Epidemiologists
Contains tools for formatting inline code, renaming redundant columns, aggregating age categories, adding survey weights, finding the earliest date of an event, plotting z-curves, generating population counts and calculating proportions with confidence intervals. This is part of the 'R4Epis' project <https://r4epis.netlify.app/>.
Maintained by Zhian N. Kamvar. Last updated 1 month ago.
3.9 match 10 stars 6.32 score 22 scripts 2 dependents
wencke
GOplot:Visualization of Functional Analysis Data
Implementation of multilayered visualizations for enhanced graphical representation of functional analysis data. It combines and integrates omics data derived from expression and functional annotation enrichment analyses. Its plotting functions have been developed with a hierarchical structure in mind: starting from a general overview to identify the most enriched categories (modified bar plot, bubble plot) to a more detailed one displaying different types of relevant information for the molecules in a given set of categories (circle plot, chord plot, cluster plot, Venn diagram, heatmap).
Maintained by Wencke Walter. Last updated 8 years ago.
3.8 match 20 stars 6.60 score 235 scriptsdpagliaccio
scipub:Summarize Data for Scientific Publication
Create and format tables and APA statistics for scientific publication. This includes making a 'Table 1' to summarize demographics across groups, correlation tables with significance indicated by stars, and extracting formatted statistical summaries from simple tests for in-text notation. The package also includes functions for Winsorizing data based on a Z-statistic cutoff.
Maintained by David Pagliaccio. Last updated 1 year ago.
7.1 match 2 stars 3.43 score 27 scriptskvasilopoulos
transx:Transform Univariate Time Series
Univariate time series operations that follow an opinionated design. The main principle of 'transx' is to keep the number of observations the same. Operations that reduce this number have to fill the observations gap.
Maintained by Kostas Vasilopoulos. Last updated 4 years ago.
detrendfiltersoutlierstime-seriestransx
5.6 match 3 stars 4.29 score 13 scriptsbioc
tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles
This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecellflowcytometrybioinformaticscytometrydata-sciencesingle-celltidyversecpp
3.2 match 19 stars 7.26 score 35 scriptsmrcieu
gwasglue2:GWAS summary data sources connected to analytical tools
Many tools exist that use GWAS summary data for colocalisation, fine mapping, Mendelian randomization, visualisation, etc. This package is a conduit that connects R packages that can retrieve GWAS summary data to various tools for analysing those data.
Maintained by Rita Rasteiro. Last updated 1 year ago.
4.0 match 21 stars 5.69 score 11 scripts 2 dependentsdewittpe
pedbp:Pediatric Blood Pressure
Data and utilities for estimating pediatric blood pressure percentiles by sex, age, and optionally height (stature) as described in Martin et al. (2022) <doi:10.1001/jamanetworkopen.2022.36918>. Blood pressure percentiles for children under one year of age come from Gemelli et al. (1990) <doi:10.1007/BF02171556>. Estimates of blood pressure percentiles for children at least one year of age are informed by data from the National Heart, Lung, and Blood Institute (NHLBI) and the Centers for Disease Control and Prevention (CDC) <doi:10.1542/peds.2009-2107C> or from Lo et al. (2013) <doi:10.1542/peds.2012-1292>. The flowchart for selecting the informing data source comes from Martin et al. (2022) <doi:10.1542/hpeds.2021-005998>.
Maintained by Peter DeWitt. Last updated 2 months ago.
blood-pressuregrowth-standardspediatriccpp
3.5 match 6 stars 6.43 score 45 scriptsjhchou
peditools:Pediatric Clinical Data Science Tools
A collection of tools for newborn and pediatric anthropometric calculations and data abstraction from Vermont Oxford Network registry exports. Includes charts based on Lambda, Mu, Sigma (LMS) parameters, including: Fenton 2003, Olsen 2010, Olsen BMI, CDC infant, CDC pediatric, CDC BMI, CDC (Addo) skin, WHO infant, WHO skin, Abdel-Rahman 2017, Mramba 2017, Zemel Down Syndrome, Brooks cerebral palsy, WHO expanded, Cappa 2024 (except BMI). Includes functions to take a Vermont Oxford Network XML or CSV data file export read into a data frame, converting the coded variables into human readable factors.
Maintained by Joseph Chou. Last updated 2 months ago.
7.5 match 5 stars 3.00 score 2 scriptsemeyers
NeuroDecodeR:Decode Information from Neural Activity
Neural decoding is a method of analyzing neural data that uses pattern classifiers to predict experimental conditions based on neural activity. 'NeuroDecodeR' is a system of objects that makes it easy to run neural decoding analyses. For more information on neural decoding see Meyers & Kreiman (2011) <doi:10.7551/mitpress/8404.003.0024>.
Maintained by Ethan Meyers. Last updated 1 year ago.
3.4 match 12 stars 6.49 score 17 scriptsbioc
DepInfeR:Inferring tumor-specific cancer dependencies through integrating ex-vivo drug response assays and drug-protein profiling
DepInfeR integrates two experimentally accessible input data matrices: the drug sensitivity profiles of cancer cell lines or primary tumors ex-vivo (X), and the drug affinities of a set of proteins (Y), to infer a matrix of molecular protein dependencies of the cancers (β). DepInfeR deconvolutes the protein inhibition effect on the viability phenotype by using regularized multivariate linear regression. It assigns a "dependence coefficient" to each protein and each sample, and therefore could be used to gain a causal and accurate understanding of functional consequences of genomic aberrations in a heterogeneous disease, as well as to guide the choice of pharmacological intervention for a specific cancer type, sub-type, or an individual patient. For more information, please read our preprint on bioRxiv: https://doi.org/10.1101/2022.01.11.475864.
Maintained by Junyan Lu. Last updated 5 months ago.
softwareregressionpharmacogeneticspharmacogenomicsfunctionalgenomics
4.7 match 1 stars 4.36 score 23 scriptscran
metaGE:Meta-Analysis for Detecting Genotype x Environment Associations
Provides functions to perform all steps of genome-wide association meta-analysis for studying Genotype x Environment interactions, from collecting the data to the manhattan plot. The procedure accounts for the potential correlation between studies. In addition to the Fixed and Random models, one can investigate the relationship between QTL effects and some qualitative or quantitative covariate via the test of contrast and the meta-regression, respectively. The methodology is available from: De Walsche, A., et al. (2025) <doi:10.1371/journal.pgen.1011553>.
Maintained by Annaïg De Walsche. Last updated 22 days ago.
8.7 match 2.30 score 1 scriptsworldhealthorganization
anthroplus:Computation of the WHO 2007 References for School-Age Children and Adolescents (5 to 19 Years)
Provides WHO 2007 References for School-age Children and Adolescents (5 to 19 years) (z-scores) with confidence intervals and standard errors around the prevalence estimates, taking into account complex sample designs. More information on the methods is available online: <https://www.who.int/tools/growth-reference-data-for-5to19-years>.
Maintained by Dirk Schumacher. Last updated 4 months ago.
4.5 match 4 stars 4.20 score 7 scriptsguangshengpei
deTS:Tissue-Specific Enrichment Analysis
Tissue-specific enrichment analysis to assess lists of candidate genes or RNA-Seq expression profiles. Pei G., Dai Y., Zhao Z., Jia P. (2019) deTS: Tissue-Specific Enrichment Analysis to decode tissue specificity. Bioinformatics, In submission.
Maintained by Guangsheng Pei. Last updated 6 years ago.
10.8 match 1.73 score 18 scriptsbioc
UMI4Cats:UMI4Cats: Processing, analysis and visualization of UMI-4C chromatin contact data
UMI-4C is a technique that allows characterization of 3D chromatin interactions with a bait of interest, taking advantage of a sonication step to produce unique molecular identifiers (UMIs) that help remove duplication bias, thus allowing a better differential comparison of chromatin interactions between conditions. This package allows processing of UMI-4C data, starting from FastQ files provided by the sequencing facility. It provides two statistical methods for detecting differential contacts and includes a visualization function to plot integrated information from a UMI-4C assay.
Maintained by Mireia Ramos-Rodriguez. Last updated 5 months ago.
qualitycontrolpreprocessingalignmentnormalizationvisualizationsequencingcoveragechromatinchromatin-interactiongenomicsumi4c
3.3 match 5 stars 5.57 score 7 scriptsrezamoammadi
liver:"Eating the Liver of Data Science"
Offers a suite of helper functions to simplify various data science techniques for non-experts. This package aims to enable individuals with only a minimal level of coding knowledge to become acquainted with these techniques in an accessible manner. Inspired by an ancient Persian idiom, we liken this process to "eating the liver of data science," suggesting a deep and intimate engagement with the field of data science. This package includes functions for tasks such as data partitioning for out-of-sample testing, calculating Mean Squared Error (MSE) to assess prediction accuracy, and data transformations (z-score and min-max). In addition to these helper functions, the 'liver' package also features several intriguing datasets valuable for multivariate analysis.
Maintained by Reza Mohammadi. Last updated 4 months ago.
4.6 match 4.00 score 67 scriptslan
MTAR:Multi-Trait Analysis of Rare-Variant Association Study
Performs multi-trait rare-variant association tests using summary statistics and adjusts for possible sample overlap. The package is based on "Multi-Trait Analysis of Rare-Variant Association Summary Statistics using MTAR" by Luo, L., Shen, J., Zhang, H., Chhibber, A., Mehrotra, D.V., Tang, Z., 2019 (submitted).
Maintained by Lan Luo. Last updated 5 years ago.
9.1 match 2.00 score 7 scriptsbioc
BiSeq:Processing and analyzing bisulfite sequencing data
The BiSeq package provides useful classes and functions to handle and analyze targeted bisulfite sequencing (BS) data such as reduced-representation bisulfite sequencing (RRBS) data. In particular, it implements an algorithm to detect differentially methylated regions (DMRs). The package takes already aligned BS data from one or multiple samples.
Maintained by Katja Hebestreit. Last updated 5 months ago.
geneticssequencingmethylseqdnamethylation
3.8 match 4.78 score 30 scriptsrichardli
surveyPrev:Mapping the Prevalence of Binary Indicators using Survey Data in Small Areas
Provides a pipeline to perform small area estimation and prevalence mapping of binary indicators using health and demographic survey data, described in Fuglstad et al. (2022) <doi:10.48550/arXiv.2110.09576> and Wakefield et al. (2020) <doi:10.1111/insr.12400>.
Maintained by Qianyu Dong. Last updated 5 days ago.
3.1 match 1 stars 5.76 score 11 scriptswjschne
WJSmisc:Miscellaneous functions from W. Joel Schneider
Several functions I find useful.
Maintained by W. Joel Schneider. Last updated 2 years ago.
7.3 match 5 stars 2.40 score 10 scriptspaballand
EconGeo:Computing Key Indicators of the Spatial Distribution of Economic Activities
Functions to compute a series of indices commonly used in the fields of economic geography, economic complexity, and evolutionary economics to describe the location, distribution, spatial organization, structure, and complexity of economic activities. Functions include basic spatial indicators such as the location quotient, the Krugman specialization index, the Herfindahl or the Shannon entropy indices but also more advanced functions to compute different forms of normalized relatedness between economic activities or network-based measures of economic complexity. Most of the functions use matrix calculus and are based on bipartite (incidence) matrices consisting of region - industry pairs.
Maintained by Pierre-Alexandre Balland. Last updated 2 years ago.
3.5 match 41 stars 4.96 score 44 scriptsscarpino
multiDimBio:Multivariate Analysis and Visualization for Biological Data
Code to support a systems biology research program from inception through publication. The methods focus on dimension reduction approaches to detect patterns in complex, multivariate experimental data and place an emphasis on informative visualizations. The goal for this project is to create a package that will evolve over time, thereby remaining relevant and reflective of current methods and techniques. As a result, we encourage suggested additions to the package, both methodological and graphical.
Maintained by Samuel V. Scarpino. Last updated 5 years ago.
7.1 match 2.41 score 26 scriptsbioc
cTRAP:Identification of candidate causal perturbations from differential gene expression data
Compare differential gene expression results with those from known cellular perturbations (such as gene knock-down, overexpression or small molecules) derived from the Connectivity Map. Such analyses allow not only to infer the molecular causes of the observed difference in gene expression but also to identify small molecules that could drive or revert specific transcriptomic alterations.
Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.
differentialexpressiongeneexpressionrnaseqtranscriptomicspathwaysimmunooncologygenesetenrichmentbioconductorbioinformaticscmapgene-expressionl1000
3.3 match 5 stars 5.08 score 16 scriptsbioc
miRLAB:Dry lab for exploring miRNA-mRNA relationships
Provides tools for exploring miRNA-mRNA relationships, including popular miRNA target prediction methods, ensemble methods that integrate individual methods, functions to get data from online resources, functions to validate the results, and functions to conduct enrichment analyses.
Maintained by Thuc Duy Le. Last updated 5 months ago.
mirnageneexpressionnetworkinferencenetwork
3.5 match 4.72 score 11 scriptsbarakbri
repfdr:Replicability Analysis for Multiple Studies of High Dimension
Estimation of Bayes and local Bayes false discovery rates for replicability analysis (Heller & Yekutieli, 2014 <doi:10.1214/13-AOAS697>; Heller et al., 2015 <doi:10.1093/bioinformatics/btu434>).
Maintained by Ruth Heller. Last updated 7 years ago.
3.3 match 3 stars 4.98 score 16 scriptshendersontrent
theftdlc:Analyse and Interpret Time Series Features
Provides a suite of functions for analysing, interpreting, and visualising time-series features calculated from different feature sets from the 'theft' package. Implements statistical learning methodologies described in Henderson, T., Bryant, A., and Fulcher, B. (2023) <arXiv:2303.17809>.
Maintained by Trent Henderson. Last updated 1 month ago.
data-sciencedata-visualizationmachine-learningstatisticstime-series
3.3 match 4 stars 4.94 score 11 scriptshendersontrent
normaliseR:Re-Scale Vectors and Time-Series Features
Provides standardized access to a range of re-scaling methods for numerical vectors and time-series features calculated within the 'theft' ecosystem.
Maintained by Trent Henderson. Last updated 1 year ago.
3.6 match 4.48 score 1 dependentsbioc
mirTarRnaSeq:mirTarRnaSeq
The mirTarRnaSeq R package can be used for interactive statistical analysis of mRNA and miRNA sequencing data. It takes expression or differential expression results from mRNA and miRNA sequencing and performs interactive correlation and various GLM analyses (regular GLM, multivariate GLM, and interaction GLMs) between mRNA and miRNA experiments. These experiments can be time point experiments and/or condition experiments.
Maintained by Mercedeh Movassagh. Last updated 5 months ago.
mirnaregressionsoftwaresequencingsmallrnatimecoursedifferentialexpression
3.8 match 4.00 score 9 scriptsbioc
magrene:Motif Analysis In Gene Regulatory Networks
magrene allows the identification and analysis of graph motifs in (duplicated) gene regulatory networks (GRNs), including lambda, V, PPI V, delta, and bifan motifs. GRNs can be tested for motif enrichment by comparing motif frequencies to a null distribution generated from degree-preserving simulated GRNs. Motif frequencies can be analyzed in the context of gene duplications to explore the impact of small-scale and whole-genome duplications on gene regulatory networks. Finally, users can calculate interaction similarity for gene pairs based on the Sorensen-Dice similarity index.
Maintained by Fabrício Almeida-Silva. Last updated 5 months ago.
softwaremotifdiscoverynetworkenrichmentsystemsbiologygraphandnetworkgene-regulatory-networkmotif-analysisnetwork-motifsnetwork-science
3.6 match 1 stars 4.00 score 2 scriptsveronicanava
RamanMP:Analysis and Identification of Raman Spectra of Microplastics
Pre-processing and polymer identification of Raman spectra of plastics. Pre-processing includes normalisation functions, peak identification based on local maxima, smoothing process and removal of spectral region of no interest. Polymer identification can be performed using Pearson correlation coefficient or Euclidean distance (Renner et al. (2019), <doi:10.1016/j.trac.2018.12.004>), and the comparison can be done with a user-defined database or with the database already implemented in the package, which currently includes 356 spectra, with several spectra of plastic colorants.
Maintained by Veronica Nava. Last updated 3 years ago.
4.0 match 6 stars 3.48 score 1 scriptscran
episcan:Scan Pairwise Epistasis
Searching genomic interactions with linear/logistic regression in a high-dimensional dataset is a time-consuming task. This package provides some efficient ways to scan epistasis in genome-wide interaction studies (GWIS). Both case-control status (binary outcome) and quantitative phenotype (continuous outcome) are supported (the main references: 1. Kam-Thong, T., D. Czamara, K. Tsuda, K. Borgwardt, C. M. Lewis, A. Erhardt-Lehmann, B. Hemmer, et al. (2011). <doi:10.1038/ejhg.2010.196>. 2. Kam-Thong, T., B. Pütz, N. Karbalai, B. Müller-Myhsok, and K. Borgwardt. (2011). <doi:10.1093/bioinformatics/btr218>.)
Maintained by Beibei Jiang. Last updated 7 years ago.
6.8 match 2.00 scorebioc
scDotPlot:Cluster a Single-cell RNA-seq Dot Plot
Dot plots of single-cell RNA-seq data allow for an examination of the relationships between cell groupings (e.g. clusters) and marker gene expression. The scDotPlot package offers a unified approach to perform a hierarchical clustering analysis and add annotations to the columns and/or rows of a scRNA-seq dot plot. It works with SingleCellExperiment and Seurat objects as well as data frames.
Maintained by Benjamin I Laufer. Last updated 5 months ago.
softwarevisualizationdifferentialexpressiongeneexpressiontranscriptionrnaseqsinglecellsequencingclustering
2.7 match 2 stars 4.85 score 2 scriptsleospeidel
twigstats:twigstats
This package takes Relate genealogies as input to compute time-stratified f-statistics.
Maintained by Leo Speidel. Last updated 9 days ago.
3.3 match 13 stars 3.93 score 12 scriptsbioc
ClusterJudge:Judging Quality of Clustering Methods using Mutual Information
ClusterJudge implements the functions, examples and other software published as an algorithm by Gibbons, FD and Roth FP. The article is called "Judging the Quality of Gene Expression-Based Clustering Methods Using Gene Annotation" and it appeared in Genome Research, vol. 12, pp1574-1581 (2002). See package?ClusterJudge for an overview.
Maintained by Adrian Pasculescu. Last updated 2 months ago.
softwarestatisticalmethodclusteringgeneexpressiongo
3.6 match 3.48 score 3 scriptstweedell
motoRneuron:Analyzing Paired Neuron Discharge Times for Time-Domain Synchronization
The temporal relationship between motor neurons can offer explanations for neural strategies. We combined functions to reduce neuron action potential discharge data and analyze it for short-term, time-domain synchronization. In addition, motoRneuron combines most available methods for determining cross-correlation histogram peaks and most available indices for calculating synchronization into simple functions. See Nordstrom, Fuglevand, and Enoka (1992) <doi:10.1113/jphysiol.1992.sp019244> for a more thorough introduction.
Maintained by Andrew Tweedell. Last updated 6 years ago.
3.3 match 1 stars 3.74 score 11 scriptsmissiegobeats
OutliersLearn:Educational Outlier Package with Common Outlier Detection Algorithms
Provides implementations of some of the most important outlier detection algorithms. Includes a tutorial mode option that shows a description of each algorithm and provides a step-by-step execution explanation of how it identifies outliers from the given data with the specified input parameters. References include the works of Azzedine Boukerche, Lining Zheng, and Omar Alfandi (2020) <doi:10.1145/3381028>, Abir Smiti (2020) <doi:10.1016/j.cosrev.2020.100306>, and Xiaogang Su, Chih-Ling Tsai (2011) <doi:10.1002/widm.19>.
Maintained by Andres Missiego Manjon. Last updated 10 months ago.
2.5 match 1 stars 4.60 score 2 scriptsbioc
metaSeq:Meta-analysis of RNA-Seq count data in multiple studies
The probabilities from one-sided NOISeq are combined by Fisher's method or Stouffer's method.
Maintained by Koki Tsuyuzaki. Last updated 5 months ago.
rnaseqdifferentialexpressionsequencingimmunooncology
3.4 match 3.30 score 2 scriptsborishejblum
ludic:Linkage Using Diagnosis Codes
Probabilistic record linkage without direct identifiers using only diagnosis codes. Method is detailed in: Hejblum, Weber, Liao, Palmer, Churchill, Szolovits, Murphy, Kohane & Cai (2019) <doi: 10.1038/sdata.2018.298> ; Zhang, Hejblum, Weber, Palmer, Churchill, Szolovits, Murphy, Liao, Kohane & Cai (2021) <doi: 10.1101/2021.05.02.21256490>.
Maintained by Boris P Hejblum. Last updated 4 years ago.
3.6 match 2.81 score 13 scriptsprobic
tigreBrowserWriter:'tigreBrowser' Database Writer
Write modelling results into a database for 'tigreBrowser', a web-based tool for browsing figures and summary data of independent model fits, such as Gaussian process models fitted for each gene or other genomic element. The browser is available at <https://github.com/PROBIC/tigreBrowser>.
Maintained by Antti Honkela. Last updated 7 years ago.
3.5 match 2.70 score 8 scriptscran
schoRsch:Tools for Analyzing Factorial Experiments
Offers a helping hand to psychologists and other behavioral scientists who routinely deal with experimental data from factorial experiments. It includes several functions to format output from other R functions according to the style guidelines of the APA (American Psychological Association). This formatted output can be copied directly into manuscripts to facilitate data reporting. These features are backed up by a toolkit of several small helper functions, e.g., offering out-of-the-box outlier removal. The package lends its name to Georg "Schorsch" Schuessler, ingenious technician at the Department of Psychology III, University of Wuerzburg. For details on the implemented methods, see Roland Pfister and Markus Janczyk (2016) <doi: 10.20982/tqmp.12.2.p147>.
Maintained by Roland Pfister. Last updated 4 months ago.
3.8 match 2 stars 2.48 score 76 scriptstvedebrink
genogeographer:Methods for Analysing Forensic Ancestry Informative Markers
Evaluates likelihood ratio tests for alleged ancestry. Implements the methods of Tvedebrink et al (2018) <doi:10.1016/j.tpb.2017.12.004>.
Maintained by Torben Tvedebrink. Last updated 5 years ago.
8.9 match 1 stars 1.00 score 6 scriptsarissyntakas
GeneScoreR:Gene Scoring from Count Tables
Provides two methods for automatic calculation of gene scores from gene count tables: the z-score method, which requires a table of samples being scored and a count table with control samples, and the geometric mean method, which does not rely on control samples. The mathematical methods implemented are described by Kim et al. (2018) <doi:10.1089/jir.2017.0127>.
Maintained by Aris Syntakas. Last updated 5 months ago.
8.2 match 1.00 scorepboutros
OutSeekR:Statistical Approach to Outlier Detection in RNA-Seq and Related Data
An approach to outlier detection in RNA-seq and related data based on five statistics. 'OutSeekR' implements an outlier test by comparing the distributions of these statistics in observed data with those of simulated null data.
Maintained by Paul Boutros. Last updated 4 months ago.
4.0 match 2.00 score 2 scriptsamlinz
OTUtable:North Temperate Lakes - Microbial Observatory 16S Time Series Data and Functions
Analyses of OTU tables produced by 16S rRNA gene amplicon sequencing, as well as example data. It contains the data and scripts used in the paper Linz, et al. (2017) "Bacterial community composition and dynamics spanning five years in freshwater bog lakes," <doi: 10.1128/mSphere.00169-17>.
Maintained by Alexandra Linz. Last updated 7 years ago.
3.5 match 2.20 score 53 scriptsandriyprotsak5
UAHDataScienceO:Educational Outlier Detection Algorithms with Step-by-Step Tutorials
Provides implementations of some of the most important outlier detection algorithms. Includes a tutorial mode option that shows a description of each algorithm and provides a step-by-step execution explanation of how it identifies outliers from the given data with the specified input parameters. References include the works of Azzedine Boukerche, Lining Zheng, and Omar Alfandi (2020) <doi:10.1145/3381028>, Abir Smiti (2020) <doi:10.1016/j.cosrev.2020.100306>, and Xiaogang Su, Chih-Ling Tsai (2011) <doi:10.1002/widm.19>.
Maintained by Andriy Protsak Protsak. Last updated 1 month ago.
2.5 match 3.00 scorez0on
MCMC.qpcr:Bayesian Analysis of qRT-PCR Data
Quantitative RT-PCR data are analyzed using generalized linear mixed models based on lognormal-Poisson error distribution, fitted using MCMC. Control genes are not required but can be incorporated as Bayesian priors or, when template abundances correlate with conditions, as trackers of global effects (common to all genes). The package also implements a lognormal model for higher-abundance data and a "classic" model involving multi-gene normalization on a by-sample basis. Several plotting functions are included to extract and visualize results. The detailed tutorial is available here: <https://matzlab.weebly.com/uploads/7/6/2/2/76229469/mcmc.qpcr.tutorial.v1.2.4.pdf>.
Maintained by Mikhail V. Matz. Last updated 5 years ago.
3.3 match 2 stars 1.85 score 35 scriptschristianhuber
smartsnp:Fast Multivariate Analyses of Big Genomic Data
Fast computation of multivariate analyses of small (10s to 100s markers) to big (1000s to 100000s) genotype data. Runs Principal Component Analysis allowing for centering, z-score standardization and scaling for genetic drift, projection of ancient samples to modern genetic space and multivariate tests for differences in group location (Permutation-Based Multivariate Analysis of Variance) and dispersion (Permutation-Based Multivariate Analysis of Dispersion).
Maintained by Christian Huber. Last updated 1 year ago.
1.0 match 7 stars 5.32 score 6 scriptsbioc
bacon:Controlling bias and inflation in association studies using the empirical null distribution
Bacon can be used to remove inflation and bias often observed in epigenome- and transcriptome-wide association studies. To this end bacon constructs an empirical null distribution using a Gibbs Sampling algorithm by fitting a three-component normal mixture on z-scores.
Maintained by Maarten van Iterson. Last updated 5 months ago.
immunooncologystatisticalmethodbayesianregressiongenomewideassociationtranscriptomicsrnaseqmethylationarraybatcheffectmultiplecomparison
1.0 match 5.19 score 97 scriptsantonyborel
Laterality:Functions to Calculate Common Laterality Statistics in Primatology
Calculates and plots Handedness index (HI), absolute HI, mean HI and z-score which are commonly used indexes for the study of hand preference (laterality) in non-human primates.
Maintained by Antony Borel. Last updated 8 months ago.
4.4 match 1.15 score 14 scriptscran
UAHDataScienceO:Educational Outlier Detection Algorithms with Step-by-Step Tutorials
Provides implementations of some of the most important outlier detection algorithms. Includes a tutorial mode option that shows a description of each algorithm and provides a step-by-step execution explanation of how it identifies outliers from the given data with the specified input parameters. References include the works of Azzedine Boukerche, Lining Zheng, and Omar Alfandi (2020) <doi:10.1145/3381028>, Abir Smiti (2020) <doi:10.1016/j.cosrev.2020.100306>, and Xiaogang Su, Chih-Ling Tsai (2011) <doi:10.1002/widm.19>.
Maintained by Andriy Protsak Protsak. Last updated 24 days ago.
2.5 match 2.00 scoremurraymegan
FDRestimation:Estimate, Plot, and Summarize False Discovery Rates
The user can directly compute and display false discovery rates from inputted p-values or z-scores under a variety of assumptions. p.fdr() computes FDRs, adjusted p-values and decision reject vectors from inputted p-values or z-values. get.pi0() estimates the proportion of data that are truly null. plot.p.fdr() plots the FDRs, adjusted p-values, and the raw p-values points against their rejection threshold lines.
Maintained by Megan Murray. Last updated 3 years ago.
1.3 match 6 stars 3.65 score 15 scriptsbrendensm
misuvi:Access the Michigan Substance Use Vulnerability Index (MI-SUVI)
Easily import the MI-SUVI data sets. The user can import data sets with full metrics, percentiles, Z-scores, or rankings. Data is available at both the County and Zip Code Tabulation Area (ZCTA) levels. This package also includes a function to import shape files for easy mapping and a function to access the full technical documentation. All data is sourced from the Michigan Department of Health and Human Services.
Maintained by Brenden Smith. Last updated 1 month ago.
1.0 match 3.40 scorecran
GhostKnockoff:The Knockoff Inference Using Summary Statistics
Functions for multiple knockoff inference using summary statistics, e.g. Z-scores. The knockoff inference is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. This package provides a procedure which performs knockoff inference without ever constructing individual knockoffs (GhostKnockoff). It additionally supports multiple knockoff inference for improved stability and reproducibility. Moreover, it supports meta-analysis of multiple overlapping studies.
Maintained by Zihuai He. Last updated 3 years ago.
2.9 match 1 stars 1.00 scoreyuande
signatureSurvival:Signature Survival Analysis
When multiple Cox proportional hazard models are performed on clinical data (month or year and status) and a set of differential expressions of genes, the results (hazard risks, z-scores and p-values) can be used to create gene-expression signatures. Weights are calculated using the survival p-values of genes and are utilized to calculate expression values of the signature across the selected genes in all patients in a cohort. Single or multiple univariate or multivariate Cox proportional hazard survival analyses of the patients in one cohort can be performed by using the gene-expression signature and visualized using our survival plots.
Maintained by Yuan-De Tan. Last updated 2 years ago.
2.9 match 1 stars 1.00 scoregjhunt
rrscale:Robust Re-Scaling to Better Recover Latent Effects in Data
Non-linear transformations of data to better discover latent effects. Applies a sequence of three transformations: (1) a Gaussianizing transformation, (2) a Z-score transformation, and (3) an outlier removal transformation. A publication describing the method has the following citation: Gregory J. Hunt, Mark A. Dane, James E. Korkola, Laura M. Heiser & Johann A. Gagnon-Bartsch (2020) "Automatic Transformation and Integration to Improve Visualization and Discovery of Latent Effects in Imaging Data", Journal of Computational and Graphical Statistics, <doi:10.1080/10618600.2020.1741379>.
Maintained by Gregory Hunt. Last updated 5 years ago.
1.0 match 2.30 score 9 scriptsxiaoran831213
dotgen:Gene-Set Analysis via Decorrelation by Orthogonal Transformation
Decorrelates a set of summary statistics (i.e., Z-scores or P-values per SNP) via the Decorrelation by Orthogonal Transformation (DOT) approach and performs gene-set analyses by combining transformed statistic values; operations are performed with algorithms that rely only on the association summary results and the linkage disequilibrium (LD). For more details on DOT and its power, see Olga (2020) <doi:10.1371/journal.pcbi.1007819>.
Maintained by Xiaoran Tong. Last updated 4 years ago.
1.0 match 2.00 score 1 scriptswpihongzhang
GFisher:Generalized Fisher's Combination Tests Under Dependence
Accurate and computationally efficient p-value calculation methods for a general family of Fisher type statistics (GFisher). The GFisher covers Fisher's combination, Good's statistic, Lancaster's statistic, weighted Z-score combination, etc. It allows a flexible weighting scheme, as well as an omnibus procedure that automatically adapts proper weights and degrees of freedom to the given data. The new p-value calculation methods are based on novel ideas of moment-ratio matching and joint-distribution approximation. The technical details can be found in Hong Zhang and Zheyang Wu (2020) <arXiv:2003.01286>.
Maintained by Hong Zhang. Last updated 3 years ago.
1.0 match 1.00 score