Showing 200 of 2285 total results
ngmarchant
comparator:Comparison Functions for Clustering and Record Linkage
Implements functions for comparing strings, sequences and numeric vectors for clustering and record linkage applications. Supported comparison functions include: generalized edit distances for comparing sequences/strings, Monge-Elkan similarity for fuzzy comparison of token sets, and L-p distances for comparing numeric vectors. Where possible, comparison functions are implemented in C/C++ to ensure good performance.
Maintained by Neil Marchant. Last updated 3 years ago.
clustering distance-measures distance-metrics entity-resolution record-linkage similarity-measures string-similarity cpp
94.7 match 18 stars 4.63 score 47 scripts
cran
compare:Comparing Objects for Differences
Functions to compare a model object to a comparison object. If the objects are not identical, the functions can be instructed to explore various modifications of the objects (e.g., sorting rows, dropping names) to see if the modified versions are identical.
Maintained by Paul Murrell. Last updated 10 years ago.
80.3 match 4.68 score 5 dependents
emmanuelparadis
ape:Analyses of Phylogenetics and Evolution
Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.
Maintained by Emmanuel Paradis. Last updated 1 month ago.
19.4 match 64 stars 17.18 score 13k scripts 601 dependents
collinerickson
comparer:Compare Output and Run Time
Quickly run experiments to compare the run time and output of code blocks. The function mbc() can make fast comparisons of code, and will calculate statistics comparing the resulting outputs. It can be used to compare model fits to the same data or see which function runs faster. The R6 class ffexp$new() runs a function using all possible combinations of selected inputs. This is useful for comparing the effect of different parameter values. It can also run in parallel and automatically save intermediate results, which is very useful for long computations.
Maintained by Collin Erickson. Last updated 5 months ago.
60.6 match 4 stars 5.38 score 20 scripts
rspatial
terra:Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 4 hours ago.
geospatial raster spatial vector onetbb proj gdal geos cpp
15.9 match 559 stars 17.64 score 17k scripts 851 dependents
davidorme
caper:Comparative Analyses of Phylogenetics and Evolution in R
Functions for performing phylogenetic comparative analyses.
Maintained by David Orme. Last updated 1 year ago.
32.3 match 1 stars 7.41 score 928 scripts 5 dependents
bioc
maftools:Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large-scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.
Maintained by Anand Mayakonda. Last updated 5 months ago.
datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib
15.1 match 459 stars 14.63 score 948 scripts 18 dependents
geomorphr
geomorph:Geometric Morphometric Analyses of 2D and 3D Landmark Data
Read, manipulate, and digitize landmark data, generate shape variables via Procrustes analysis for points, curves and surfaces, perform shape analyses, and provide graphical depictions of shapes and patterns of shape variation.
Maintained by Dean Adams. Last updated 1 month ago.
16.1 match 76 stars 12.05 score 700 scripts 6 dependents
amices
mice:Multivariate Imputation by Chained Equations
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
Maintained by Stef van Buuren. Last updated 6 days ago.
chained-equations fcs imputation mice missing-data missing-values multiple-imputation multivariate-data cpp
11.0 match 462 stars 16.50 score 10k scripts 154 dependents
capitalone
dataCompareR:Compare Two Data Frames and Summarise the Difference
Easy comparison of two tabular data objects in R. Specifically designed to show differences between two sets of data in a useful way that should make it easier to understand the differences, and if necessary, help you work out how to remedy them. Aims to offer a more useful output than all.equal() when your two data sets do not match, but isn't intended to replace all.equal() as a way to test for equality.
Maintained by Sarah Johnston. Last updated 2 years ago.
compare-data data data-analysis data-science
22.5 match 76 stars 7.24 score 76 scripts
hfgolino
EGAnet:Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics
Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.
Maintained by Hudson Golino. Last updated 9 days ago.
20.0 match 47 stars 7.80 score 61 scripts 1 dependents
rspatial
raster:Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.
Maintained by Robert J. Hijmans. Last updated 2 months ago.
8.8 match 164 stars 17.05 score 58k scripts 555 dependents
bioc
survcomp:Performance Assessment and Comparison for Survival Analysis
Assessment and Comparison for Performance of Risk Prediction (Survival) Models.
Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.
geneexpression differentialexpression visualization cpp
17.7 match 8.46 score 448 scripts 12 dependents
talgalili
dendextend:Extending 'dendrogram' Functionality in R
Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.
Maintained by Tal Galili. Last updated 2 months ago.
8.5 match 154 stars 17.02 score 6.0k scripts 164 dependents
thackl
gggenomes:A Grammar of Graphics for Comparative Genomics
An extension of 'ggplot2' for creating complex genomic maps. It builds on the power of 'ggplot2' and 'tidyverse' adding new 'ggplot2'-style geoms & positions and 'dplyr'-style verbs to manipulate the underlying data. It implements a layout concept inspired by 'ggraph' and introduces tracks to bring tidiness to the mess that is genomics data.
Maintained by Thomas Hackl. Last updated 1 month ago.
biological-data comparative-genomics genomics-visualization ggplot-extension ggplot2
14.8 match 650 stars 9.56 score 123 scripts
tidyverse
tibble:Simple Data Frames
Provides a 'tbl_df' class (the 'tibble') with stricter checking and better formatting than the traditional data frame.
Maintained by Kirill Müller. Last updated 3 months ago.
6.1 match 692 stars 22.78 score 47k scripts 11k dependents
braverock
PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis
Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.
Maintained by Brian G. Peterson. Last updated 3 months ago.
8.4 match 222 stars 15.93 score 4.8k scripts 20 dependents
kingaa
ouch:Ornstein-Uhlenbeck Models for Phylogenetic Comparative Hypotheses
Fit and compare Ornstein-Uhlenbeck models for evolution along a phylogenetic tree.
Maintained by Aaron A. King. Last updated 4 months ago.
adaptive-regimebrownian-motionornstein-uhlenbeckornstein-uhlenbeck-modelsouchphylogenetic-comparative-hypothesesphylogenetic-comparative-methodsphylogenetic-datareact
18.6 match 15 stars 6.87 score 68 scripts 4 dependents
stan-dev
loo:Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models
Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.
Maintained by Jonah Gabry. Last updated 2 days ago.
bayes bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics cross-validation information-criterion model-comparison stan
7.2 match 152 stars 17.30 score 2.6k scripts 297 dependents
walkerke
mapgl:Interactive Maps with 'Mapbox GL JS' and 'MapLibre GL JS'
Provides an interface to the 'Mapbox GL JS' (<https://docs.mapbox.com/mapbox-gl-js/guides>) and the 'MapLibre GL JS' (<https://maplibre.org/maplibre-gl-js/docs/>) interactive mapping libraries to help users create custom interactive maps in R. Users can create interactive globe visualizations; layer 'sf' objects to create filled maps, circle maps, 'heatmaps', and three-dimensional graphics; and customize map styles and views. The package also includes utilities to use 'Mapbox' and 'MapLibre' maps in 'Shiny' web applications.
Maintained by Kyle Walker. Last updated 10 hours ago.
15.2 match 114 stars 8.06 score 138 scripts
skranz
RTutor:Interactive R problem sets with automatic testing of solutions and automatic hints
Interactive R problem sets with automatic testing of solutions and automatic hints
Maintained by Sebastian Kranz. Last updated 1 year ago.
economics learn-to-code problem-set rstudio rtutor shiny teaching
19.6 match 205 stars 5.83 score 111 scripts 1 dependents
paternogbc
sensiPhy:Sensitivity Analysis for Comparative Methods
An implementation of sensitivity analysis for phylogenetic comparative methods. The package is an umbrella of statistical and graphical methods that estimate and report different types of uncertainty in PCM: (i) Species Sampling uncertainty (sample size; influential species and clades). (ii) Phylogenetic uncertainty (different topologies and/or branch lengths). (iii) Data uncertainty (intraspecific variation and measurement error).
Maintained by Gustavo Paterno. Last updated 5 years ago.
comparative-methods ecology evolution phylogenetics sensitivity-analysis
17.6 match 13 stars 6.38 score 61 scripts
cran
nlme:Linear and Nonlinear Mixed Effects Models
Fit and compare Gaussian linear and nonlinear mixed-effects models.
Maintained by R Core Team. Last updated 2 months ago.
8.6 match 6 stars 13.00 score 13k scripts 8.7k dependents
ax3man
phylopath:Perform Phylogenetic Path Analysis
A comprehensive and easy to use R implementation of confirmatory phylogenetic path analysis as described by Von Hardenberg and Gonzalez-Voyer (2012) <doi:10.1111/j.1558-5646.2012.01790.x>.
Maintained by Wouter van der Bijl. Last updated 6 months ago.
analysis comparative-methods path phylogenetics
13.7 match 13 stars 8.10 score 81 scripts 1 dependents
ms609
TreeDist:Calculate and Map Distances Between Phylogenetic Trees
Implements measures of tree similarity, including information-based generalized Robinson-Foulds distances (Phylogenetic Information Distance, Clustering Information Distance, Matching Split Information Distance; Smith 2020) <doi:10.1093/bioinformatics/btaa614>; Jaccard-Robinson-Foulds distances (Bocker et al. 2013) <doi:10.1007/978-3-642-40453-5_13>, including the Nye et al. (2006) metric <doi:10.1093/bioinformatics/bti720>; the Matching Split Distance (Bogdanowicz & Giaro 2012) <doi:10.1109/TCBB.2011.48>; Maximum Agreement Subtree distances; the Kendall-Colijn (2016) distance <doi:10.1093/molbev/msw124>, and the Nearest Neighbour Interchange (NNI) distance, approximated per Li et al. (1996) <doi:10.1007/3-540-61332-3_168>. Includes tools for visualizing mappings of tree space (Smith 2022) <doi:10.1093/sysbio/syab100>, for identifying islands of trees (Silva and Wilkinson 2021) <doi:10.1093/sysbio/syab015>, for calculating the median of sets of trees, and for computing the information content of trees and splits.
Maintained by Martin R. Smith. Last updated 1 month ago.
phylogenetics tree-distance phylogenetic-trees tree-distances trees cpp
10.3 match 32 stars 10.32 score 97 scripts 5 dependents
lem-usp
evolqg:Evolutionary Quantitative Genetics
Provides functions for covariance matrix comparisons, estimation of repeatabilities in measurements and matrices, and general evolutionary quantitative genetics tools. Melo D, Garcia G, Hubbe A, Assis A P, Marroig G. (2016) <doi:10.12688/f1000research.7082.3>.
Maintained by Diogo Melo. Last updated 11 months ago.
16.7 match 10 stars 6.26 score 114 scripts
mjskay
tidybayes:Tidy Data and 'Geoms' for Bayesian Models
Compose data for and extract, manipulate, and visualize posterior draws from Bayesian models ('JAGS', 'Stan', 'rstanarm', 'brms', 'MCMCglmm', 'coda', ...) in a tidy data format. Functions are provided to help extract tidy data frames of draws from Bayesian models and that generate point summaries and intervals in a tidy format. In addition, 'ggplot2' 'geoms' and 'stats' are provided for common visualization primitives like points with multiple uncertainty intervals, eye plots (intervals plus densities), and fit curves with multiple, arbitrary uncertainty bands.
Maintained by Matthew Kay. Last updated 6 months ago.
bayesian-data-analysis brms ggplot2 jags stan tidy-data visualization
7.0 match 732 stars 14.88 score 7.3k scripts 19 dependents
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 2 days ago.
complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp
4.8 match 581 stars 21.10 score 31k scripts 1.9k dependents
bioc
SparseArray:High-performance sparse data representation and manipulation in R
The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.
Maintained by Hervé Pagès. Last updated 23 days ago.
infrastructure datarepresentation bioconductor-package core-package openmp
7.9 match 8 stars 12.68 score 79 scripts 1.2k dependents
kbroman
qtl:Tools for Analyzing QTL Experiments
Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>.
Maintained by Karl W Broman. Last updated 7 months ago.
7.5 match 80 stars 12.79 score 2.4k scripts 29 dependents
andrewljackson
SIBER:Stable Isotope Bayesian Ellipses in R
Fits bi-variate ellipses to stable isotope data using Bayesian inference with the aim being to describe and compare their isotopic niche.
Maintained by Andrew Jackson. Last updated 10 months ago.
community-ecology ecology niche-modelling stable-isotopes jags cpp
10.5 match 36 stars 9.13 score 187 scripts 1 dependents
bioc
YAPSA:Yet Another Package for Signature Analysis
This package provides functions and routines for supervised analyses of mutational signatures (i.e., the signatures have to be known, cf. L. Alexandrov et al., Nature 2013 and L. Alexandrov et al., bioRxiv 2018). In particular, the family of functions LCD (LCD = linear combination decomposition) can use optimal signature-specific cutoffs, which take care of the different detectability of the different signatures. Moreover, the package provides different sets of mutational signatures, including the COSMIC and PCAWG SNV signatures and the PCAWG Indel signatures; the latter implies that with YAPSA, the concept of supervised analysis of mutational signatures is extended to Indel signatures. YAPSA also provides confidence intervals as computed by profile likelihoods and can perform signature analysis on a stratified mutational catalogue (SMC = stratify mutational catalogue) in order to analyze enrichment and depletion patterns for the signatures in different strata.
Maintained by Zuguang Gu. Last updated 5 months ago.
sequencing dnaseq somaticmutation visualization clustering genomicvariation statisticalmethod biologicalquestion
14.8 match 6.41 score 57 scripts
bioc
cogeqc:Systematic quality checks on comparative genomics analyses
cogeqc aims to facilitate systematic quality checks on standard comparative genomics analyses to help researchers detect issues and select the most suitable parameters for each data set. cogeqc can be used to assess: i. genome assembly and annotation quality with BUSCOs and comparisons of statistics with publicly available genomes on the NCBI; ii. orthogroup inference using a protein domain-based approach; and iii. synteny detection using synteny network properties. There are also data visualization functions to explore QC summary statistics.
Maintained by Fabrício Almeida-Silva. Last updated 5 months ago.
software genomeassembly comparativegenomics functionalgenomics phylogenetics qualitycontrol network comparative-genomics evolutionary-genomics
15.6 match 10 stars 6.08 score 20 scripts
gefeizhang
statVisual:Statistical Visualization Tools
Visualization functions in the applications of translational medicine (TM) and biomarker (BM) development to compare groups by statistically visualizing data and/or results of analyses, such as visualizing data by displaying in one figure different groups' histograms, boxplots, densities, scatter plots, error-bar plots, or trajectory plots, by displaying scatter plots of top principal components or dendrograms with data points colored based on group information, or visualizing volcano plots to check the results of whole genome analyses for gene differential expression.
Maintained by Wenfei Zhang. Last updated 5 years ago.
31.5 match 3.00 score 3 scripts
insightsengineering
rtables:Reporting Tables
Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.
Maintained by Joe Zhu. Last updated 2 months ago.
6.7 match 232 stars 13.65 score 238 scripts 17 dependents
donaldrwilliams
BGGM:Bayesian Gaussian Graphical Models
Fit Bayesian Gaussian graphical models. The methods are separated into two Bayesian approaches for inference: hypothesis testing and estimation. There are extensions for confirmatory hypothesis testing, comparing Gaussian graphical models, and node wise predictability. These methods were recently introduced in the Gaussian graphical model literature, including Williams (2019) <doi:10.31234/osf.io/x8dpr>, Williams and Mulder (2019) <doi:10.31234/osf.io/ypxd8>, Williams, Rast, Pericchi, and Mulder (2019) <doi:10.31234/osf.io/yt386>.
Maintained by Philippe Rast. Last updated 3 months ago.
bayes-factors bayesian-hypothesis-testing gaussian-graphical-models openblas cpp openmp
9.4 match 55 stars 9.64 score 102 scripts 1 dependents
biodiverse
ubms:Bayesian Models for Data from Unmarked Animals using 'Stan'
Fit Bayesian hierarchical models of animal abundance and occurrence via the 'rstan' package, the R interface to the 'Stan' C++ library. Supported models include single-season occupancy, dynamic occupancy, and N-mixture abundance models. Covariates on model parameters are specified using a formula-based interface similar to package 'unmarked', while also allowing for estimation of random slope and intercept terms. References: Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>; Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 17 days ago.
distance-sampling hierarchical-models n-mixture-model occupancy stan openblas cpp
11.3 match 35 stars 7.88 score 73 scripts
richarddmorey
BayesFactor:Computation of Bayes Factors for Common Designs
A suite of functions for computing various Bayes factors for simple designs, including contingency tables, one- and two-sample designs, one-way designs, general ANOVA designs, and linear regression.
Maintained by Richard D. Morey. Last updated 1 year ago.
6.5 match 133 stars 13.70 score 1.7k scripts 21 dependents
bioc
ComplexHeatmap:Make Complex Heatmaps
Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential patterns. Here the ComplexHeatmap package provides a highly flexible way to arrange multiple heatmaps and supports various annotation graphics.
Maintained by Zuguang Gu. Last updated 5 months ago.
software visualization sequencing clustering complex-heatmaps heatmap
5.2 match 1.3k stars 16.93 score 16k scripts 151 dependents
bioc
musicatk:Mutational Signature Comprehensive Analysis Toolkit
Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic feature associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.
Maintained by Joshua D. Campbell. Last updated 5 months ago.
software biologicalquestion somaticmutation variantannotation
12.4 match 13 stars 7.02 score 20 scripts
kajlinko
testCompareR:Comparing Two Diagnostic Tests with Dichotomous Results using Paired Data
Provides a method for comparing the results of two binary diagnostic tests using paired data. Users can rapidly perform descriptive and inferential statistics in a single function call. Options permit users to select which parameters they are interested in comparing and methods for correction for multiple comparisons. Confidence intervals are calculated using the methods with the best coverage. Hypothesis tests use the methods with the best asymptotic performance. A summary of the methods is available in Roldán-Nofuentes (2020) <doi:10.1186/s12874-020-00988-y>. This package is targeted at clinical researchers who want to rapidly and effectively compare results from binary diagnostic tests.
Maintained by Kyle J. Wilson. Last updated 4 months ago.
19.8 match 4.30 score 4 scripts
bioc
variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments
Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.
Maintained by Gabriel E. Hoffman. Last updated 2 months ago.
rnaseq geneexpression genesetenrichment differentialexpression batcheffect qualitycontrol regression epigenetics functionalgenomics transcriptomics normalization preprocessing microarray immunooncology software
7.2 match 7 stars 11.69 score 1.1k scripts 3 dependents
joe-chelladurai
uxr:User Experience Research
Provides convenience functions for user experience research with an emphasis on quantitative user experience testing and reporting. The functions are designed to translate statistical approaches to applied user experience research.
Maintained by Joe Chelladurai. Last updated 2 years ago.
quantitative statistics ux-research
22.7 match 1 stars 3.70 score 10 scripts
bioc
clustifyr:Classifier for Single-cell RNA-seq Using Cell Clusters
Package designed to aid in classifying cells from single-cell RNA sequencing data using external reference data (e.g., bulk RNA-seq, scRNA-seq, microarray, gene lists). A variety of correlation based methods and gene list enrichment methods are provided to assist cell type assignment.
Maintained by Rui Fu. Last updated 5 months ago.
singlecell annotation sequencing microarray geneexpression assign-identities clusters marker-genes rna-seq single-cell-rna-seq
8.5 match 119 stars 9.63 score 296 scripts
dgbonett
vcmeta:Varying Coefficient Meta-Analysis
Implements functions for varying coefficient meta-analysis methods. These methods do not assume effect size homogeneity. Subgroup effect size comparisons, general linear effect size contrasts, and linear models of effect sizes based on varying coefficient methods can be used to describe effect size heterogeneity. Varying coefficient meta-analysis methods do not require the unrealistic assumptions of the traditional fixed-effect and random-effects meta-analysis methods. For details see: Statistical Methods for Psychologists, Volume 5, <https://dgbonett.sites.ucsc.edu/>.
Maintained by Douglas G. Bonett. Last updated 8 months ago.
27.1 match 1 stars 3.00 score 8 scripts
pecanproject
PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.
Maintained by David LeBauer. Last updated 1 day ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants
6.7 match 216 stars 11.59 score 64 scripts 14 dependents
jiscah
sequoia:Pedigree Inference from SNPs
Multi-generational pedigree inference from incomplete data on hundreds of SNPs, including parentage assignment and sibship clustering. See Huisman (2017) (<DOI:10.1111/1755-0998.12665>) for more information.
Maintained by Jisca Huisman. Last updated 9 months ago.
pedigree pedigree-reconstruction pedigrees sequoia snp snp-data fortran
10.5 match 26 stars 7.40 score 79 scripts
jclavel
mvMORPH:Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data
Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous traits evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.
Maintained by Julien Clavel. Last updated 1 month ago.
8.2 match 17 stars 9.46 score 189 scripts 3 dependents
bioc
SPIA:Signaling Pathway Impact Analysis (SPIA) using combined evidence of pathway over-representation and unusual signaling perturbations
This package implements the Signaling Pathway Impact Analysis (SPIA), which uses the information from a list of differentially expressed genes and their log fold changes, together with signaling pathway topology, in order to identify the pathways most relevant to the condition under study.
Maintained by Adi Laurentiu Tarca. Last updated 2 months ago.
11.7 match 6.62 score 113 scripts 4 dependents
r-lib
waldo:Find Differences Between R Objects
Compare complex R objects and reveal the key differences. Designed particularly for use in testing packages where being able to quickly isolate key differences makes understanding test failures much easier.
Maintained by Hadley Wickham. Last updated 4 months ago.
5.5 match 291 stars 13.95 score 143 scripts 480 dependents
cristianetaniguti
onemap:Construction of Genetic Maps in Experimental Crosses
Analysis of molecular marker data from model (backcrosses, F2 and recombinant inbred lines) and non-model systems (i.e., outcrossing species). For the latter, it allows statistical analysis by simultaneously estimating linkage and linkage phases (genetic map construction) according to Wu et al. (2002) <doi:10.1006/tpbi.2002.1577>. All analyses are based on multipoint approaches using hidden Markov models.
Maintained by Cristiane Taniguti. Last updated 2 months ago.
11.5 match 3 stars 6.58 score 183 scripts
vandomed
tab:Create Summary Tables for Statistical Reports
Contains functions for creating various types of summary tables, e.g. comparing characteristics across levels of a categorical variable and summarizing fitted generalized linear models, generalized estimating equations, and Cox proportional hazards models. Functions are available to handle data from simple random samples as well as complex surveys.
Maintained by Dane R. Van Domelen. Last updated 4 years ago.
manuscriptsreportsreproducible-researchstatisticstables
10.8 match 2 stars 6.97 score 86 scripts 9 dependents
bioc
cola:A Framework for Consensus Partitioning
Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement with known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionality to compare results in a straightforward way. 4. It provides a new method to extract features which are more efficient at separating subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.
Maintained by Zuguang Gu. Last updated 1 month ago.
clusteringgeneexpressionclassificationsoftwareconsensus-clusteringcpp
9.9 match 61 stars 7.49 score 112 scripts
alexsanjoseph
compareDF:Do a Git Style Diff of the Rows Between Two Dataframes with Similar Structure
Compares two dataframes which have the same column structure to show the rows that have changed. Also gives a git style diff format to quickly see what has changed in addition to summary statistics.
Maintained by Alex Joseph. Last updated 1 year ago.
10.0 match 93 stars 7.30 score 119 scripts 2 dependents
winvector
WVPlots:Common Plots for Analysis
Select data analysis plots, under a standardized calling interface implemented on top of 'ggplot2' and 'plotly'. Plots of interest include: 'ROC', gain curve, scatter plot with marginal distributions, conditioned scatter plot with marginal densities, box and stem with matching theoretical distribution, and density with matching theoretical distribution.
Maintained by John Mount. Last updated 11 months ago.
9.0 match 85 stars 8.00 score 280 scripts
bioc
sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips
Tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity and sex, and advanced quality control routines.
Maintained by Wanding Zhou. Last updated 2 months ago.
dnamethylationmethylationarraypreprocessingqualitycontrolbioinformaticsdna-methylationmicroarray
7.9 match 69 stars 9.08 score 258 scripts 1 dependents
bioc
Biobase:Biobase: Base functions for Bioconductor
Functions that are needed by many other packages or which replace R functions.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructurebioconductor-packagecore-package
4.3 match 9 stars 16.45 score 6.6k scripts 1.8k dependents
marce10
warbleR:Streamline Bioacoustic Analysis
Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools to explore and quantify acoustic signal structure. The package allows users to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
animal-acoustic-signalsaudio-processingbioacousticsspectrogramstreamline-analysiscpp
6.5 match 54 stars 11.01 score 270 scripts 4 dependents
framverse
framrsquared:FRAM Database Interface
A convenient tool for interfacing with FRAM access databases in R environments.
Maintained by Ty Garber. Last updated 2 months ago.
14.0 match 6 stars 5.06 score 9 scripts
data-cleaning
validate:Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 11 days ago.
5.7 match 418 stars 12.50 score 448 scripts 9 dependents
lukejharmon
geiger:Analysis of Evolutionary Diversification
Methods for fitting macroevolutionary models to phylogenetic trees. See Pennell (2014) <doi:10.1093/bioinformatics/btu181>.
Maintained by Luke Harmon. Last updated 2 years ago.
9.0 match 1 stars 7.84 score 2.3k scripts 28 dependents
jhelvy
logitr:Logit Models w/Preference & WTP Space Utility Parameterizations
Fast estimation of multinomial (MNL) and mixed logit (MXL) models in R. Models can be estimated using "Preference" space or "Willingness-to-pay" (WTP) space utility parameterizations. Weighted models can also be estimated. An option is available to run a parallelized multistart optimization loop with random starting points in each iteration, which is useful for non-convex problems like MXL models or models with WTP space utility parameterizations. The main optimization loop uses the 'nloptr' package to minimize the negative log-likelihood function. Additional functions are available for computing and comparing WTP from both preference space and WTP space models and for predicting expected choices and choice probabilities for sets of alternatives based on an estimated model. Mixed logit models can include uncorrelated or correlated heterogeneity covariances and are estimated using maximum simulated likelihood based on the algorithms in Train (2009) <doi:10.1017/CBO9780511805271>. More details can be found in Helveston (2023) <doi:10.18637/jss.v105.i10>.
Maintained by John Helveston. Last updated 4 months ago.
log-likelihoodlogitlogit-modelmixed-logitmlogitmultinomial-regressionmxlmxl-modelspreference-spacepreferenceswillingness-to-paywtp
7.8 match 54 stars 9.10 score 119 scripts 1 dependents
rqtl
qtl2:Quantitative Trait Locus Mapping in Experimental Crosses
Provides a set of tools to perform quantitative trait locus (QTL) analysis in experimental crosses. It is a reimplementation of the 'R/qtl' package to better handle high-dimensional data and complex cross designs. Broman et al. (2019) <doi:10.1534/genetics.118.301595>.
Maintained by Karl W Broman. Last updated 8 days ago.
7.4 match 34 stars 9.48 score 1.1k scripts 5 dependents
bluefoxr
COINr:Composite Indicator Construction and Analysis
A comprehensive high-level package for composite indicator construction and analysis. It is a "development environment" for composite indicators and scoreboards, which includes utilities for construction (indicator selection, denomination, imputation, data treatment, normalisation, weighting and aggregation) and analysis (multivariate analysis, correlation plotting, shortcuts for principal component analysis, global sensitivity analysis, and more). A composite indicator is completely encapsulated inside a single hierarchical list called a "coin". This allows a fast and efficient workflow, as well as making quick copies, testing methodological variations and making comparisons. It also includes many plotting options, both statistical (scatter plots, distribution plots) and for presenting results.
Maintained by William Becker. Last updated 2 months ago.
7.8 match 26 stars 9.07 score 73 scripts 1 dependents
tarnduong
ks:Kernel Smoothing
Kernel smoothers for univariate and multivariate data, with comprehensive visualisation and bandwidth selection capabilities, including for densities, density derivatives, cumulative distributions, clustering, classification, density ridges, significant modal regions, and two-sample hypothesis tests. Chacon & Duong (2018) <doi:10.1201/9780429485572>.
Maintained by Tarn Duong. Last updated 6 months ago.
6.9 match 6 stars 10.14 score 920 scripts 262 dependents
arcaldwell49
TOSTER:Two One-Sided Tests (TOST) Equivalence Testing
Two one-sided tests (TOST) procedure to test equivalence for t-tests, correlations, differences between proportions, and meta-analyses, including power analysis for t-tests and correlations. Allows you to specify equivalence bounds in raw scale units or in terms of effect sizes. See: Lakens (2017) <doi:10.1177/1948550617697177>.
Maintained by Aaron Caldwell. Last updated 1 month ago.
10.3 match 6.77 score 266 scripts
gbradburd
conStruct:Models Spatially Continuous and Discrete Population Genetic Structure
A method for modeling genetic data as a combination of discrete layers, within each of which relatedness may decay continuously with geographic distance. This package contains code for running analyses (which are implemented in the modeling language 'rstan') and visualizing and interpreting output. See the paper for more details on the model and its utility.
Maintained by Gideon Bradburd. Last updated 1 year ago.
8.3 match 35 stars 8.39 score 70 scripts
gagolews
stringi:Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 1 month ago.
icuicu4cnatural-language-processingnlpregexregexpstring-manipulationstringistringrtexttext-processingtidy-dataunicodecpp
3.8 match 309 stars 18.31 score 10k scripts 8.6k dependents
r-lib
cli:Helpers for Developing Command Line Interfaces
A suite of tools to build attractive command line interfaces ('CLIs'), from semantic elements: headings, lists, alerts, paragraphs, etc. Supports custom themes via a 'CSS'-like language. It also contains a number of lower level 'CLI' elements: rules, boxes, trees, and 'Unicode' symbols with 'ASCII' alternatives. It supports ANSI colors and text styles as well.
Maintained by Gábor Csárdi. Last updated 17 hours ago.
3.5 match 664 stars 19.33 score 1.4k scripts 14k dependents
dwarton
ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)
Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.
Maintained by David Warton. Last updated 1 year ago.
10.4 match 8 stars 6.58 score 53 scripts
luisdva
unheadr:Handle Data with Messy Header Rows and Broken Values
Verb-like functions to work with messy data, often derived from spreadsheets or parsed PDF tables. Includes functions for unwrapping values broken up across rows, relocating embedded grouping values, and annotating meaningful formatting in spreadsheet files.
Maintained by Luis D. Verde Arregoitia. Last updated 10 months ago.
10.6 match 61 stars 6.44 score 45 scripts
adeverse
adephylo:Exploratory Analyses for the Phylogenetic Comparative Method
Multivariate tools to analyze comparative data, i.e. a phylogeny and some traits measured for each taxon. The package contains functions to represent comparative data, compute phylogenetic proximities, perform multivariate analysis with phylogenetic constraints and test for the presence of phylogenetic autocorrelation. The package is described in Jombart et al. (2010) <doi:10.1093/bioinformatics/btq292>.
Maintained by Aurélie Siberchicot. Last updated 2 days ago.
6.7 match 9 stars 10.05 score 312 scripts 4 dependents
yulab-smu
scholar:Analyse Citation Data from Google Scholar
Provides functions to extract citation data from Google Scholar. Convenience functions are also provided for comparing multiple scholars and predicting future h-index values.
Maintained by Guangchuang Yu. Last updated 1 year ago.
7.0 match 43 stars 9.63 score 468 scripts 3 dependents
tushiqi
MAnorm2:Tools for Normalizing and Comparing ChIP-seq Samples
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the premier technology for profiling genome-wide localization of chromatin-binding proteins, including transcription factors and histones with various modifications. This package provides a robust method for normalizing ChIP-seq signals across individual samples or groups of samples. It also designs a self-contained system of statistical models for calling differential ChIP-seq signals between two or more biological conditions as well as for calling hypervariable ChIP-seq signals across samples. Refer to Tu et al. (2021) <doi:10.1101/gr.262675.120> and Chen et al. (2022) <doi:10.1186/s13059-022-02627-9> for associated statistical details.
Maintained by Shiqi Tu. Last updated 2 years ago.
chip-seqdifferential-analysisempirical-bayeswinsorize-values
12.1 match 32 stars 5.48 score 19 scripts
deboerk
cocor:Comparing Correlations
Statistical tests for the comparison between two correlations based on either independent or dependent groups. Dependent correlations can either be overlapping or nonoverlapping. A web interface is available on the website <http://comparingcorrelations.org>. A plugin for the R GUI and IDE RKWard is included. Please install RKWard from <https://rkward.kde.org> to use this feature. The respective R package 'rkward' cannot be installed directly from a repository, as it is a part of RKWard.
Maintained by Birk Diedenhofen. Last updated 3 years ago.
12.9 match 1 stars 5.15 score 151 scripts 9 dependents
bioc
PDATK:Pancreatic Ductal Adenocarcinoma Tool-Kit
Pancreatic ductal adenocarcinoma (PDA) has a relatively poor prognosis and is one of the most lethal cancers. Molecular classification of gene expression profiles holds the potential to identify meaningful subtypes which can inform therapeutic strategy in the clinical setting. The Pancreatic Cancer Adenocarcinoma Tool-Kit (PDATK) provides an S4 class-based interface for performing unsupervised subtype discovery, cross-cohort meta-clustering, gene-expression-based classification, and subsequent survival analysis to identify prognostically useful subtypes in pancreatic cancer and beyond. Two novel methods, Consensus Subtypes in Pancreatic Cancer (CSPC) and Pancreatic Cancer Overall Survival Predictor (PCOSP) are included for consensus-based meta-clustering and overall-survival prediction, respectively. Additionally, four published subtype classifiers and three published prognostic gene signatures are included to allow users to easily recreate published results, apply existing classifiers to new data, and benchmark the relative performance of new methods. The use of existing Bioconductor classes as input to all PDATK classes and methods enables integration with existing Bioconductor datasets, including the 21 pancreatic cancer patient cohorts available in the MetaGxPancreas data package. PDATK has been used to replicate results from Sandhu et al (2019) [https://doi.org/10.1200/cci.18.00102] and an additional paper is in the works using CSPC to validate subtypes from the included published classifiers, both of which use the data available in MetaGxPancreas. The inclusion of subtype centroids and prognostic gene signatures from these and other publications will enable researchers and clinicians to classify novel patient gene expression data, allowing the direct clinical application of the classifiers included in PDATK. 
Overall, PDATK provides a rich set of tools to identify and validate useful prognostic and molecular subtypes based on gene-expression data, benchmark new classifiers against existing ones, and apply discovered classifiers on novel patient data to inform clinical decision making.
Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.
geneexpressionpharmacogeneticspharmacogenomicssoftwareclassificationsurvivalclusteringgeneprediction
15.3 match 1 stars 4.31 score 17 scripts
kornl
mutoss:Unified Multiple Testing Procedures
Designed to ease the application and comparison of multiple hypothesis testing procedures for FWER, gFWER, FDR and FDX. Methods are standardized and usable by the accompanying 'mutossGUI'.
Maintained by Kornelius Rohmeyer. Last updated 12 months ago.
7.8 match 4 stars 8.44 score 24 scripts 16 dependents
ropensci
rotl:Interface to the 'Open Tree of Life' API
An interface to the 'Open Tree of Life' API to retrieve phylogenetic trees, information about studies used to assemble the synthetic tree, and utilities to match taxonomic names to 'Open Tree identifiers'. The 'Open Tree of Life' aims at assembling a comprehensive phylogenetic tree for all named species.
Maintained by Francois Michonneau. Last updated 2 years ago.
metadataropensciphylogeneticsindependant-contrastsbiodiversitypeer-reviewedphylogenytaxonomy
5.5 match 40 stars 12.05 score 356 scripts 29 dependents
jm-umn
distfreereg:Distribution-Free Goodness-of-Fit Testing for Regression
Implements distribution-free goodness-of-fit regression testing for the mean structure of parametric models introduced in Khmaladze (2021) <doi:10.1007/s10463-021-00786-3>.
Maintained by Jesse Miller. Last updated 4 months ago.
15.3 match 4.25 score 178 scripts
rcalinjageman
esci:Estimation Statistics with Confidence Intervals
A collection of functions and 'jamovi' module for the estimation approach to inferential statistics, the approach which emphasizes effect sizes, interval estimates, and meta-analysis. Nearly all functions are based on 'statpsych' and 'metafor'. This package is still under active development, and breaking changes are likely, especially with the plot and hypothesis test functions. Data sets are included for all examples from Cumming & Calin-Jageman (2024) <ISBN:9780367531508>.
Maintained by Robert Calin-Jageman. Last updated 21 days ago.
jamovijaspsciencestatisticsvisualization
11.7 match 22 stars 5.42 score 12 scripts
ovvo-financial
NNS:Nonlinear Nonparametric Statistics
Nonlinear nonparametric statistics using partial moments. Partial moments are the elements of variance and asymptotically approximate the area of f(x). These robust statistics provide the basis for nonlinear analysis while retaining linear equivalences. NNS offers: Numerical integration, Numerical differentiation, Clustering, Correlation, Dependence, Causal analysis, ANOVA, Regression, Classification, Seasonality, Autoregressive modeling, Normalization, Stochastic dominance and Advanced Monte Carlo sampling. All routines are based on: Viole, F. and Nawrocki, D. (2013), Nonlinear Nonparametric Statistics: Using Partial Moments (ISBN: 1490523995).
Maintained by Fred Viole. Last updated 4 days ago.
clusteringeconometricsmachine-learningnonlinearnonparametricpartial-momentsstatisticstime-seriescpp
5.8 match 71 stars 10.96 score 66 scripts 3 dependents
r-lib
testthat:Unit Testing for R
Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.
Maintained by Hadley Wickham. Last updated 15 days ago.
3.0 match 900 stars 20.97 score 74k scripts 465 dependents
tidyverse
lubridate:Make Dealing with Dates a Little Easier
Functions to work with date-times and time-spans: fast and user friendly parsing of date-time data, extraction and updating of components of a date-time (years, months, days, hours, minutes, and seconds), algebraic manipulation on date-time and time-span objects. The 'lubridate' package has a consistent and memorable syntax that makes working with dates easy and fun.
Maintained by Vitalie Spinu. Last updated 3 months ago.
3.0 match 757 stars 20.95 score 135k scripts 1.9k dependents
sonsoleslp
tna:Transition Network Analysis (TNA)
Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.
Maintained by Sonsoles López-Pernas. Last updated 2 days ago.
educational-data-mininglearning-analyticsmarkov-modeltemporal-analysis
9.7 match 4 stars 6.48 score 5 scripts
thibautjombart
treespace:Statistical Exploration of Landscapes of Phylogenetic Trees
Tools for the exploration of distributions of phylogenetic trees. This package includes a 'shiny' interface which can be started from R using treespaceServer(). For further details see Jombart et al. (2017) <DOI:10.1111/1755-0998.12676>.
Maintained by Michelle Kendall. Last updated 2 years ago.
8.4 match 28 stars 7.39 score 63 scripts
sym33
RecordLinkage:Record Linkage Functions for Linking and Deduplicating Data Sets
Provides functions for linking and deduplicating data sets. Methods based on a stochastic approach are implemented as well as classification algorithms from the machine learning domain. For details, see our paper "The RecordLinkage Package: Detecting Errors in Data" Sariyar M / Borg A (2010) <doi:10.32614/RJ-2010-017>.
Maintained by Murat Sariyar. Last updated 2 years ago.
6.8 match 6 stars 9.00 score 454 scripts 8 dependents
marc-girondot
HelpersMG:Tools for Environmental Analyses, Ecotoxicology and Various R Functions
Contains miscellaneous functions useful for managing 'NetCDF' files (see <https://en.wikipedia.org/wiki/NetCDF>), getting the moon phase, sunrise and sunset times, and tide level, analysing and reconstructing periodic time series of temperature with an irregular sinusoidal pattern, showing scales and wind roses in plots with changes of text color, running the Metropolis-Hastings algorithm for Bayesian MCMC analysis, plotting graphs or boxplots with error bars, searching files on disk by their names or their content, and reading the contents of all files from a folder at one time.
Maintained by Marc Girondot. Last updated 2 months ago.
13.1 match 4 stars 4.59 score 160 scripts 4 dependents
bioc
S4Vectors:Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructuredatarepresentationbioconductor-packagecore-package
3.8 match 18 stars 16.05 score 1.0k scripts 1.9k dependents
melff
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package allows users to produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 11 days ago.
4.9 match 46 stars 12.34 score 1.2k scripts 13 dependents
nhanhocu
metamicrobiomeR:an R package for analysis of microbiome relative abundance data using zero inflated beta GAMLSS and meta-analysis across studies using random effect model
The metamicrobiomeR package implements Generalized Additive Model for Location, Scale and Shape (GAMLSS) with zero inflated beta (BEZI) family for analysis of microbiome relative abundance data (with various options for data transformation/normalization to address compositional effects) and random effect meta-analysis models for meta-analysis pooling estimates across microbiome studies. Random Forest model to predict microbiome age based on relative abundances of shared bacterial genera with the Bangladesh data (Subramanian et al 2014), comparison of multiple diversity indexes using linear/linear mixed effect models and some data display/visualization are also implemented.
Maintained by Nhan Ho. Last updated 4 years ago.
12.3 match 33 stars 4.90 score 12 scripts
cschwarz-stat-sfu-ca
SPAS:Stratified-Petersen Analysis System
The Stratified-Petersen Analysis System (SPAS) is designed to estimate abundance in two-sample capture-recapture experiments where the capture and recaptures are stratified. This is a generalization of the simple Lincoln-Petersen estimator. Strata may be defined in time or in space or both, and the s strata in which marking takes place may differ from the t strata in which recoveries take place. When s=t, SPAS reduces to the method described by Darroch (1961) <doi:10.2307/2332748>. When s<t, SPAS implements the methods described in Plante, Rivest, and Tremblay (1988) <doi:10.2307/2533994>. Schwarz and Taylor (1998) <doi:10.1139/f97-238> describe the use of SPAS in estimating return of salmon stratified by time and geography. A related package, BTSPAS, deals with temporal stratification where a spline is used to model the distribution of the population over time as it passes the second capture location. This is the R-version of the (now obsolete) standalone Windows program of the same name.
Maintained by Carl James Schwarz. Last updated 1 month ago.
9.1 match 2 stars 6.55 score 28 scripts 1 dependents
matteo21q
dani:Design and Analysis of Non-Inferiority Trials
Provides tools to help with the design and analysis of non-inferiority trials. These include functions for doing sample size calculations and for analysing non-inferiority trials, using a variety of outcome types and population-level summary measures. It also features functions to make trials more resilient by using the concept of non-inferiority frontiers, as described in Quartagno et al. (2019) <arXiv:1905.00241>. Finally, it includes functions to design and analyse MAMS-ROCI (aka DURATIONS) trials.
Maintained by Matteo Quartagno. Last updated 7 months ago.
11.1 match 2 stars 5.33 score 27 scripts
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 28 days ago.
digital-soil-mappingncss-technrcspedologypedometricssoilsoil-surveyusda
5.0 match 55 stars 11.77 score 1.2k scripts 2 dependents
xfim
ggmcmc:Tools for Analyzing MCMC Simulations from Bayesian Inference
Tools for assessing and diagnosing convergence of Markov Chain Monte Carlo simulations, as well as for graphically displaying results from full MCMC analysis. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables, and functions to work with hierarchical/multilevel batches of parameters (Fernández-i-Marín, 2016 <doi:10.18637/jss.v070.i09>).
Maintained by Xavier Fernández i Marín. Last updated 2 years ago.
bayesian-data-analysisggplot2graphicaljagsmcmcstan
4.9 match 112 stars 12.02 score 1.6k scripts 8 dependents
brry
berryFunctions:Function Collection Related to Plotting and Hydrology
Draw horizontal histograms, color scattered points by 3rd dimension, enhance date- and log-axis plots, zoom in X11 graphics, trace errors and warnings, use the unit hydrograph in a linear storage cascade, convert lists to data.frames and arrays, fit multiple functions.
Maintained by Berry Boessenkool. Last updated 1 month ago.
6.3 match 13 stars 9.43 score 350 scripts 16 dependents
bioc
orthogene:Interspecies gene mapping
`orthogene` is an R package for easy mapping of orthologous genes across hundreds of species. It pulls up-to-date gene ortholog mappings across **700+ organisms**. It also provides various utility functions to aggregate/expand common objects (e.g. data.frames, gene expression matrices, lists) using **1:1**, **many:1**, **1:many** or **many:many** gene mappings, both within- and between-species.
Maintained by Brian Schilder. Last updated 5 months ago.
geneticscomparativegenomicspreprocessingphylogeneticstranscriptomicsgeneexpressionanimal-modelsbioconductorbioconductor-packagebioinformaticsbiomedicinecomparative-genomicsevolutionary-biologygenesgenomicsontologiestranslational-research
7.5 match 42 stars 7.85 score 31 scripts 2 dependentsasardaes
dtwclust:Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance
Time series clustering along with optimized techniques related to the Dynamic Time Warping distance and its corresponding lower bounds. Implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.
Maintained by Alexis Sarda. Last updated 8 months ago.
clusteringdtwtime-seriesopenblascpp
4.7 match 261 stars 12.39 score 406 scripts 14 dependentssalvatoremangiafico
rcompanion:Functions to Support Extension Education Program Evaluation
Functions and datasets to support Summary and Analysis of Extension Program Evaluation in R, and An R Companion for the Handbook of Biological Statistics. Vignettes are available at <https://rcompanion.org>.
Maintained by Salvatore Mangiafico. Last updated 30 days ago.
7.0 match 4 stars 8.01 score 2.4k scripts 5 dependentsmodeloriented
DALEXtra:Extension for 'DALEX' Package
Provides wrappers for various machine learning models. In applied machine learning, there is a strong belief that we need to strike a balance between interpretability and accuracy. However, in the field of interpretable machine learning, there are more and more new ideas for explaining black-box models that are implemented in 'R'. 'DALEXtra' creates a 'DALEX' Biecek (2018) <arXiv:1806.08915> explainer for many types of models, including those created using the 'python' 'scikit-learn' and 'keras' libraries and the 'java' 'h2o' library. Important parts of the package are Champion-Challenger analysis and an innovative approach to model performance across subsets of test data, presented in a Funnel Plot.
Maintained by Szymon Maksymiuk. Last updated 2 years ago.
7.2 match 67 stars 7.71 score 400 scripts 1 dependentsannennenne
causalDisco:Tools for Causal Discovery on Observational Data
Various tools for inferring causal models from observational data. The package includes an implementation of the temporal Peter-Clark (TPC) algorithm. Petersen, Osler and Ekstrøm (2021) <doi:10.1093/aje/kwab087>. It also includes general tools for evaluating differences in adjacency matrices, which can be used for evaluating performance of causal discovery procedures.
Maintained by Anne Helby Petersen. Last updated 13 days ago.
11.7 match 19 stars 4.76 score 10 scriptsveseshan
clinfun:Clinical Trial Design and Data Analysis Functions
Utilities to make your clinical collaborations easier if not fun. It contains functions for designing studies such as Simon 2-stage and group sequential designs and for data analysis such as Jonckheere-Terpstra test and estimating survival quantiles.
Maintained by Venkatraman E. Seshan. Last updated 1 year ago.
7.0 match 5 stars 7.86 score 124 scripts 8 dependentsfemiguez
apsimx:Inspect, Read, Edit and Run 'APSIM' "Next Generation" and 'APSIM' Classic
The functions in this package inspect, read, edit and run files for 'APSIM' "Next Generation" ('JSON') and 'APSIM' "Classic" ('XML'). The files with an 'apsim' extension correspond to 'APSIM' Classic (7.x) - Windows only - and the ones with an 'apsimx' extension correspond to 'APSIM' "Next Generation". For more information about 'APSIM' see (<https://www.apsim.info/>) and for 'APSIM' next generation (<https://apsimnextgeneration.netlify.app/>).
Maintained by Fernando Miguez. Last updated 2 days ago.
5.7 match 59 stars 9.71 score 68 scripts 2 dependentssteve-the-bayesian
Boom:Bayesian Object Oriented Modeling
A C++ library for Bayesian modeling, with an emphasis on Markov chain Monte Carlo. Although boom contains a few R utilities (mainly plotting functions), its primary purpose is to install the BOOM C++ library on your system so that other packages can link against it.
Maintained by Steven L. Scott. Last updated 1 year ago.
11.4 match 9 stars 4.82 score 57 scripts 6 dependentsdkaschek
dMod:Dynamic Modeling and Parameter Estimation in ODE Models
The framework provides functions to generate ODEs of reaction networks, parameter transformations, observation functions, residual functions, etc. The framework follows the paradigm that derivative information should be used for optimization whenever possible. Therefore, all major functions produce and can handle expressions for symbolic derivatives.
Maintained by Daniel Kaschek. Last updated 9 days ago.
6.5 match 20 stars 8.35 score 251 scriptsmlcollyer
RRPP:Linear Model Evaluation with Randomized Residuals in a Permutation Procedure
Linear model calculations are made for many random versions of data. Using residual randomization in a permutation procedure, sums of squares are calculated over many permutations to generate empirical probability distributions for evaluating model effects. Additionally, coefficients, statistics, fitted values, and residuals generated over many permutations can be used for various procedures including pairwise tests, prediction, classification, and model comparison. This package should provide most tools one could need for the analysis of high-dimensional data, especially in ecology and evolutionary biology, but certainly other fields, as well.
Maintained by Michael Collyer. Last updated 25 days ago.
5.5 match 4 stars 9.84 score 173 scripts 7 dependentssingmann
afex:Analysis of Factorial Experiments
Convenience functions for analyzing factorial experiments using ANOVA or mixed models. aov_ez(), aov_car(), and aov_4() allow specification of between, within (i.e., repeated-measures), or mixed (i.e., split-plot) ANOVAs for data in long format (i.e., one observation per row), automatically aggregating multiple observations per individual and cell of the design. mixed() fits mixed models using lme4::lmer() and computes p-values for all fixed effects using either Kenward-Roger or Satterthwaite approximation for degrees of freedom (LMM only), parametric bootstrap (LMMs and GLMMs), or likelihood ratio tests (LMMs and GLMMs). afex_plot() provides a high-level interface for interaction or one-way plots using ggplot2, combining raw data and model estimates. afex uses type 3 sums of squares as default (imitating commercial statistical software).
Maintained by Henrik Singmann. Last updated 7 months ago.
3.8 match 123 stars 14.50 score 1.4k scripts 15 dependentswelch-lab
rliger:Linked Inference of Genomic Experimental Relationships
Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.
Maintained by Yichen Wang. Last updated 2 months ago.
nonnegative-matrix-factorizationsingle-cellopenblascpp
5.0 match 402 stars 10.80 score 334 scripts 1 dependentsneptune-ai
neptune:MLOps Metadata Store - Experiment Tracking and Model Registry for Production Teams
An interface to Neptune, a metadata store for MLOps built for teams that run a lot of experiments. It gives you a single place to log, store, display, organize, compare, and query all your model-building metadata. Neptune is used for: • Experiment tracking: Log, display, organize, and compare ML experiments in a single place. • Model registry: Version, store, manage, and query trained models and model-building metadata. • Monitoring ML runs live: Record and monitor model training, evaluation, or production runs live. For more information see <https://neptune.ai/>.
Maintained by Rafal Jankowski. Last updated 2 years ago.
comparelanguagelogmanagementmetadatametricsmlopsmodelsmonitoringorganizeparametersstoretrackervisualization
10.8 match 14 stars 4.89 score 16 scriptspolkas
pacs:Supplementary Tools for R Packages Developers
Supplementary utils for CRAN maintainers and R package developers. Validating the library, packages and lock files. Exploring the complexity of a specific package, such as evaluating its size in bytes with all dependencies. The complexity of a shiny app can be explored too. Assessing the life duration of a specific package version. Checking a CRAN package check page status for any errors and warnings. Retrieving a DESCRIPTION or NAMESPACE file for any package version. Comparing DESCRIPTION or NAMESPACE files between different package versions. Getting a list of all releases for a specific package. Bioconductor is partly supported.
Maintained by Maciej Nasinski. Last updated 6 months ago.
bioconductordependencieslibrarylifedurationrenvshinytoolsutils
9.2 match 25 stars 5.70 score 8 scriptsacorg
Racmacs:Antigenic Cartography Macros
A toolkit for making antigenic maps from immunological assay data, in order to quantify and visualize antigenic differences between different pathogen strains as described in Smith et al. (2004) <doi:10.1126/science.1097211> and used in the World Health Organization influenza vaccine strain selection process. Additional functions allow for the diagnostic evaluation of antigenic maps and an interactive viewer is provided to explore antigenic relationships amongst several strains and incorporate the visualization of associated genetic information.
Maintained by Sam Wilks. Last updated 9 months ago.
6.4 match 21 stars 8.06 score 362 scriptsr-forge
Matrix:Sparse and Dense Matrix Classes and Methods
A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.
Maintained by Martin Maechler. Last updated 6 days ago.
3.0 match 1 stars 17.23 score 33k scripts 12k dependentskwb-r
kwb.utils:General Utility Functions Developed at KWB
This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).
Maintained by Hauke Sonnenberg. Last updated 12 months ago.
7.0 match 8 stars 7.33 score 12 scripts 78 dependentsravingmantis
unittest:TAP-Compliant Unit Testing
Concise TAP <http://testanything.org/> compliant unit testing package. Authored tests can be run using CMD check with minimal implementation overhead.
Maintained by Jamie Lentin. Last updated 7 months ago.
6.9 match 4 stars 7.43 score 224 scriptsgksmyth
statmod:Statistical Modeling
A collection of algorithms and functions to aid statistical modeling. Includes limiting dilution analysis (aka ELDA), growth curve comparisons, mixed linear models, heteroscedastic regression, inverse-Gaussian probability calculations, Gauss quadrature and a secure convergence algorithm for nonlinear models. Also includes advanced generalized linear model functions including Tweedie and Digamma distributional families, secure convergence and exact distributional calculations for unit deviances.
Maintained by Gordon Smyth. Last updated 2 years ago.
5.3 match 1 stars 9.62 score 2.2k scripts 849 dependentsdariah-fi-survey-concept-network
finnsurveytext:Analyse Open-Ended Survey Responses in Finnish
Annotates Finnish textual survey responses into CoNLL-U format using Finnish treebanks from <https://universaldependencies.org/format.html> using UDPipe as described in Straka and Straková (2017) <doi:10.18653/v1/K17-3009>. Formatted data is then analysed using single or comparison n-gram plots, wordclouds, summary tables and Concept Network plots. The Concept Network plots use the TextRank algorithm as outlined in Mihalcea, Rada & Tarau, Paul (2004) <https://aclanthology.org/W04-3252/>.
Maintained by Adeline Clarke. Last updated 9 days ago.
9.4 match 5.39 score 27 scriptsadamlilith
fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeoW4' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 18 days ago.
aspectdistancefragmentationfragmentation-indicesgisgrassgrass-gisrasterraster-projectionrasterizeslopetopographyvectorization
6.6 match 58 stars 7.69 score 8 scriptsbioc
SummarizedExperiment:A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Maintained by Hervé Pagès. Last updated 5 months ago.
geneticsinfrastructuresequencingannotationcoveragegenomeannotationbioconductor-packagecore-package
3.0 match 34 stars 16.85 score 8.6k scripts 1.2k dependentsbioc
countsimQC:Compare Characteristic Features of Count Data Sets
countsimQC provides functionality to create a comprehensive report comparing a broad range of characteristics across a collection of count matrices. One important use case is the comparison of one or more synthetic count matrices to a real count matrix, possibly the one underlying the simulations. However, any collection of count matrices can be compared.
Maintained by Charlotte Soneson. Last updated 3 months ago.
microbiomernaseqsinglecellexperimentaldesignqualitycontrolreportwritingvisualizationimmunooncology
6.5 match 27 stars 7.69 score 24 scriptsjinkim3
kim:A Toolkit for Behavioral Scientists
A collection of functions for analyzing data typically collected or used by behavioral scientists. Examples of the functions include a function that compares groups in a factorial experimental design, a function that conducts two-way analysis of variance (ANOVA), and a function that cleans a data set generated by Qualtrics surveys. Some of the functions will require installing additional package(s). Such packages and other references are cited within the section describing the relevant functions. Many functions in this package rely heavily on these two popular R packages: Dowle et al. (2021) <https://CRAN.R-project.org/package=data.table>. Wickham et al. (2021) <https://CRAN.R-project.org/package=ggplot2>.
Maintained by Jin Kim. Last updated 18 days ago.
10.8 match 7 stars 4.66 score 3 scriptsklausvigo
phangorn:Phylogenetic Reconstruction and Analysis
Allows for estimation of phylogenetic trees and networks using Maximum Likelihood, Maximum Parsimony, distance methods and Hadamard conjugation (Schliep 2011). Offers methods for tree comparison, model selection and visualization of phylogenetic networks as described in Schliep et al. (2017).
Maintained by Klaus Schliep. Last updated 1 month ago.
softwaretechnologyqualitycontrolphylogenetic-analysisphylogeneticsopenblascpp
3.0 match 206 stars 16.69 score 2.5k scripts 135 dependentsquanteda
quanteda:Quantitative Analysis of Textual Data
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Maintained by Kenneth Benoit. Last updated 2 months ago.
corpusnatural-language-processingquantedatext-analyticsonetbbcpp
3.0 match 851 stars 16.68 score 5.4k scripts 51 dependentsbioc
syntenet:Inference And Analysis Of Synteny Networks
syntenet can be used to infer synteny networks from whole-genome protein sequences and analyze them. Anchor pairs are detected with the MCScanX algorithm, which was ported to this package with the Rcpp framework for R and C++ integration. Anchor pairs from synteny analyses are treated as an undirected unweighted graph (i.e., a synteny network), and users can perform: i. network clustering; ii. phylogenomic profiling (by identifying which species contain which clusters) and; iii. microsynteny-based phylogeny reconstruction with maximum likelihood.
Maintained by Fabrício Almeida-Silva. Last updated 3 months ago.
softwarenetworkinferencefunctionalgenomicscomparativegenomicsphylogeneticssystemsbiologygraphandnetworkwholegenomenetworkcomparative-genomicsevolutionary-genomicsnetwork-sciencephylogenomicssyntenysynteny-networkcpp
7.5 match 26 stars 6.67 score 12 scripts 1 dependentshojsgaard
geepack:Generalized Estimating Equation Package
Generalized estimating equations solver for parameters in mean, scale, and correlation structures, through mean link, scale link, and correlation link. Can also handle clustered categorical responses. See e.g. Halekoh and Højsgaard, (2005, <doi:10.18637/jss.v015.i02>), for details.
Maintained by Søren Højsgaard. Last updated 7 months ago.
5.2 match 1 stars 9.59 score 1.7k scripts 43 dependentsmdsteiner
EFAtools:Fast and Flexible Implementations of Exploratory Factor Analysis Tools
Provides functions to perform exploratory factor analysis (EFA) procedures and compare their solutions. The goal is to provide state-of-the-art factor retention methods and a high degree of flexibility in the EFA procedures. This way, for example, implementations from R 'psych' and 'SPSS' can be compared. Moreover, functions for Schmid-Leiman transformation and the computation of omegas are provided. To speed up the analyses, some of the iterative procedures, like principal axis factoring (PAF), are implemented in C++.
Maintained by Markus Steiner. Last updated 3 months ago.
7.5 match 10 stars 6.57 score 83 scripts 1 dependentsr-lib
rcmdcheck:Run 'R CMD check' from 'R' and Capture Results
Run 'R CMD check' from 'R' and capture the results of the individual checks. Supports running checks in the background, timeouts, pretty printing and comparing check results.
Maintained by Gábor Csárdi. Last updated 5 months ago.
4.0 match 116 stars 12.34 score 102 scripts 158 dependentsdjvanderlaan
reclin2:Record Linkage Toolkit
Functions to assist in performing probabilistic record linkage and deduplication: generating pairs, comparing records, em-algorithm for estimating m- and u-probabilities (I. Fellegi & A. Sunter (1969) <doi:10.1080/01621459.1969.10501049>, T.N. Herzog, F.J. Scheuren, & W.E. Winkler (2007), "Data Quality and Record Linkage Techniques", ISBN:978-0-387-69502-0), forcing one-to-one matching. Can also be used for pre- and post-processing for machine learning methods for record linkage. Focus is on memory, CPU performance and flexibility.
Maintained by Jan van der Laan. Last updated 1 year ago.
6.7 match 43 stars 7.36 score 89 scripts 1 dependentscran
survRM2:Comparing Restricted Mean Survival Time
Performs two-sample comparisons using the restricted mean survival time (RMST) as a summary measure of the survival time distribution. Three kinds of between-group contrast metrics (i.e., the difference in RMST, the ratio of RMST and the ratio of the restricted mean time lost (RMTL)) are computed. It performs an ANCOVA-type covariate adjustment as well as unadjusted analyses for those measures.
Maintained by Hajime Uno. Last updated 3 years ago.
9.3 match 2 stars 5.26 score 5 dependentsbioc
bluster:Clustering Algorithms for Bioconductor
Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncologysoftwaregeneexpressiontranscriptomicssinglecellclusteringcpp
5.2 match 9.43 score 636 scripts 51 dependentsbioc
MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Maintained by Laurent Gatto. Last updated 1 day ago.
immunooncologyinfrastructureproteomicsmassspectrometryqualitycontroldataimportbioconductorbioinformaticsmass-spectrometryproteomics-datavisualisationcpp
3.8 match 130 stars 12.81 score 772 scripts 36 dependentsbioc
MPRAnalyze:Statistical Analysis of MPRA data
MPRAnalyze provides a statistical framework for the analysis of data generated by Massively Parallel Reporter Assays (MPRAs), used to directly measure enhancer activity. MPRAnalyze can be used for quantification of enhancer activity, classification of active enhancers and comparative analyses of enhancer activity between conditions. MPRAnalyze constructs a nested pair of generalized linear models (GLMs) to relate the DNA and RNA observations, easily adjustable to various experimental designs and conditions, and provides a set of rigorous statistical testing schemes.
Maintained by Tal Ashuach. Last updated 5 months ago.
immunooncologysoftwarestatisticalmethodsequencinggeneexpressioncellbiologycellbasedassaysdifferentialexpressionexperimentaldesignclassification
7.1 match 12 stars 6.86 score 30 scriptsbioc
doubletrouble:Identification and classification of duplicated genes
doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.
Maintained by Fabrício Almeida-Silva. Last updated 3 days ago.
softwarewholegenomecomparativegenomicsfunctionalgenomicsphylogeneticsnetworkclassificationbioinformaticscomparative-genomicsgene-duplicationmolecular-evolutionwhole-genome-duplication
7.5 match 23 stars 6.44 score 17 scriptslmarusich
rmcorr:Repeated Measures Correlation
Compute the repeated measures correlation, a statistical technique for determining the overall within-individual relationship among paired measures assessed on two or more occasions, first introduced by Bland and Altman (1995). Includes functions for diagnostics, p-value, effect size with confidence interval including optional bootstrapping, as well as graphing. Also includes several example datasets. For more details, see the web documentation <https://lmarusich.github.io/rmcorr/index.html> and the original paper: Bakdash and Marusich (2017) <doi:10.3389/fpsyg.2017.00456>.
Maintained by Laura R. Marusich. Last updated 7 months ago.
5.3 match 7 stars 9.18 score 304 scriptsyonicd
ggedit:Interactive 'ggplot2' Layer and Theme Aesthetic Editor
Interactively edit 'ggplot2' layer and theme aesthetics definitions.
Maintained by Jonathan Sidi. Last updated 10 months ago.
6.0 match 250 stars 7.95 score 116 scripts 3 dependentsmerck
psm3mkv:Evaluate Partitioned Survival and State Transition Models
Fits and evaluates three-state partitioned survival analyses (PartSAs) and Markov models (clock forward or clock reset) to progression and overall survival data typically collected in oncology clinical trials. These model structures are typically considered in cost-effectiveness modeling in advanced/metastatic cancer indications. Muston (2024). "Informing structural assumptions for three state oncology cost-effectiveness models through model efficiency and fit". Applied Health Economics and Health Policy.
Maintained by Dominic Muston. Last updated 9 months ago.
7.4 match 10 stars 6.43 score 1 scriptsjeffreyevans
yaImpute:Nearest Neighbor Observation Imputation and Evaluation Tools
Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.
Maintained by Jeffrey S. Evans. Last updated 6 months ago.
6.4 match 3 stars 7.40 score 94 scripts 12 dependentsopen-aims
bayesnec:A Bayesian No-Effect- Concentration (NEC) Algorithm
Implementation of No-Effect-Concentration estimation that uses 'brms' (see Burkner (2017)<doi:10.18637/jss.v080.i01>; Burkner (2018)<doi:10.32614/RJ-2018-017>; Carpenter 'et al.' (2017)<doi:10.18637/jss.v076.i01>) to fit concentration(dose)-response data using Bayesian methods for the purpose of estimating 'ECx' values, but more particularly 'NEC' (see Fox (2010)<doi:10.1016/j.ecoenv.2009.09.012>), 'NSEC' (see Fisher and Fox (2023)<doi:10.1002/etc.5610>), and 'N(S)EC' (see Fisher et al. (2023)<doi:10.1002/ieam.4809>). A full description of this package can be found in Fisher 'et al.' (2024)<doi:10.18637/jss.v110.i05>. This package expands and supersedes an original version implemented in 'R2jags' (see Su and Yajima (2020)<https://CRAN.R-project.org/package=R2jags>; Fisher et al. (2020)<doi:10.5281/ZENODO.3966864>).
Maintained by Rebecca Fisher. Last updated 7 months ago.
bayesian-inferenceconcentration-responseecotoxicologyno-effect-concentrationnon-linear-decaythreshold-derivationtoxicology
5.8 match 12 stars 8.11 score 360 scriptsjepusto
scdhlm:Estimating Hierarchical Linear Models for Single-Case Designs
Provides a set of tools for estimating hierarchical linear models and effect sizes based on data from single-case designs. Functions are provided for calculating standardized mean difference effect sizes that are directly comparable to standardized mean differences estimated from between-subjects randomized experiments, as described in Hedges, Pustejovsky, and Shadish (2012) <DOI:10.1002/jrsm.1052>; Hedges, Pustejovsky, and Shadish (2013) <DOI:10.1002/jrsm.1086>; Pustejovsky, Hedges, and Shadish (2014) <DOI:10.3102/1076998614547577>; and Chen, Pustejovsky, Klingbeil, and Van Norman (2023) <DOI:10.1016/j.jsp.2023.02.002>. Includes an interactive web interface.
Maintained by James Pustejovsky. Last updated 1 year ago.
8.4 match 4 stars 5.62 score 52 scriptsbioc
HiCcompare:HiCcompare: Joint normalization and comparative analysis of multiple Hi-C datasets
HiCcompare provides functions for joint normalization and difference detection in multiple Hi-C datasets. HiCcompare operates on processed Hi-C data in the form of chromosome-specific chromatin interaction matrices. It accepts three-column tab-separated text files storing chromatin interaction matrices in a sparse matrix format, which are available from several sources. HiCcompare is designed to give the user the ability to perform a comparative analysis on the 3-Dimensional structure of the genomes of cells in different biological states. `HiCcompare` differs from other packages that attempt to compare Hi-C data in that it works on processed data in chromatin interaction matrix format instead of pre-processed sequencing data. In addition, `HiCcompare` provides a non-parametric method for the joint normalization and removal of biases between two Hi-C datasets for the purpose of comparative analysis. `HiCcompare` also provides a simple yet robust method for detecting differences between Hi-C datasets.
Maintained by Mikhail Dozmorov. Last updated 5 months ago.
softwarehicsequencingnormalizationdifference-detectionhi-cvisualization
5.5 match 19 stars 8.61 score 51 scripts 5 dependentspredictiveecology
Require:Installing and Loading R Packages for Reproducible Workflows
A single key function, 'Require', that provides rerun-tolerant versions of 'install.packages' and `require` for CRAN packages, packages no longer on CRAN (i.e., archived), specific versions of packages, and GitHub packages. This approach is developed to create reproducible workflows that are flexible and fast enough to use while in development stages, while able to build snapshots once a stable package collection is found. As with other functions in a reproducible workflow, this package emphasizes functions that return the same result whether it is the first or a subsequent time running the function, with subsequent runs being sufficiently fast that they can be run every time without undue waiting burden on the user or developer.
Maintained by Eliot J B McIntire. Last updated 14 days ago.
5.0 match 22 stars 9.42 score 144 scripts 13 dependentscjvanlissa
tidySEM:Tidy Structural Equation Modeling
A tidy workflow for generating, estimating, reporting, and plotting structural equation models using 'lavaan', 'OpenMx', or 'Mplus'. Throughout this workflow, elements of syntax, results, and graphs are represented as 'tidy' data, making them easy to customize. Includes functionality to estimate latent class analyses, and to plot 'dagitty' and 'igraph' objects.
Maintained by Caspar J. van Lissa. Last updated 7 days ago.
4.4 match 58 stars 10.69 score 330 scripts 1 dependentsdrostlab
philentropy:Similarity and Distance Quantification Between Probability Functions
Computes 46 optimized distance and similarity measures for comparing probability functions (Drost (2018) <doi:10.21105/joss.00765>). These comparisons between probability functions have their foundations in a broad range of scientific disciplines from mathematics to ecology. The aim of this package is to provide a core framework for clustering, classification, statistical inference, goodness-of-fit, non-parametric statistics, information theory, and machine learning tasks that are based on comparing univariate or multivariate probability functions.
Maintained by Hajk-Georg Drost. Last updated 3 months ago.
distance-measuresdistance-quantificationinformation-theoryjensen-shannon-divergenceparametric-distributionssimilarity-measuresstatisticscpp
3.8 match 137 stars 12.44 score 484 scripts 24 dependentsrichfitz
diversitree:Comparative 'Phylogenetic' Analyses of Diversification
Contains a number of comparative 'phylogenetic' methods, mostly focusing on analysing diversification and character evolution. Contains implementations of 'BiSSE' (Binary State 'Speciation' and Extinction) and its unresolved tree extensions, 'MuSSE' (Multiple State 'Speciation' and Extinction), 'QuaSSE', 'GeoSSE', and 'BiSSE-ness'. Other methods include Markov models of discrete and continuous trait evolution and constant rate 'speciation' and extinction.
Maintained by Richard G. FitzJohn. Last updated 6 months ago.
5.5 match 33 stars 8.51 score 524 scripts 4 dependentstidyverse
dplyr:A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 12 days ago.
1.9 match 4.8k stars 24.68 score 659k scripts 7.8k dependentschrisaberson
pwr2ppl:Power Analyses for Common Designs (Power to the People)
Statistical power analysis for designs including t-tests, correlations, multiple regression, ANOVA, mediation, and logistic regression. Functions accompany Aberson (2019) <doi:10.4324/9781315171500>.
Maintained by Chris Aberson. Last updated 3 years ago.
11.1 match 17 stars 4.16 score 17 scripts
ewenharrison
finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling
Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.
Maintained by Ewen Harrison. Last updated 7 months ago.
4.0 match 270 stars 11.43 score 1.0k scripts
bioc
XVector:Foundation of external vector representation and manipulation in Bioconductor
Provides memory efficient S4 classes for storing sequences "externally" (e.g. behind an R external pointer, or on disk).
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructure datarepresentation bioconductor-package core-package zlib
4.0 match 2 stars 11.36 score 67 scripts 1.7k dependents
bioc
compcodeR:RNAseq data simulation, differential expression analysis and performance comparison of differential expression methods
This package provides extensive functionality for comparing results obtained by different methods for differential expression analysis of RNAseq data. It also contains functions for simulating count data. Finally, it provides convenient interfaces to several packages for performing the differential expression analysis. These can also be used as templates for setting up and running a user-defined differential analysis workflow within the framework of the package.
Maintained by Charlotte Soneson. Last updated 3 months ago.
immunooncology rnaseq differentialexpression
5.6 match 11 stars 8.06 score 26 scripts
ms609
Quartet:Comparison of Phylogenetic Trees Using Quartet and Split Measures
Calculates the number of four-taxon subtrees consistent with a pair of cladograms, giving the symmetric quartet distance of Bandelt & Dress (1986), "Reconstructing the shape of a tree from observed dissimilarity data", Advances in Applied Mathematics, 7, 309-343 <doi:10.1016/0196-8858(86)90038-2>. For pairs of binary trees, it uses the tqDist algorithm of Sand et al. (2014), "tqDist: a library for computing the quartet and triplet distances between binary or general trees", Bioinformatics, 30, 2079–2080 <doi:10.1093/bioinformatics/btu157>.
Maintained by Martin R. Smith. Last updated 2 months ago.
bioinformatics comparison phylogenetic-trees phylogenetics quartet quartet-distance research-tool tree cpp
5.6 match 14 stars 8.00 score 40 scripts
simongrund1
mitml:Tools for Multiple Imputation in Multilevel Modeling
Provides tools for multiple imputation of missing data in multilevel modeling. Includes a user-friendly interface to the packages 'pan' and 'jomo', and several functions for visualization, data management and the analysis of multiply imputed data sets.
Maintained by Simon Grund. Last updated 1 year ago.
imputation missing-data mixed-effects multilevel-data multilevel-models
3.6 match 29 stars 12.36 score 246 scripts 153 dependents
nicebread
RSA:Response Surface Analysis
Advanced response surface analysis. The main function RSA computes and compares several nested polynomial regression models (full second- or third-order polynomial, shifted and rotated squared difference model, rising ridge surfaces, basic squared difference model, asymmetric or level-dependent congruence effect models). The package provides plotting functions for 3d wireframe surfaces, interactive 3d plots, and contour plots. Calculates many surface parameters (a1 to a5, principal axes, stationary point, eigenvalues) and provides standard, robust, or bootstrapped standard errors and confidence intervals for them.
Maintained by Felix Schönbrodt. Last updated 11 months ago.
7.1 match 17 stars 6.30 score 26 scripts 1 dependent
desctable
desctable:Produce Descriptive and Comparative Tables Easily
Easily create descriptive and comparative tables. It makes use of, and integrates directly with, the tidyverse family of packages and pipes. Tables are produced as (nested) dataframes for easy manipulation.
Maintained by Maxime Wack. Last updated 3 years ago.
6.5 match 52 stars 6.85 score 45 scripts
computationalstylistics
stylo:Stylometric Multivariate Analyses
Supervised and unsupervised multivariate methods, supplemented by GUI and some visualizations, to perform various analyses in the field of computational stylistics, authorship attribution, etc. For further reference, see Eder et al. (2016), <https://journal.r-project.org/archive/2016/RJ-2016-007/index.html>. You are also encouraged to visit the Computational Stylistics Group's website <https://computationalstylistics.github.io/>, where a reasonable amount of information about the package and related projects is provided.
Maintained by Maciej Eder. Last updated 2 months ago.
5.2 match 186 stars 8.59 score 462 scripts
yihui
knitr:A General-Purpose Package for Dynamic Report Generation in R
Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.
Maintained by Yihui Xie. Last updated 1 day ago.
dynamic-documents knitr literate-programming rmarkdown sweave
1.9 match 2.4k stars 23.62 score 116k scripts 4.2k dependents
choonghyunryu
dlookr:Tools for Data Diagnosis, Exploration, Transformation
A collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information and visualization of the descriptive statistics of univariate variables, normality tests and outliers, correlation of two variables, and the relationship between the target variable and predictor. Data transformation supports binning for categorizing continuous variables, imputes missing values and outliers, and resolves skewness. The package also creates automated reports that support these three tasks.
Maintained by Choonghyun Ryu. Last updated 9 months ago.
4.0 match 212 stars 11.05 score 748 scripts 2 dependents
hneth
unikn:Graphical Elements of the University of Konstanz's Corporate Design
Define and use graphical elements of corporate design manuals in R. The 'unikn' package provides color functions (by defining dedicated colors and color palettes, and commands for finding, changing, viewing, and using them) and styled text elements (e.g., for marking, underlining, or plotting colored titles). The pre-defined range of colors and text decoration functions is based on the corporate design of the University of Konstanz <https://www.uni-konstanz.de/>, but can be adapted and extended for other purposes or institutions.
Maintained by Hansjoerg Neth. Last updated 3 months ago.
branding color color-palette colorscheme corporate-design palette text-decoration university-colors visual-identity
5.0 match 39 stars 8.82 score 156 scripts 2 dependents
radiant-rstats
radiant.basics:Basics Menu for Radiant: Business Analytics using R and Shiny
The Radiant Basics menu includes interfaces for probability calculation, central limit theorem simulation, comparing means and proportions, goodness-of-fit testing, cross-tabs, and correlation. The application extends the functionality in 'radiant.data'.
Maintained by Vincent Nijs. Last updated 10 months ago.
7.9 match 8 stars 5.56 score 79 scripts 3 dependents
gergness
srvyr:'dplyr'-Like Syntax for Summary Statistics of Survey Data
Use piping, verbs like 'group_by' and 'summarize', and other 'dplyr' inspired syntactic style when calculating summary statistics on survey data using functions from the 'survey' package.
Maintained by Greg Freedman Ellis. Last updated 1 month ago.
3.1 match 215 stars 13.88 score 1.8k scripts 15 dependents
nakarinp
longreadvqs:Viral Quasispecies Comparison from Long-Read Sequencing Data
Performs a variety of viral quasispecies diversity analyses [see Pamornchainavakul et al. (2024) <doi:10.21203/rs.3.rs-4637890/v1>] based on long-read sequence alignment. Main functions include 1) sequencing error and other noise minimization and read sampling, 2) single nucleotide variant (SNV) profile comparison, and 3) viral quasispecies profile comparison and visualization.
Maintained by Nakarin Pamornchainavakul. Last updated 7 months ago.
9.3 match 4.65 score 4 scripts
cran
epiR:Tools for the Analysis of Epidemiological Data
Tools for the analysis of epidemiological and surveillance data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, computation of confidence intervals around incidence risk and incidence rate estimates and sample size calculations for cross-sectional, case-control and cohort studies. Surveillance tools include functions to calculate an appropriate sample size for 1- and 2-stage representative freedom surveys, functions to estimate surveillance system sensitivity and functions to support scenario tree modelling analyses.
Maintained by Mark Stevenson. Last updated 2 months ago.
5.3 match 10 stars 8.18 score 10 dependents
alexkowa
EnvStats:Package for Environmental Statistics, Including US EPA Guidance
Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).
Maintained by Alexander Kowarik. Last updated 16 days ago.
3.4 match 26 stars 12.80 score 2.4k scripts 46 dependents
flr
FLCore:Core Package of FLR, Fisheries Modelling in R
Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.
Maintained by Iago Mosqueira. Last updated 9 days ago.
fisheries flr fisheries-modelling
4.9 match 16 stars 8.78 score 956 scripts 23 dependents
bioc
TEKRABber:An R package estimates the correlations of orthologs and transposable elements between two species
TEKRABber provides a user-friendly pipeline for comparing orthologs and transposable elements (TEs) between two species. It uses the orthology confidence between two species from BioMart to normalize expression counts and detect differentially expressed orthologs/TEs. It then provides one-to-one correlation analysis for the desired orthologs and TEs. There is also an app function for a first look at the results. Users can prepare ortholog/TE RNA-seq expression data however they prefer, provided it follows the data structure described in the vignettes.
Maintained by Yao-Chung Chen. Last updated 20 days ago.
differentialexpression normalization transcription geneexpression bioconductor cpp
8.0 match 3 stars 5.33 score 18 scripts
gavinsimpson
analogue:Analogue and Weighted Averaging Methods for Palaeoecology
Fits Modern Analogue Technique and Weighted Averaging transfer function models for prediction of environmental data from species data, and related methods used in palaeoecology.
Maintained by Gavin L. Simpson. Last updated 6 months ago.
4.8 match 14 stars 8.96 score 185 scripts 4 dependents
bryanhanson
LearnPCA:Functions, Data Sets and Vignettes to Aid in Learning Principal Components Analysis (PCA)
Principal component analysis (PCA) is one of the most widely used data analysis techniques. This package provides a series of vignettes explaining PCA starting from basic concepts. The primary purpose is to serve as a self-study resource for anyone wishing to understand PCA better. A few convenience functions are provided as well.
Maintained by Bryan A. Hanson. Last updated 10 months ago.
6.8 match 10 stars 6.20 score 1 script
civilstat
RankingProject:The Ranking Project: Visualizations for Comparing Populations
Functions to generate plots and tables for comparing independently-sampled populations. Companion package to "A Primer on Visualizations for Comparing Populations, Including the Issue of Overlapping Confidence Intervals" by Wright, Klein, and Wieczorek (2019) <DOI:10.1080/00031305.2017.1392359> and "A Joint Confidence Region for an Overall Ranking of Populations" by Klein, Wright, and Wieczorek (2020) <DOI:10.1111/rssc.12402>.
Maintained by Jerzy Wieczorek. Last updated 3 years ago.
8.4 match 7 stars 5.02 score 10 scripts
ropensci
redland:RDF Library Bindings in R
Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at <https://www.w3.org/TR/rdf-primer/>. This package supports RDF by implementing an R interface to the Redland RDF C library, described at <https://librdf.org/docs/api/index.html>. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.
Maintained by Matthew B. Jones. Last updated 1 year ago.
5.4 match 17 stars 7.85 score 98 scripts 13 dependents
emcramer
CHOIRBM:Plots the CHOIR Body Map
Collection of utility functions for visualizing body map data collected with the Collaborative Health Outcomes Information Registry.
Maintained by Eric Cramer. Last updated 1 year ago.
body-map cbm choir data-visualization visualization
7.6 match 5 stars 5.51 score 26 scripts
rstudio
tfruns:Training Run Tools for 'TensorFlow'
Create and manage unique directories for each 'TensorFlow' training run. Provides a unique, time stamped directory for each run along with functions to retrieve the directory of the latest run or latest several runs.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
3.5 match 34 stars 11.80 score 325 scripts 77 dependents
insightsengineering
tern:Create Common TLGs Used in Clinical Trials
Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.
Maintained by Joe Zhu. Last updated 2 months ago.
clinical-trials graphs listings nest outputs tables
3.3 match 79 stars 12.62 score 186 scripts 9 dependents
bioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), and MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 8 days ago.
networkinference network generegulation geneexpression transcription microarray graphandnetwork gene-regulatory-network transcription-factors
5.1 match 105 stars 7.98 score
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data, such as sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers further analysis features, such as gene age estimation and core gene identification.
Maintained by Vinh Tran. Last updated 6 days ago.
software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny
5.3 match 33 stars 7.77 score 10 scripts
bioc
genefu:Computation of Gene Expression-Based Signatures in Breast Cancer
This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.
Maintained by Benjamin Haibe-Kains. Last updated 4 months ago.
differentialexpression geneexpression visualization clustering classification
5.5 match 7.42 score 193 scripts 3 dependents
csafe-isu
handwriterRF:Handwriting Analysis with Random Forests
Perform forensic handwriting analysis of two scanned handwritten documents. This package implements the statistical method described by Madeline Johnson and Danica Ommen (2021) <doi:10.1002/sam.11566>. Similarity measures and a random forest produce a score-based likelihood ratio that quantifies the strength of the evidence in favor of the documents being written by the same writer or different writers.
Maintained by Stephanie Reinders. Last updated 8 days ago.
6.6 match 2 stars 6.18 score 15 scripts 1 dependent
jglev
veccompare:Perform Set Operations on Vectors, Automatically Generating All n-Wise Comparisons, and Create Markdown Output
Automates set operations (i.e., comparisons of overlap) between multiple vectors. It also contains a function for automating reporting in 'RMarkdown', by generating markdown output for easy analysis, as well as an 'RMarkdown' template for use with 'RStudio'.
Maintained by Jacob Gerard Levernier. Last updated 8 years ago.
11.3 match 8 stars 3.60 score 10 scripts
xrobin
pROC:Display and Analyze ROC Curves
Tools for visualizing, smoothing and comparing receiver operating characteristic (ROC) curves. (Partial) area under the curve (AUC) can be compared with statistical tests based on U-statistics or bootstrap. Confidence intervals can be computed for (p)AUC or ROC curves.
Maintained by Xavier Robin. Last updated 4 months ago.
bootstrapping covariance hypothesis-testing machine-learning plot plotting roc roc-curve variance cpp
2.7 match 125 stars 15.18 score 16k scripts 445 dependents
rolkra
explore:Simplifies Exploratory Data Analysis
Interactive data exploration with one line of code, automated reporting, or an easy-to-remember set of tidy functions for low-code exploratory data analysis.
Maintained by Roland Krasser. Last updated 3 months ago.
data-exploration data-visualisation decision-trees eda rmarkdown shiny tidy
3.5 match 228 stars 11.43 score 221 scripts 1 dependent
stan-dev
rstanarm:Bayesian Applied Regression Modeling via Stan
Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.
Maintained by Ben Goodrich. Last updated 9 months ago.
bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics multilevel-models rstan rstanarm stan statistical-modeling cpp
2.6 match 393 stars 15.68 score 5.0k scripts 13 dependents
bioc
DESeq2:Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 10 days ago.
sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp
2.5 match 375 stars 16.11 score 17k scripts 115 dependents
csgillespie
poweRlaw:Analysis of Heavy Tailed Distributions
An implementation of maximum likelihood estimators for a variety of heavy tailed distributions, including both the discrete and continuous power law distributions. Additionally, a goodness-of-fit based approach is used to estimate the lower cut-off for the scaling region.
Maintained by Colin Gillespie. Last updated 1 month ago.
3.1 match 112 stars 12.79 score 332 scripts 32 dependents