R-universe search: exports:overlap

easystats

bayestestR:Understand and Describe Bayesian Models and Posterior Distributions

Provides utilities to describe posterior distributions and Bayesian models. It includes point-estimates such as Maximum A Posteriori (MAP), measures of dispersion (Highest Density Interval - HDI; Kruschke, 2015 <doi:10.1016/C2012-0-00477-2>) and indices used for null-hypothesis testing (such as ROPE percentage, pd and Bayes factors). References: Makowski et al. (2021) <doi:10.21105/joss.01541>.

Maintained by Dominique Makowski. Last updated 8 days ago.

bayes-factors bayesfactor bayesian bayesian-framework credible-interval easystats hacktoberfest hdi map posterior-distributions rope

579 stars 16.87 score 2.2k scripts 83 dependents

bioc

ChIPseeker:ChIPseeker for ChIP peak Annotation, Comparison, and Visualization

This package implements functions to retrieve the nearest genes around the peak, annotate genomic region of the peak, statstical methods for estimate the significance of overlap among ChIP peak data sets, and incorporate GEO database for user to compare the own dataset with those deposited in database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation chipseq software visualization multiplecomparison atac-seq chip-seq comparison epigenetics epigenomics

233 stars 13.05 score 1.6k scripts 5 dependents

bioc

microbiome:Microbiome Analytics

Utilities for microbiome analysis.

Maintained by Leo Lahti. Last updated 5 months ago.

metagenomics microbiome sequencing systemsbiology hitchip hitchip-atlas human-microbiome microbiology microbiome-analysis phyloseq population-study

293 stars 12.51 score 2.0k scripts 5 dependents

gaospecial

ggVennDiagram:A 'ggplot2' Implement of Venn Diagram

Easy-to-use functions to generate 2-7 sets Venn or upset plot in publication quality. 'ggVennDiagram' plot Venn or upset using well-defined geometry dataset and 'ggplot2'. The shapes of 2-4 sets Venn use circles and ellipses, while the shapes of 4-7 sets Venn use irregular polygons (4 has both forms), which are developed and imported from another package 'venn', authored by Adrian Dusa. We provided internal functions to integrate shape data with user provided sets data, and calculated the geometry of every regions/intersections of them, then separately plot Venn in four components, set edges/labels, and region edges/labels. From version 1.0, it is possible to customize these components as you demand in ordinary 'ggplot2' grammar. From version 1.4.4, it supports unlimited number of sets, as it can draw a plain upset plot automatically when number of sets is more than 7.

Maintained by Chun-Hui Gao. Last updated 5 months ago.

set-operations upset upsetplot venn-diagram venn-plot

292 stars 12.31 score 1.3k scripts 4 dependents

ctmm-initiative

ctmm:Continuous-Time Movement Modeling

Functions for identifying, fitting, and applying continuous-space, continuous-time stochastic-process movement models to animal tracking data. The package is described in Calabrese et al (2016) <doi:10.1111/2041-210X.12559>, with models and methods based on those introduced and detailed in Fleming & Calabrese et al (2014) <doi:10.1086/675504>, Fleming et al (2014) <doi:10.1111/2041-210X.12176>, Fleming et al (2015) <doi:10.1103/PhysRevE.91.032107>, Fleming et al (2015) <doi:10.1890/14-2010.1>, Fleming et al (2016) <doi:10.1890/15-1607>, Péron & Fleming et al (2016) <doi:10.1186/s40462-016-0084-7>, Fleming & Calabrese (2017) <doi:10.1111/2041-210X.12673>, Péron et al (2017) <doi:10.1002/ecm.1260>, Fleming et al (2017) <doi:10.1016/j.ecoinf.2017.04.008>, Fleming et al (2018) <doi:10.1002/eap.1704>, Winner & Noonan et al (2018) <doi:10.1111/2041-210X.13027>, Fleming et al (2019) <doi:10.1111/2041-210X.13270>, Noonan & Fleming et al (2019) <doi:10.1186/s40462-019-0177-1>, Fleming et al (2020) <doi:10.1101/2020.06.12.130195>, Noonan et al (2021) <doi:10.1111/2041-210X.13597>, Fleming et al (2022) <doi:10.1111/2041-210X.13815>, Silva et al (2022) <doi:10.1111/2041-210X.13786>, Alston & Fleming et al (2023) <doi:10.1111/2041-210X.14025>.

Maintained by Christen H. Fleming. Last updated 8 days ago.

51 stars 10.54 score 534 scripts 4 dependents

agrdatasci

gdistance:Distances and Routes on Geographical Grids

Provides classes and functions to calculate various distance measures and routes in heterogeneous geographic spaces represented as grids. The package implements measures to model dispersal histories first presented by van Etten and Hijmans (2010) <doi:10.1371/journal.pone.0012060>. Least-cost distances as well as more complex distances based on (constrained) random walks can be calculated. The distances implemented in the package are used in geographical genetics, accessibility indicators, and may also have applications in other fields of geospatial analysis.

Maintained by Andrew Marx. Last updated 1 years ago.

16 stars 10.27 score 478 scripts 23 dependents

jeffreyevans

spatialEco:Spatial Analysis and Modelling Utilities

Utilities to support spatial data manipulation, query, sampling and modelling in ecological applications. Functions include models for species population density, spatial smoothing, multivariate separability, point process model for creating pseudo- absences and sub-sampling, Quadrant-based sampling and analysis, auto-logistic modeling, sampling models, cluster optimization, statistical exploratory tools and raster-based metrics.

Maintained by Jeffrey S. Evans. Last updated 26 days ago.

biodiversity conservation ecology r-spatial raster spatial vector

110 stars 9.55 score 736 scripts 2 dependents

bodkan

slendr:A Simulation Framework for Spatiotemporal Population Genetics

A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can be therefore constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.

Maintained by Martin Petr. Last updated 18 hours ago.

popgen population-genetics simulations spatial-statistics

56 stars 9.13 score 88 scripts

bioc

iCOBRA:Comparison and Visualization of Ranking and Assignment Methods

This package provides functions for calculation and visualization of performance metrics for evaluation of ranking and binary classification (assignment) methods. Various types of performance plots can be generated programmatically. The package also contains a shiny application for interactive exploration of results.

Maintained by Charlotte Soneson. Last updated 3 months ago.

classification visualization

14 stars 8.86 score 192 scripts 1 dependents

bioboot

bio3d:Biological Structure Analysis

Utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data. Please refer to the URLs below for more information.

Maintained by Barry Grant. Last updated 5 months ago.

zlib cpp

5 stars 8.47 score 1.4k scripts 10 dependents

bioc

Mfuzz:Soft clustering of omics time series data

The Mfuzz package implements noise-robust soft clustering of omics time-series data, including transcriptomic, proteomic or metabolomic data. It is based on the use of c-means clustering. For convenience, it includes a graphical user interface.

Maintained by Matthias Futschik. Last updated 5 months ago.

microarray clustering timecourse preprocessing visualization

7.64 score 338 scripts 4 dependents

mlysy

nicheROVER:Niche Region and Niche Overlap Metrics for Multidimensional Ecological Niches

Implementation of a probabilistic method to calculate 'nicheROVER' (_niche_ _r_egion and niche _over_lap) metrics using multidimensional niche indicator data (e.g., stable isotopes, environmental variables, etc.). The niche region is defined as the joint probability density function of the multidimensional niche indicators at a user-defined probability alpha (e.g., 95%). Uncertainty is accounted for in a Bayesian framework, and the method can be extended to three or more indicator dimensions. It provides directional estimates of niche overlap, accounts for species-specific distributions in multivariate niche space, and produces unique and consistent bivariate projections of the multivariate niche region. The article by Swanson et al. (2015) <doi:10.1890/14-0235.1> provides a detailed description of the methodology. See the package vignette for a worked example using fish stable isotope data.

Maintained by Martin Lysy. Last updated 1 years ago.

9 stars 6.80 score 47 scripts 1 dependents

jazznbass

scan:Single-Case Data Analyses for Single and Multiple Baseline Designs

A collection of procedures for analysing, visualising, and managing single-case data. These include piecewise linear regression models, multilevel models, overlap indices ('PND', 'PEM', 'PAND', 'PET', 'tau-u', 'baseline corrected tau', 'CDC'), and randomization tests. Data preparation functions support outlier detection, handling missing values, scaling, and custom transformations. An export function helps to generate html, word, and latex tables in a publication friendly style. More details can be found in the online book 'Analyzing single-case data with R and scan', Juergen Wilbert (2025) <https://jazznbass.github.io/scan-Book/>.

Maintained by Juergen Wilbert. Last updated 10 days ago.

4 stars 6.47 score 62 scripts 1 dependents

rickhelmus

patRoon:Workflows for Mass-Spectrometry Based Non-Target Analysis

Provides an easy-to-use interface to a mass spectrometry based non-target analysis workflow. Various (open-source) tools are combined which provide algorithms for extraction and grouping of features, extraction of MS and MS/MS data, automatic formula and compound annotation and grouping related features to components. In addition, various tools are provided for e.g. data preparation and cleanup, plotting results and automatic reporting.

Maintained by Rick Helmus. Last updated 8 days ago.

mass-spectrometry non-target cpp openjdk

65 stars 6.24 score 43 scripts

maarten14c

rice:Radiocarbon Equations

Provides functions for the calibration of radiocarbon dates, as well as options to calculate different radiocarbon realms (C14 age, F14C, pMC, D14C) and estimating the effects of contamination or local reservoir offsets (Reimer and Reimer 2001 <doi:10.1017/S0033822200038339>). The methods follow long-established recommendations such as Stuiver and Polach (1977) <doi:10.1017/S0033822200003672> and Reimer et al. (2004) <doi:10.1017/S0033822200033154>. This package complements the data package 'rintcal'.

Maintained by Maarten Blaauw. Last updated 3 months ago.

1 stars 6.13 score 13 scripts 4 dependents

paulrougieux

FAOSTAT:Download Data from the FAOSTAT Database

Download Data from the FAOSTAT Database of the Food and Agricultural Organization (FAO) of the United Nations. A list of functions to download statistics from FAOSTAT (database of the FAO <https://www.fao.org/faostat/>) and WDI (database of the World Bank <https://data.worldbank.org/>), and to perform some harmonization operations.

Maintained by Paul Rougieux. Last updated 7 months ago.

5.30 score 132 scripts

olisansonwu

diyar:Record Linkage and Epidemiological Case Definitions in 'R'

An R package for iterative and batched record linkage, and applying epidemiological case definitions. 'diyar' can be used for deterministic and probabilistic record linkage, or multistage record linkage combining both approaches. It features the implementation of nested match criteria, and mechanisms to address missing data and conflicting matches during stepwise record linkage. Case definitions are implemented by assigning records to groups based on match criteria such as person or place, and overlapping time or duration of events e.g. sample collection dates or periods of hospital stays. Matching records are assigned a unique group ID. Index and duplicate records are removed or further analyses as required.

Maintained by Olisaeloka Nsonwu. Last updated 3 months ago.

6 stars 4.77 score 33 scripts

snoweye

MixSim:Simulating Data to Study Performance of Clustering Algorithms

The utility of this package is in simulating mixtures of Gaussian distributions with different levels of overlap between mixture components. Pairwise overlap, defined as a sum of two misclassification probabilities, measures the degree of interaction between components and can be readily employed to control the clustering complexity of datasets simulated from mixtures. These datasets can then be used for systematic performance investigation of clustering and finite mixture modeling algorithms. Among other capabilities of 'MixSim', there are computing the exact overlap for Gaussian mixtures, simulating Gaussian and non-Gaussian data, simulating outliers and noise variables, calculating various measures of agreement between two partitionings, and constructing parallel distribution plots for the graphical display of finite mixture models.

Maintained by Wei-Chen Chen. Last updated 9 months ago.

openblas

1 stars 4.48 score 84 scripts 3 dependents

stuartwagenius

mateable:Assess Mating Potential in Space and Time

Simulate, manage, visualize, and analyze spatially and temporally explicit datasets of mating potential. Implements methods to calculate synchrony, proximity, and compatibility.Synchrony calculations are based on methods described in Augspurger (1983) <doi:10.2307/2387650>, Kempenaers (1993) <doi:10.2307/3676415>, Ison et al. (2014) <doi:10.3732/ajb.1300065>, and variations on these, as described.

Maintained by Stuart Wagenius. Last updated 2 years ago.

cpp

4.36 score 23 scripts

glgrabow

spreval:Evaluation of Sprinkler Irrigation Uniformity and Efficiency

Processing and analysis of field collected or simulated sprinkler system catch data (depths) to characterize irrigation uniformity and efficiency using standard and other measures. Standard measures include the Christiansen coefficient of uniformity (CU) as found in Christiansen, J.E.(1942, ISBN:0138779295, "Irrigation by Sprinkling"); and distribution uniformity (DU), potential efficiency of the low quarter (PELQ), and application efficiency of the low quarter (AELQ) that are implementations of measures of the same notation in Keller, J. and Merriam, J.L. (1978) "Farm Irrigation System Evaluation: A Guide for Management" <https://pdf.usaid.gov/pdf_docs/PNAAG745.pdf>. spreval::DU.lh is similar to spreval::DU but is the distribution uniformity of the low half instead of low quarter as in DU. spreval::PELQT is a version of spreval::PELQ adapted for traveling systems instead of lateral move or solid-set sprinkler systems. The function spreval::eff is analogous to the method used to compute application efficiency for furrow irrigation presented in Walker, W. and Skogerboe, G.V. (1987,ISBN:0138779295, "Surface Irrigation: Theory and Practice"),that uses piecewise integration of infiltrated depth compared against soil-moisture deficit (SMD), when the argument "target" is set equal to SMD. The other functions contained in the package provide graphical representation of sprinkler system uniformity, and other standard univariate parametric and non-parametric statistical measures as applied to sprinkler system catch depths. A sample data set of field test data spreval::catchcan (catch depths) is provided and is used in examples and vignettes. Agricultural systems emphasized, but this package can be used for landscape irrigation evaluation, and a landscape (turf) vignette is included as an example application.

Maintained by Garry Grabow. Last updated 3 years ago.

4.30 score 9 scripts

arliph

SPARTAAS:Statistical Pattern Recognition and daTing using Archaeological Artefacts assemblageS

Statistical pattern recognition and dating using archaeological artefacts assemblages. Package of statistical tools for archaeology. hclustcompro(perioclust): Bellanger Lise, Coulon Arthur, Husi Philibrary(SPARTlippe (2021, ISBN:978-3-030-60103-4). mapclust: Bellanger Lise, Coulon Arthur, Husi Philippe (2021) <doi:10.1016/j.jas.2021.105431>. seriograph: Desachy Bruno (2004) <doi:10.3406/pica.2004.2396>. cerardat: Bellanger Lise, Husi Philippe (2012) <doi:10.1016/j.jas.2011.06.031>.

Maintained by Arthur Coulon. Last updated 10 months ago.

6 stars 4.14 score 46 scripts

sbissantz

elisr:Exploratory Likert Scaling

An alternative to Exploratory Factor Analysis (EFA) for metrical data in R. Drawing on characteristics of classical test theory, Exploratory Likert Scaling (ELiS) supports the user exploring multiple one-dimensional data structures. In common research practice, however, EFA remains the go-to method to uncover the (underlying) structure of a data set. Orthogonal dimensions and the potential of overextraction are often accepted as side effects. As described in Müller-Schneider (2001) <doi:10.1515/zfsoz-2001-0404>), ELiS confronts these problems. As a result, 'elisr' provides the platform to fully exploit the exploratory potential of the multiple scaling approach itself.

Maintained by Steven Bißantz. Last updated 4 years ago.

1 stars 3.70 score 4 scripts

bioc

segmenter:Perform Chromatin Segmentation Analysis in R by Calling ChromHMM

Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. The latter represents the observed states in a multivariate Markov model to predict the chromatin's underlying states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de-novo. The goal of this package is to call chromHMM from within R, capture the output files in an S4 object and interface to other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.

Maintained by Mahmoud Ahmed. Last updated 5 months ago.

software histonemodification bioconductor chromhmm segmentation-an

4 stars 3.60 score 9 scripts

cran

overlapping:Estimation of Overlapping in Empirical Distributions

Functions for estimating the overlapping area of two or more kernel density estimations from empirical data.

Maintained by Massimiliano Pastore. Last updated 3 months ago.

3.40 score 8 dependents

bioc

OrderedList:Similarities of Ordered Gene Lists

Detection of similarities between ordered lists of genes. Thereby, either simple lists can be compared or gene expression data can be used to deduce the lists. Significance of similarities is evaluated by shuffling lists or by resampling in microarray data, respectively.

Maintained by Claudio Lottaz. Last updated 5 months ago.

microarray differentialexpression multiplecomparison

3.30 score 9 scripts

tyakyol

RVenn:Set Operations for Many Sets

Set operations for many sets. The base functions for set operations in R can be used for only two sets. This package uses 'purr' to find the union, intersection and difference of three or more sets. This package also provides functions for pairwise set operations among several sets. Further, based on 'ggplot2' and 'ggforce', a Venn diagram can be drawn for two or three sets. For bigger data sets, a clustered heatmap showing presence/absence of the elements of the sets can be drawn based on the 'pheatmap' package. Finally, enrichment test can be applied to two sets whether an overlap is statistically significant or not.

Maintained by Turgut Yigit Akyol. Last updated 6 years ago.

1 stars 2.99 score 98 scripts

nicola-zaccarelli

RInSp:R Individual Specialization

Functions to calculate several ecological indices of individual and population niche width (Araujo's E, clustering and pairwise similarity among individuals, IS, Petraitis' W, and Roughgarden's WIC/TNW) to assess individual specialization based on data of resource use. Resource use can be quantified by counts of categories, measures of mass or length, or proportions. Monte Carlo resampling procedures are available for hypothesis testing against multinomial null models. Details are provided in Zaccarelli et al. (2013) <doi:10.1111/2041-210X.12079> and associated references.

Maintained by Dr. Nicola Zaccarelli. Last updated 3 years ago.

2.12 score 33 scripts

cran

birdring:Methods to Analyse Ring Re-Encounter Data

R functions to read EURING data and analyse re-encounter data of birds marked by metal rings. For a tutorial, go to <doi:10.1080/03078698.2014.933053>.

Maintained by Fraenzi Korner-Nievergelt. Last updated 1 years ago.

1.30 score

ekstroem

SuperRanker:Sequential Rank Agreement

Tools for analysing the agreement of two or more rankings of the same items. Examples are importance rankings of predictor variables and risk predictions of subjects. Benchmarks for agreement are computed based on random permutation and bootstrap. See Ekstrøm CT, Gerds TA, Jensen, AK (2018). "Sequential rank agreement methods for comparison of ranked lists." _Biostatistics_, *20*(4), 582-598 <doi:10.1093/biostatistics/kxy017> for more information.

Maintained by Claus Thorn Ekstrøm. Last updated 2 years ago.

cpp

1.23 score 17 scripts