R-universe search: segregation

elbersb

segregation:Entropy-Based Segregation Indices

Computes segregation indices, including the Index of Dissimilarity, as well as the information-theoretic indices developed by Theil (1971) <isbn:978-0471858454>, namely the Mutual Information Index (M) and Theil's Information Index (H). The M, further described by Mora and Ruiz-Castillo (2011) <doi:10.1111/j.1467-9531.2011.01237.x> and Frankel and Volij (2011) <doi:10.1016/j.jet.2010.10.008>, is a measure of segregation that is highly decomposable. The package provides tools to decompose the index by units and groups (local segregation), and by within and between terms. The package also provides a method to decompose differences in segregation as described by Elbers (2021) <doi:10.1177/0049124121986204>. The package includes standard error estimation by bootstrapping, which also corrects for small sample bias. The package also contains functions for visualizing segregation patterns.

Maintained by Benjamin Elbers. Last updated 1 year ago.

entropy segregation statistics cpp

94.6 match 36 stars 6.44 score 51 scripts
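
The indices named in the 'segregation' entry above can be illustrated with a small standalone sketch. This is not the package's API: the function names and the units-by-groups count table are hypothetical, and dividing M by the entropy of the group margin to obtain H is an assumed convention.

```python
import math

def dissimilarity(counts):
    """Index of Dissimilarity for two groups; each row holds (group_a, group_b) counts for one unit."""
    total_a = sum(row[0] for row in counts)
    total_b = sum(row[1] for row in counts)
    return 0.5 * sum(abs(a / total_a - b / total_b) for a, b in counts)

def mutual_information(counts):
    """M = sum over cells of p_ug * log(p_ug / (p_u * p_g)), using natural logs."""
    n = sum(sum(row) for row in counts)
    unit_totals = [sum(row) for row in counts]
    group_totals = [sum(col) for col in zip(*counts)]
    m = 0.0
    for row, n_u in zip(counts, unit_totals):
        for n_ug, n_g in zip(row, group_totals):
            if n_ug > 0:  # empty cells contribute nothing to the sum
                m += (n_ug / n) * math.log(n_ug * n / (n_u * n_g))
    return m

def theil_h(counts):
    """H = M divided by the entropy of the group marginal distribution (assumed convention)."""
    n = sum(sum(row) for row in counts)
    group_totals = [sum(col) for col in zip(*counts)]
    entropy = -sum((g / n) * math.log(g / n) for g in group_totals if g > 0)
    return mutual_information(counts) / entropy

# Hypothetical data: two units (rows) by two groups (columns).
segregated = [[10, 0], [0, 10]]
mixed = [[5, 5], [5, 5]]
print(dissimilarity(segregated), mutual_information(segregated), theil_h(segregated))
print(dissimilarity(mixed), mutual_information(mixed), theil_h(mixed))
```

With complete separation of the two groups, D and H both reach 1; with identical group shares in every unit, all three indices are 0.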

elvanceyhan

pcds:Proximity Catch Digraphs and Their Applications

Contains functions for the construction and visualization of various families of proximity catch digraphs (PCDs) (see Ceyhan (2005) ISBN:978-3-639-19063-2), and for computing the graph invariants used to test patterns of segregation and association against complete spatial randomness (CSR) or uniformity in one-, two-, and three-dimensional cases. The package also has tools for generating points from these spatial patterns. The graph invariants used in testing spatial point data are the domination number (Ceyhan (2011) <doi:10.1080/03610921003597211>) and arc density (Ceyhan et al. (2006) <doi:10.1016/j.csda.2005.03.002>; Ceyhan et al. (2007) <doi:10.1002/cjs.5550350106>). The PCD families considered are Arc-Slice PCDs, Proportional-Edge PCDs, and Central Similarity PCDs.

Maintained by Elvan Ceyhan. Last updated 2 years ago.

27.7 match 5.80 score 21 scripts 2 dependents

mbojan

netseg:Measures of Network Segregation and Homophily

Segregation is a network-level property such that edges between predefined groups of vertices are relatively less likely. Network homophily is an individual-level tendency to form relations with people who are similar on some attribute (e.g. gender, music taste, social status, etc.). In general, homophily leads to segregation, but segregation might arise without homophily. This package implements descriptive indices measuring homophily/segregation. It is a computational companion to Bojanowski & Corten (2014) <doi:10.1016/j.socnet.2014.04.001>.

Maintained by Michal Bojanowski. Last updated 2 years ago.

homophily segregation social-networks

28.9 match 17 stars 5.08 score 14 scripts
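
The network-level notion of segregation in the 'netseg' entry above (fewer edges between groups than within them) can be sketched by counting within-group and between-group edges. This is a standalone illustration, not the package's API: the function names and the toy network are hypothetical. The E-I index of Krackhardt and Stern is used here as one familiar descriptive measure; the entry itself does not say which indices the package implements.

```python
def mixing_counts(edges, group):
    """Return (within, between): edge counts inside groups and across groups."""
    within = sum(1 for u, v in edges if group[u] == group[v])
    between = len(edges) - within
    return within, between

def ei_index(edges, group):
    """E-I index: (external - internal) / (external + internal); -1 means all ties are within groups."""
    within, between = mixing_counts(edges, group)
    return (between - within) / (between + within)

# Hypothetical undirected network with two groups, "x" and "y".
group = {"a": "x", "b": "x", "c": "y", "d": "y"}
edges = [("a", "b"), ("c", "d"), ("a", "c")]

print(mixing_counts(edges, group))
print(ei_index(edges, group))
```

A negative value indicates that ties fall mostly within groups, which is the pattern the entry calls segregation.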

rspatial

terra:Spatial Data Analysis

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

Maintained by Robert J. Hijmans. Last updated 1 day ago.

geospatial raster spatial vector onetbb proj gdal geos cpp

6.0 match 559 stars 17.64 score 17k scripts 851 dependents

elvanceyhan

nnspat:Nearest Neighbor Methods for Spatial Patterns

Contains the functions for testing the spatial patterns (of segregation, spatial symmetry, association, disease clustering, species correspondence and reflexivity) based on nearest neighbor relations, especially using contingency tables such as nearest neighbor contingency tables (Ceyhan (2010) <doi:10.1007/s10651-008-0104-x> and Ceyhan (2017) <doi:10.1016/j.jkss.2016.10.002> and references therein), nearest neighbor symmetry contingency tables (Ceyhan (2014) <doi:10.1155/2014/698296>), species correspondence contingency tables and reflexivity contingency tables (Ceyhan (2018) <doi:10.2436/20.8080.02.72>) for two (or higher) dimensional data. Also contains functions for generating patterns of segregation, association, uniformity in a multi-class setting (Ceyhan (2014) <doi:10.1007/s00477-013-0824-9>), and various non-random labeling patterns for disease clustering in two dimensional cases (Ceyhan (2014) <doi:10.1002/sim.6053>), and for visualization of all these patterns for the two dimensional data. The tests are usually (asymptotic) normal z-tests and chi-square tests.

Maintained by Elvan Ceyhan. Last updated 3 years ago.

36.1 match 2.90 score 16 scripts

yuanmingzhang

SEA:Segregation Analysis

A few major genes and a series of polygenes are responsible for each quantitative trait. Major genes are identified individually, while polygenes are detected collectively. This is mixed major genes plus polygene inheritance analysis, or segregation analysis (SEA). In SEA, phenotypes from a single or multiple bi-parental segregation populations, along with their parents, are used to fit all possible models, and the best-fitting model for the population phenotypic distributions is taken as the model of the trait. There are fourteen types of population combinations available. Zhang Yuan-Ming, Gai Jun-Yi, Yang Yong-Hua (2003, <doi:10.1017/S0016672303006141>).

Maintained by Yuan-Ming Zhang. Last updated 3 years ago.

33.0 match 2.26 score 18 scripts

idblr

ndi:Neighborhood Deprivation Indices

Computes various geospatial indices of socioeconomic deprivation and disparity in the United States. Some indices are considered "spatial" because they consider the values of neighboring (i.e., adjacent) census geographies in their computation, while other indices are "aspatial" because they only consider the value within each census geography. Two types of aspatial neighborhood deprivation indices (NDI) are available: (1) based on Messer et al. (2006) <doi:10.1007/s11524-006-9094-x> and (2) based on Andrews et al. (2020) <doi:10.1080/17445647.2020.1750066> and Slotman et al. (2022) <doi:10.1016/j.dib.2022.108002> who use variables chosen by Roux and Mair (2010) <doi:10.1111/j.1749-6632.2009.05333.x>. Both are a decomposition of multiple demographic characteristics from the U.S. Census Bureau American Community Survey 5-year estimates (ACS-5; 2006-2010 onward). Using data from the ACS-5 (2005-2009 onward), the package can also compute indices of racial or ethnic residential segregation, including but not limited to those discussed in Massey & Denton (1988) <doi:10.1093/sf/67.2.281>, and additional indices of socioeconomic disparity.

Maintained by Ian D. Buller. Last updated 7 months ago.

census census-api census-data deprivation deprivation-stats disparity geospatial geospatial-data metric-development principal-component-analysis segregation-measures socio-economic-indicators

11.0 match 21 stars 6.67 score 7 scripts 1 dependents

spatstat

spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.

Maintained by Adrian Baddeley. Last updated 1 month ago.

cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics

6.6 match 1 stars 10.17 score 67 scripts 148 dependents

nowosad

raceland:Pattern-Based Zoneless Method for Analysis and Visualization of Racial Topography

Implements a computational framework for pattern-based, zoneless analysis and visualization of (ethno)racial topography (Dmowska, Stepinski, and Nowosad (2020) <doi:10.1016/j.apgeog.2020.102239>). It is a reimagined approach for analyzing residential segregation and racial diversity based on the concept of 'landscape' used in the domain of landscape ecology.

Maintained by Jakub Nowosad. Last updated 2 years ago.

information-theory landscape racial-diversity raster residential-segregation spatial cpp

12.9 match 9 stars 5.21 score 12 scripts

bioc

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

8.1 match 7.27 score 251 scripts 1 dependents

cristianetaniguti

onemap:Construction of Genetic Maps in Experimental Crosses

Analysis of molecular marker data from model (backcrosses, F2 and recombinant inbred lines) and non-model systems (i.e., outcrossing species). For the latter, it allows statistical analysis by simultaneously estimating linkage and linkage phases (genetic map construction) according to Wu et al. (2002) <doi:10.1006/tpbi.2002.1577>. All analyses are based on multipoint approaches using hidden Markov models.

Maintained by Cristiane Taniguti. Last updated 2 months ago.

cpp

9.0 match 3 stars 6.58 score 183 scripts

cran

OasisR:Outright Tool for the Analysis of Spatial Inequalities and Segregation

A comprehensive set of indexes and tests for social segregation analysis, as described in Tivadar (2019) - 'OasisR': An R Package to Bring Some Order to the World of Segregation Measurement <doi:10.18637/jss.v089.i07>. The package is the most complete existing tool and it clarifies many ambiguities and errors regarding the definition of segregation indices. Additionally, 'OasisR' introduces several resampling methods that enable testing their statistical significance (randomization tests, bootstrapping, and jackknife methods).

Maintained by Mihai Tivadar. Last updated 4 months ago.

30.3 match 2 stars 1.78 score 1 dependents

dcgerard

segtest:Tests for Segregation Distortion in Polyploids

Provides a suite of tests for segregation distortion in F1 polyploid populations (for now, just tetraploids). This is under different assumptions of meiosis. Details of these methods are described in Gerard et al. (2025) <doi:10.1007/s00122-025-04816-z>. This material is based upon work supported by the National Science Foundation under Grant No. 2132247. The opinions, findings, and conclusions or recommendations expressed are those of the author and do not necessarily reflect the views of the National Science Foundation.

Maintained by David Gerard. Last updated 2 months ago.

cpp

10.5 match 1 stars 4.85 score 3 scripts

vitomuggeo

segmented:Regression Models with Break-Points / Change-Points Estimation (with Possibly Random Effects)

Fitting regression models where, in addition to possible linear terms, one or more covariates have segmented (i.e., broken-line or piece-wise linear) or stepmented (i.e. piece-wise constant) effects. Multiple breakpoints for the same variable are allowed. The estimation method is discussed in Muggeo (2003, <doi:10.1002/sim.1545>) and illustrated in Muggeo (2008, <https://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf>). An approach for hypothesis testing is presented in Muggeo (2016, <doi:10.1080/00949655.2016.1149855>), and interval estimation for the breakpoint is discussed in Muggeo (2017, <doi:10.1111/anzs.12200>). Segmented mixed models, i.e. random effects in the change point, are discussed in Muggeo (2014, <doi:10.1177/1471082X13504721>). Estimation of piecewise-constant relationships and changepoints (mean-shift models) is discussed in Fasola et al. (2018, <doi:10.1007/s00180-017-0740-4>).

Maintained by Vito M. R. Muggeo. Last updated 17 days ago.

5.0 match 9 stars 10.03 score 1.2k scripts 203 dependents

rafaelfuentealbac

mutualinf:Computation and Decomposition of the Mutual Information Index

The Mutual Information Index (M) introduced to social science literature by Theil and Finizza (1971) <doi:10.1080/0022250X.1971.9989795> is a multigroup segregation measure that is highly decomposable and that according to Frankel and Volij (2011) <doi:10.1016/j.jet.2010.10.008> and Mora and Ruiz-Castillo (2011) <doi:10.1111/j.1467-9531.2011.01237.x> satisfies the Strong Unit Decomposability and Strong Group Decomposability properties. This package allows computing and decomposing the total index value into its "between" and "within" terms. These last terms can also be decomposed into their contributions, either by group or unit characteristics. The factors that produce each "within" term can also be displayed at the user's request. The results can be computed considering a variable or sets of variables that define separate clusters.

Maintained by Cristian Angulo-Gonzalez. Last updated 2 months ago.

openblas cpp openmp

11.8 match 2 stars 4.20 score 7 scripts

pbourkey

polyqtlR:QTL Analysis in Autopolyploid Bi-Parental F1 Populations

Quantitative trait loci (QTL) analysis and exploration of meiotic patterns in autopolyploid bi-parental F1 populations. For all ploidy levels, identity-by-descent (IBD) probabilities can be estimated. Functions for computing significance thresholds, exploring QTL allele effects, and visualising results are provided. For more background and to reference the package see <doi:10.1093/bioinformatics/btab574>.

Maintained by Peter Bourke. Last updated 1 year ago.

cpp

20.6 match 2.30 score 2 scripts

gitumino

fitPoly:Genotype Calling for Bi-Allelic Marker Assays

Genotyping assays for bi-allelic markers (e.g. SNPs) produce signal intensities for the two alleles. 'fitPoly' assigns genotypes (allele dosages) to a collection of polyploid samples based on these signal intensities. 'fitPoly' replaces the older package 'fitTetra', which was limited, among other things, to tetraploid populations, whereas 'fitPoly' accepts any ploidy level. Reference: Voorrips RE, Gort G, Vosman B (2011) <doi:10.1186/1471-2105-12-172>. New functions have been added for converting data from SNP array software formats, drawing XY-scatterplots with or without genotype colors, checking against expected F1 segregation patterns, comparing results from two different assays (probes) for the same SNP, and recovering from a saveMarkerModels() crash.

Maintained by Giorgio Tumino. Last updated 1 month ago.

10.5 match 4.13 score 15 scripts

petebaker

polySegratioMM:Bayesian Mixture Models for Marker Dosage in Autopolyploids

Fits Bayesian mixture models to estimate marker dosage for dominant markers in autopolyploids using JAGS (1.0 or greater), as outlined in Baker et al., "Bayesian estimation of marker dosage in sugarcane and other autopolyploids" (2010, <doi:10.1007/s00122-010-1283-z>). May be used in conjunction with polySegratio for simulation studies and comparison with standard methods.

Maintained by Peter Baker. Last updated 7 years ago.

10.2 match 4.08 score 24 scripts

petebaker

polySegratio:Simulate and Test Marker Dosage for Dominant Markers in Autopolyploids

Performs classic chi-squared tests and the binomial confidence interval approach of Ripol et al. (1999) for autopolyploid dominant markers. Also, dominant markers may be generated for families of offspring where either one or both of the parents possess the marker. Missing values and misclassified markers may be generated at random.

Maintained by Peter Baker. Last updated 7 years ago.

9.2 match 4.56 score 24 scripts 1 dependents

gaynorr

AlphaSimR:Breeding Program Simulations

The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) <doi:10.3835/plantgenome2016.02.0013>]. Used for stochastic simulations of breeding programs to the level of DNA sequence for every individual. Contained is a wide range of functions for modeling common tasks in a breeding program, such as selection and crossing. These functions allow for constructing simulations of highly complex plant and animal breeding programs via scripting in the R software environment. Such simulations can be used to evaluate overall breeding program performance and conduct research into breeding program design, such as implementation of genomic selection. Included is the 'Markovian Coalescent Simulator' ('MaCS') for fast simulation of biallelic sequences according to a population demographic history [Chen et al. (2009) <doi:10.1101/gr.083634.108>].

Maintained by Chris Gaynor. Last updated 5 months ago.

breeding genomics simulation openblas cpp openmp

3.8 match 47 stars 10.22 score 534 scripts 2 dependents

dcgerard

hwep:Hardy-Weinberg Equilibrium in Polyploids

Inference concerning equilibrium and random mating in autopolyploids. Methods are available to test for equilibrium and random mating at any even ploidy level (>2) in the presence of double reduction at biallelic loci. For autopolyploid populations in equilibrium, methods are available to estimate the degree of double reduction. We also provide functions to calculate genotype frequencies at equilibrium, or after one or several rounds of random mating, given rates of double reduction. The main function is hwefit(). This material is based upon work supported by the National Science Foundation under Grant No. 2132247. The opinions, findings, and conclusions or recommendations expressed are those of the author and do not necessarily reflect the views of the National Science Foundation. For details of these methods, see Gerard (2023a) <doi:10.1111/biom.13722> and Gerard (2023b) <doi:10.1111/1755-0998.13856>.

Maintained by David Gerard. Last updated 2 years ago.

cpp

7.3 match 3 stars 4.72 score 35 scripts

ekstroem

MESS:Miscellaneous Esoteric Statistical Scripts

A mixed collection of useful and semi-useful diverse statistical functions, some of which may even be referenced in The R Primer book. See Ekstrøm, C. T. (2016). The R Primer. 2nd edition. Chapman & Hall.

Maintained by Claus Thorn Ekstrøm. Last updated 29 days ago.

biostatistics power-analysis statistical-analysis statistical-methods statistical-models openblas cpp

4.3 match 4 stars 7.76 score 328 scripts 13 dependents

cran

GESE:Gene-Based Segregation Test

Implements the gene-based segregation test (GESE) and the weighted GESE test for identifying genes with causal variants of large effects for family-based sequencing data. The methods are described in Qiao, D., Lange, C., Laird, N.M., Won, S., Hersh, C.P., et al. (2017). Gene-based segregation method for identifying rare variants for family-based sequencing studies. Genet Epidemiol 41(4):309-319. <DOI:10.1002/gepi.22037>. More details can be found at <http://scholar.harvard.edu/dqiao/gese>.

Maintained by Dandi Qiao. Last updated 8 years ago.

15.7 match 2.08 score 12 scripts

eriqande

gscramble:Simulating Admixed Genotypes Without Replacement

A genomic simulation approach for creating biologically informed individual genotypes from empirical data that 1) samples alleles from populations without replacement, 2) segregates alleles based on species-specific recombination rates. 'gscramble' is a flexible simulation approach that allows users to create pedigrees of varying complexity in order to simulate admixed genotypes. Furthermore, it allows users to track haplotype blocks from the source populations through the pedigrees.

Maintained by Eric C. Anderson. Last updated 1 year ago.

noaa-omics-software

6.5 match 4.83 score 15 scripts

emmanuelparadis

ape:Analyses of Phylogenetics and Evolution

Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.

Maintained by Emmanuel Paradis. Last updated 16 hours ago.

openblas cpp

1.8 match 64 stars 17.22 score 13k scripts 599 dependents

statgenlmu

coala:A Framework for Coalescent Simulation

Coalescent simulators can rapidly simulate biological sequences evolving according to a given model of evolution. You can use this package to specify such models, to conduct the simulations and to calculate additional statistics from the results (Staab, Metzler, 2016 <doi:10.1093/bioinformatics/btw098>). It relies on existing simulators for doing the simulation, and currently supports the programs 'ms', 'msms' and 'scrm'. It also supports finite-sites mutation models by combining the simulators with the program 'seq-gen'. Coala provides functions for calculating certain summary statistics, which can also be applied to actual biological data. One possibility to import data is through the 'PopGenome' package (<https://github.com/pievos101/PopGenome>).

Maintained by Dirk Metzler. Last updated 1 year ago.

coalescent dna evolution popgen simulation cpp

4.1 match 23 stars 7.06 score 84 scripts

flr

FLCore:Core Package of FLR, Fisheries Modelling in R

Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.

Maintained by Iago Mosqueira. Last updated 10 days ago.

fisheries flr fisheries-modelling

3.3 match 16 stars 8.78 score 956 scripts 23 dependents

teunbrand

ggh4x:Hacks for 'ggplot2'

A 'ggplot2' extension that does a variety of little helpful things. The package extends 'ggplot2' facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. Also allows multiple colour and fill scales per plot. Also hosts a smaller collection of stats, geoms and axis guides.

Maintained by Teun van den Brand. Last updated 3 months ago.

ggplot-extension ggplot2

2.0 match 616 stars 13.98 score 4.4k scripts 20 dependents

mmollina

mappoly:Genetic Linkage Maps in Autopolyploids

Construction of genetic maps in autopolyploid full-sib populations. Uses pairwise recombination fraction estimation as the first source of information to sequentially position allelic variants in specific homologous chromosomes. For situations where pairwise analysis has limited power, the algorithm relies on the multilocus likelihood obtained through a hidden Markov model (HMM). For more detail, please see Mollinari and Garcia (2019) <doi:10.1534/g3.119.400378> and Mollinari et al. (2020) <doi:10.1534/g3.119.400620>.

Maintained by Marcelo Mollinari. Last updated 11 days ago.

polyploid polyploid-genetic-mapping polyploidy cpp

3.5 match 27 stars 7.56 score 111 scripts 1 dependents

magnusdv

segregatr:Segregation Analysis for Variant Interpretation

An implementation of the full-likelihood Bayes factor (FLB) for evaluating segregation evidence in clinical medical genetics. The method was introduced by Thompson et al. (2003) <doi:10.1086/378100>. This implementation supports custom penetrance values and liability classes, and allows visualisations and robustness analysis as presented in Ratajska et al. (2023) <doi:10.1002/mgg3.2107>. See also the online app 'shinyseg', <https://chrcarrizosa.shinyapps.io/shinyseg>, which offers interactive segregation analysis with many additional features (Carrizosa et al. (2024) <doi:10.1093/bioinformatics/btae201>).

Maintained by Magnus Dehli Vigeland. Last updated 9 months ago.

5.8 match 3 stars 4.03 score 12 scripts 1 dependents

adamlilith

fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'

Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeo4W' installer version) of 'GRASS GIS' 8.0 or higher.

Maintained by Adam B. Smith. Last updated 19 days ago.

aspect distance fragmentation fragmentation-indices gis grass grass-gis raster raster-projection rasterize slope topography vectorization

3.0 match 58 stars 7.69 score 8 scripts

barryrowlingson

spatialkernel:Non-Parametric Estimation of Spatial Segregation in a Multivariate Point Process

Edge-corrected kernel density estimation and binary kernel regression estimation for multivariate spatial point process data. For details, see Diggle, P.J., Zheng, P. and Durr, P. A. (2005) <doi:10.1111/j.1467-9876.2005.05373.x>.

Maintained by Virgilio Gómez-Rubio. Last updated 4 years ago.

fortran

8.6 match 2.54 score 23 scripts

pbourkey

polymapR:Linkage Analysis in Outcrossing Polyploids

Creation of linkage maps in polyploid species from marker dosage scores of an F1 cross from two heterozygous parents. Currently works for outcrossing diploid, autotriploid, autotetraploid and autohexaploid species, as well as segmental allotetraploids. Methods are described in a manuscript of Bourke et al. (2018) <doi:10.1093/bioinformatics/bty371>. Since version 1.1.0, both discrete and probabilistic genotypes are acceptable input; for more details on the latter see Liao et al. (2021) <doi:10.1007/s00122-021-03834-x>.

Maintained by Peter Bourke. Last updated 10 months ago.

5.4 match 1 stars 4.03 score 54 scripts

highlanderlab

SIMplyBee:'AlphaSimR' Extension for Simulating Honeybee Populations and Breeding Programmes

An extension of the 'AlphaSimR' package (<https://cran.r-project.org/package=AlphaSimR>) for stochastic simulations of honeybee populations and breeding programmes. 'SIMplyBee' enables simulation of individual bees that form a colony, which includes a queen, fathers (drones the queen mated with), virgin queens, workers, and drones. Multiple colonies can be merged into a population of colonies, such as an apiary or a whole country of colonies. Functions enable operations on castes, a colony, or colonies, to ease 'R' scripting of whole populations. All 'AlphaSimR' functionality with respect to genomes and genetic and phenotype values is available and further extended for honeybees, including haplo-diploidy, complementary sex determiner locus, colony events (swarming, supersedure, etc.), and colony phenotype values.

Maintained by Jana Obšteter. Last updated 6 months ago.

cpp openmp

3.5 match 2 stars 6.24 score 18 scripts

hojsgaard

gRain:Bayesian Networks

Probability propagation in Bayesian networks, also known as graphical independence networks. Documentation of the package is provided in vignettes included in the package and in the paper by Højsgaard (2012, <doi:10.18637/jss.v046.i10>). See 'citation("gRain")' for details.

Maintained by Søren Højsgaard. Last updated 5 months ago.

cpp

2.3 match 2 stars 9.13 score 408 scripts 8 dependents

bioc

MotifPeeker:Benchmarking Epigenomic Profiling Methods Using Motif Enrichment

MotifPeeker is used to compare and analyse datasets from epigenomic profiling methods with motif enrichment as the key benchmark. The package outputs an HTML report consisting of three sections: (1. General Metrics) Overview of peaks-related general metrics for the datasets (FRiP scores, peak widths and motif-summit distances). (2. Known Motif Enrichment Analysis) Statistics for the frequency of user-provided motifs enriched in the datasets. (3. De-Novo Motif Enrichment Analysis) Statistics for the frequency of de-novo discovered motifs enriched in the datasets and compared with known motifs.

Maintained by Hiranyamaya Dash. Last updated 2 months ago.

epigenetics genetics qualitycontrol chipseq multiplecomparison functionalgenomics motifdiscovery sequencematching software alignment bioconductor bioconductor-package chip-seq epigenomics interactive-report motif-enrichment-analysis

3.6 match 2 stars 5.48 score 6 scripts

txm676

nos:Compute Node Overlap and Segregation in Ecological Networks

Calculate NOS (node overlap and segregation) and the associated metrics described in Strona and Veech (2015) <DOI:10.1111/2041-210X.12395> and Strona et al. (2017, In Press). The functions provided in the package enable assessment of structural patterns ranging from complete node segregation to perfect nestedness in a variety of network types. In addition, they provide a measure of network modularity.

Maintained by Thomas J. Matthews. Last updated 12 months ago.

5.4 match 3.18 score 15 scripts

alarm-redist

redist:Simulation Methods for Legislative Redistricting

Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.

Maintained by Christopher T. Kenny. Last updated 2 months ago.

geospatial gerrymandering redistricting sampling openblas cpp openmp

1.8 match 68 stars 9.17 score 259 scripts

mbojan

isnar:Introduction to Social Network Analysis with R

Functions and datasets accompanying the workshop "Introduction to Social Network Analysis with R" held at the annual INSNA Sunbelt conferences.

Maintained by Michal Bojanowski. Last updated 4 years ago.

5.7 match 8 stars 2.86 score 18 scripts

bodkan

slendr:A Simulation Framework for Spatiotemporal Population Genetics

A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can therefore be constructed without leaving the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.

Maintained by Martin Petr. Last updated 12 days ago.

popgen population-genetics simulations spatial-statistics

1.7 match 56 stars 9.15 score 88 scripts

alarm-redist

redistmetrics:Redistricting Metrics

Reliable and flexible tools for scoring redistricting plans using common measures and metrics. These functions provide key direct access to tools useful for non-simulation analyses of redistricting plans, such as for measuring compactness or partisan fairness. Tools are designed to work with the 'redist' package seamlessly.

Maintained by Christopher T. Kenny. Last updated 9 months ago.

openblas cpp

2.0 match 10 stars 7.57 score 23 scripts 2 dependents

jonesor

Rage:Life History Metrics from Matrix Population Models

Functions for calculating life history metrics using matrix population models ('MPMs'). Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.

Maintained by Owen Jones. Last updated 3 months ago.

1.7 match 11 stars 8.17 score 62 scripts 1 dependent

emmanuelparadis

pegas:Population and Evolutionary Genetics Analysis System

Functions for reading, writing, plotting, analysing, and manipulating allelic and haplotypic data, including from VCF files, and for the analysis of population nucleotide sequences and micro-satellites including coalescent analyses, linkage disequilibrium, population structure (Fst, Amova) and equilibrium (HWE), haplotype networks, minimum spanning tree and network, and median-joining networks.

Maintained by Emmanuel Paradis. Last updated 1 year ago.

1.8 match 7.53 score 576 scripts 18 dependents

ices-tools-prod

msy:Estimation of Equilibrium Reference Points for Fisheries

Methods to estimate equilibrium reference points for fisheries data. Currently data must be converted into FLStock objects of the FLR (Fisheries Library in R) style, defined in the R package FLCore.

Maintained by Colin Millar. Last updated 2 years ago.

3.3 match 13 stars 3.91 score 42 scripts

fridleylab

spatialTIME:Spatial Analysis of Vectra Immunofluorescent Data

Visualization and analysis of Vectra Immunofluorescent data. Options for calculating both the univariate and bivariate Ripley's K are included. Calculations are performed using a permutation-based approach presented by Wilson et al. <doi:10.1101/2021.04.27.21256104>.

Maintained by Fridley Lab. Last updated 7 months ago.

1.9 match 4 stars 6.08 score 30 scripts

bioc

QSutils:Quasispecies Diversity

Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and exploration—functions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indices—functions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulation—functions useful for generating random viral quasispecies data.

Maintained by Mercedes Guerrero-Murillo. Last updated 5 months ago.

software genetics dnaseq geneticvariability sequencing alignment sequencematching dataimport

1.9 match 5.56 score 8 scripts 1 dependent

christopherkenny

divseg:Calculate Diversity and Segregation Indices

Implements common measures of diversity and spatial segregation. This package has tools to compute the majority of the measures reviewed in Massey and Denton (1988) <doi:10.2307/2579183>. Multiple common measures of within-geography diversity are implemented as well. All functions operate on data frames with a 'tidyselect' based workflow.

Maintained by Christopher T. Kenny. Last updated 10 months ago.

3.6 match 1 stars 2.78 score 12 scripts
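
As an illustration of the kind of measure covered by the Massey and Denton (1988) review, the Index of Dissimilarity can be sketched in a few lines. The Python below is a standalone sketch, not the divseg API (divseg is an R package); the function name `dissimilarity_index` and the tract counts are hypothetical.

```python
def dissimilarity_index(group_a, group_b):
    """Index of Dissimilarity for two groups counted over the same areal units."""
    if len(group_a) != len(group_b):
        raise ValueError("both groups need one count per unit")
    total_a = sum(group_a)
    total_b = sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs at least one member")
    # D = 1/2 * sum_i | a_i / A - b_i / B |
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Hypothetical counts for four tracts (illustrative only).
even = dissimilarity_index([10, 20, 30, 40], [1, 2, 3, 4])    # same shares in every tract
split = dissimilarity_index([50, 50, 0, 0], [0, 0, 30, 70])   # no tract is shared
```

The index is 0 when both groups have identical shares in every unit and 1 when no unit contains members of both groups.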

elvanceyhan

pcds.ugraph:Underlying Graphs of Proximity Catch Digraphs and Their Applications

Contains the functions for construction and visualization of underlying and reflexivity graphs of the three families of the proximity catch digraphs (PCDs) (see Ceyhan (2005) ISBN:978-3-639-19063-2), and for computing the edge density of these PCD-based graphs which are then used for testing the patterns of segregation and association against complete spatial randomness (CSR) or uniformity in one and two dimensional cases. The PCD families considered are Arc-Slice PCDs, Proportional-Edge (PE) PCDs (Ceyhan et al. (2006) <doi:10.1016/j.csda.2005.03.002>) and Central Similarity PCDs (Ceyhan et al. (2007) <doi:10.1002/cjs.5550350106>). See also Ceyhan (2016) <doi:10.1016/j.stamet.2016.07.003> for edge density of the underlying and reflexivity graphs of PE-PCDs. The package also has tools for visualization of PCD-based graphs for one, two, and three dimensional data.

Maintained by Elvan Ceyhan. Last updated 2 years ago.

3.7 match 2.70 score

lucasnell

jackalope:A Swift, Versatile Phylogenomic and High-Throughput Sequencing Simulator

Simply and efficiently simulates (i) variants from reference genomes and (ii) reads from both Illumina <https://www.illumina.com/> and Pacific Biosciences (PacBio) <https://www.pacb.com/> platforms. It can either read reference genomes from FASTA files or simulate new ones. Genomic variants can be simulated using summary statistics, phylogenies, Variant Call Format (VCF) files, and coalescent simulations—the latter of which can include selection, recombination, and demographic fluctuations. 'jackalope' can simulate single, paired-end, or mate-pair Illumina reads, as well as PacBio reads. These simulations include sequencing errors, mapping qualities, multiplexing, and optical/polymerase chain reaction (PCR) duplicates. Simulating Illumina sequencing is based on ART by Huang et al. (2012) <doi:10.1093/bioinformatics/btr708>. PacBio sequencing simulation is based on SimLoRD by Stöcker et al. (2016) <doi:10.1093/bioinformatics/btw286>. All outputs can be written to standard file formats.

Maintained by Lucas A. Nell. Last updated 1 year ago.

zlib openblas curl bzip2 xz-utils cpp

1.7 match 8 stars 5.28 score 24 scripts

spatstat

spatstat:Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests

Comprehensive open-source toolbox for analysing Spatial Point Patterns. Focused mainly on two-dimensional point patterns, including multitype/marked points, in any spatial region. Also supports three-dimensional point patterns, space-time point patterns in any number of dimensions, point patterns on a linear network, and patterns of other geometrical objects. Supports spatial covariate data such as pixel images. Contains over 3000 functions for plotting spatial data, exploratory data analysis, model-fitting, simulation, spatial sampling, model diagnostics, and formal inference. Data types include point patterns, line segment patterns, spatial windows, pixel images, tessellations, and linear networks. Exploratory methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the functions ppm(), kppm(), slrm(), dppm() similar to glm(). Types of models include Poisson, Gibbs and Cox point processes, Neyman-Scott cluster processes, and determinantal point processes. Models may involve dependence on covariates, inter-point interaction, cluster formation and dependence on marks. Models are fitted by maximum likelihood, logistic regression, minimum contrast, and composite likelihood methods. A model can be fitted to a list of point patterns (replicated point pattern data) using the function mppm(). 
The model can include random effects and fixed effects depending on the experimental design, in addition to all the features listed above. Fitted point process models can be simulated, automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr). Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots.

Maintained by Adrian Baddeley. Last updated 2 months ago.

cluster-process cox-point-process gibbs-process kernel-density network-analysis point-process poisson-process spatial-analysis spatial-data spatial-data-analysis spatial-statistics spatstat statistical-methods statistical-models statistical-tests statistics

0.5 match 200 stars 16.32 score 5.5k scripts 41 dependents

minoo-asty

CINNA:Deciphering Central Informative Nodes in Network Analysis

Computing, comparing, and demonstrating top informative centrality measures within a network. "CINNA: an R/CRAN package to decipher Central Informative Nodes in Network Analysis" provides a comprehensive overview of the package functionality; see Ashtiani et al. (2018) <doi:10.1093/bioinformatics/bty819>.

Maintained by Minoo Ashtiani. Last updated 2 years ago.

2.4 match 1 stars 3.29 score 98 scripts

rivasiker

PhaseTypeR:General-Purpose Phase-Type Functions

General implementation of core functions from phase-type theory. 'PhaseTypeR' can be used to model continuous and discrete phase-type distributions, both univariate and multivariate. The package includes functions for outputting the mean and (co)variance of phase-type distributions; their density, probability and quantile functions; functions for random draws; functions for reward-transformation; and functions for plotting the distributions as networks. For more information on these functions please refer to Bladt and Nielsen (2017, ISBN: 978-1-4939-8377-3) and Campillo Navarro (2019) <https://orbit.dtu.dk/en/publications/order-statistics-and-multivariate-discrete-phase-type-distributio>.

Maintained by Iker Rivas-González. Last updated 2 years ago.

1.1 match 2 stars 5.37 score 39 scripts

andymckenzie

bayesbio:Miscellaneous Functions for Bioinformatics and Bayesian Statistics

A hodgepodge of hopefully helpful functions. Two of these perform shrinkage estimation: one using a simple weighted method where the user can specify the degree of shrinkage required, and one using James-Stein shrinkage estimation for the case of unequal variances.

Maintained by Andrew McKenzie. Last updated 6 years ago.

1.7 match 1 stars 3.18 score 30 scripts

spatstat

spatstat.model:Parametric Statistical Modelling and Inference for the 'spatstat' Family

Functionality for parametric statistical modelling and inference for spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Supports parametric modelling, formal statistical inference, and model validation. Parametric models include Poisson point processes, Cox point processes, Neyman-Scott cluster processes, Gibbs point processes and determinantal point processes. Models can be fitted to data using maximum likelihood, maximum pseudolikelihood, maximum composite likelihood and the method of minimum contrast. Fitted models can be simulated and predicted. Formal inference includes hypothesis tests (quadrat counting tests, Cressie-Read tests, Clark-Evans test, Berman test, Diggle-Cressie-Loosmore-Ford test, scan test, studentised permutation test, segregation test, ANOVA tests of fitted models, adjusted composite likelihood ratio test, envelope tests, Dao-Genton test, balanced independent two-stage test), confidence intervals for parameters, and prediction intervals for point counts. Model validation techniques include leverage, influence, partial residuals, added variable plots, diagnostic plots, pseudoscore residual plots, model compensators and Q-Q plots.

Maintained by Adrian Baddeley. Last updated 8 days ago.

analysis-of-variance cluster-process confidence-intervals cox-process determinantal-point-processes gibbs-process influence leverage model-diagnostics neyman-scott parameter-estimation poisson-process spatial-analysis spatial-modelling spatial-point-processes statistical-inference

0.5 match 5 stars 9.09 score 6 scripts 46 dependents

lvclark

polysat:Tools for Polyploid Microsatellite Analysis

A collection of tools to handle microsatellite data of any ploidy (and samples of mixed ploidy) where allele copy number is not known in partially heterozygous genotypes. It can import and export data in ABI 'GeneMapper', 'Structure', 'ATetra', 'Tetrasat'/'Tetra', 'GenoDive', 'SPAGeDi', 'POPDIST', 'STRand', and binary presence/absence formats. It can calculate pairwise distances between individuals using a stepwise mutation model or infinite alleles model, with or without taking ploidies and allele frequencies into account. These distances can be used for the calculation of clonal diversity statistics or used for further analysis in R. Allelic diversity statistics and Polymorphic Information Content are also available. polysat can assist the user in estimating the ploidy of samples, and it can estimate allele frequencies in populations, calculate pairwise or global differentiation statistics based on those frequencies, and export allele frequencies to 'SPAGeDi' and 'adegenet'. Functions are also included for assigning alleles to isoloci in cases where one pair of microsatellite primers amplifies alleles from two or more independently segregating isoloci. polysat is described by Clark and Jasieniuk (2011) <doi:10.1111/j.1755-0998.2011.02985.x> and Clark and Schreier (2017) <doi:10.1111/1755-0998.12639>.

Maintained by Lindsay V. Clark. Last updated 3 months ago.

cpp

0.5 match 11 stars 7.75 score 72 scripts 1 dependent

cran

SegEnvIneq:Environmental Inequality Indices Based on Segregation Measures

A set of segregation-based indices and randomization methods to make robust environmental inequality assessments, as described in Schaeffer and Tivadar (2019) "Measuring Environmental Inequalities: Insights from the Residential Segregation Literature" <doi:10.1016/j.ecolecon.2019.05.009>.

Maintained by Mihai Tivadar. Last updated 4 months ago.

3.7 match 1.00 score

roelandv

PolyHaplotyper:Assignment of Haplotypes Based on SNP Dosages in Diploids and Polyploids

Infer the genetic composition of individuals in terms of haplotype dosages for a haploblock, based on bi-allelic marker dosages, for any ploidy level. Reference: Voorrips and Tumino: PolyHaplotyper: haplotyping in polyploids based on bi-allelic marker dosage data. Submitted to BMC Bioinformatics (2021).

Maintained by Roeland E. Voorrips. Last updated 4 years ago.

1.2 match 2.00 score 3 scripts

parichit

DCEM:Clustering Big Data using Expectation Maximization Star (EM*) Algorithm

Implements the Improved Expectation Maximisation EM* and the traditional EM algorithm for clustering big data (Gaussian mixture models for both multivariate and univariate datasets). This version implements the faster alternative-EM* that expedites convergence via structure-based data segregation. The implementation supports both random and K-means++ based initialization. Reference: Parichit Sharma, Hasan Kurban, Mehmet Dalkilic (2022) <doi:10.1016/j.softx.2021.100944>. Hasan Kurban, Mark Jenne, Mehmet Dalkilic (2016) <doi:10.1007/s41060-017-0062-1>.

Maintained by Sharma Parichit. Last updated 3 years ago.

cpp

0.5 match 3 stars 4.48 score 7 scripts

strancsus

scCAN:Single-Cell Clustering using Autoencoder and Network Fusion

A single-cell clustering method using 'Autoencoder' and network fusion ('scCAN'), described by Bang Tran (2022) <doi:10.1038/s41598-022-14218-6>, for segregating cells from high-dimensional 'scRNA-Seq' data. The software automatically determines the optimal number of clusters and then partitions the cells such that the results are robust to noise and dropouts. 'scCAN' is fast and supports Windows, Linux, and Mac OS.

Maintained by Bang Tran. Last updated 9 months ago.

0.5 match 2.70 score

mngar

simulMGF:Simulate SNP Matrix, Phenotype and Genotypic Effects

Simulate genotypes in a SNP (single nucleotide polymorphism) matrix as random numbers from a uniform distribution, for diploid organisms (coded by 0, 1, 2), Sikorska et al. (2013) <doi:10.1186/1471-2105-14-166>, or a half-sib/full-sib SNP matrix from real or simulated parental SNP data, assuming Mendelian segregation. Simulate phenotypic traits for real or simulated SNP data, controlled by a specific number of quantitative trait loci and their effects, sampled from a Normal or a Uniform distribution, assuming a pure additive model. This is useful for testing association and genomic prediction models or for educational purposes.

Maintained by Martin Nahuel Garcia. Last updated 2 years ago.

0.5 match 2.70 score 1 script
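
The simulation idea in the simulMGF description (genotypes coded 0, 1, 2 drawn uniformly, then a trait built from a fixed number of additive QTL effects) can be sketched as follows. This is a standalone Python sketch under stated assumptions, not the package's R interface: the function names, the `effect_sd` and `noise_sd` parameters, and the added residual noise term are all hypothetical choices made for illustration.

```python
import random


def simulate_snp_matrix(n_individuals, n_snps, rng):
    """Rows are individuals, columns are SNPs; each genotype is 0, 1 or 2,
    drawn with equal probability (a discrete uniform distribution)."""
    return [[rng.randrange(3) for _ in range(n_snps)]
            for _ in range(n_individuals)]


def simulate_phenotype(genotypes, n_qtl, rng, effect_sd=1.0, noise_sd=1.0):
    """Pure additive model: phenotype = sum over QTL of genotype * effect,
    plus Gaussian noise (the noise term is an assumption of this sketch)."""
    n_snps = len(genotypes[0])
    qtl_columns = rng.sample(range(n_snps), n_qtl)   # which SNPs act as QTL
    effects = {j: rng.gauss(0.0, effect_sd) for j in qtl_columns}
    phenotypes = [
        sum(row[j] * beta for j, beta in effects.items())
        + rng.gauss(0.0, noise_sd)
        for row in genotypes
    ]
    return phenotypes, effects


rng = random.Random(42)                       # fixed seed for reproducibility
genotypes = simulate_snp_matrix(100, 50, rng)
phenotypes, effects = simulate_phenotype(genotypes, n_qtl=5, rng=rng)
```

Returning the effects alongside the phenotypes makes it possible to check whether an association or prediction model recovers the simulated QTL.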

yuanmingzhang

dQTG.seq:A BSA Software for Detecting All Types of QTLs in BC, DH, RIL and F2

The new (dQTG.seq1 and dQTG.seq2) and existing (SmoothLOD, G', deltaSNP and ED) bulked segregant analysis methods are used to identify various types of quantitative trait loci for complex traits via extreme phenotype individuals in bi-parental segregation populations (F2, backcross, doubled haploid and recombinant inbred line). The numbers of marker alleles in extreme low and high pools are used in existing methods to identify trait-related genes, while the numbers of marker alleles and genotypes in extreme low and high pools are used in the new methods to construct a new statistic Gw for identifying trait-related genes. dQTG.seq2 can identify extremely over-dominant and small-effect genes in F2. See Li P, Li G, Zhang YW, Zuo JF, Liu JY, Zhang YM (2022) <doi:10.1016/j.xplc.2022.100319>.

Maintained by Yuan-Ming Zhang. Last updated 2 years ago.

0.8 match 1 stars 1.00 score

cran

dixon:Nearest Neighbour Contingency Table Analysis

Function to test spatial segregation and association based on contingency table analysis of nearest neighbour counts following Dixon (2002) <doi:10.1080/11956860.2002.11682700>. Some 'Fortran' code has been added to the original dixon2002() function of the 'ecespa' package to improve speed.

Maintained by Marcelino de la Cruz Rot. Last updated 1 year ago.

fortran

0.5 match 1.48 score 1 dependent
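
The table of nearest neighbour counts that this kind of analysis starts from can be built with a brute-force search. The Python below is a minimal sketch of that counting step only; it does not reproduce the test statistics of Dixon (2002) or the dixon2002() function, and the function name `nearest_neighbour_table` and the coordinates are hypothetical.

```python
import math
from collections import Counter


def nearest_neighbour_table(points, labels):
    """Count (label of point, label of its nearest neighbour) pairs.

    Brute force, O(n^2); when distances tie, the lowest index wins.
    """
    table = Counter()
    for i, p in enumerate(points):
        nearest = min(
            (k for k in range(len(points)) if k != i),
            key=lambda k: math.dist(p, points[k]),
        )
        table[(labels[i], labels[nearest])] += 1
    return table


# Two well-separated clusters (illustrative coordinates only).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["A", "A", "A", "B", "B", "B"]
table = nearest_neighbour_table(points, labels)
```

Here every point has a neighbour of its own type, so all counts fall on the diagonal of the table, which is the pattern associated with segregation; counts concentrated off the diagonal would point towards association.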

cran

FamEvent:Family Age-at-Onset Data Simulation and Penetrance Estimation

Simulates age-at-onset traits associated with a segregating major gene in family data obtained from population-based, clinic-based, or multi-stage designs. Appropriate ascertainment correction is utilized to estimate age-dependent penetrance functions either parametrically from the fitted model or nonparametrically from the data. The Expectation and Maximization algorithm can infer missing genotypes and carrier probabilities estimated from a family's genotype and phenotype information or from a fitted model. Plot functions include pedigrees of simulated families and predicted penetrance curves based on specified parameter values. For more information see Choi, Y.-H., Briollais, L., He, W. and Kopciuk, K. (2021) FamEvent: An R Package for Generating and Modeling Time-to-Event Data in Family Designs, Journal of Statistical Software 97 (7), 1-30.

Maintained by Yun-Hee Choi. Last updated 9 months ago.

0.5 match 1.30 score