R-universe search: relatedness

r-computing-lab

BGmisc:An R Package for Extended Behavior Genetics Analysis

Provides functions for behavior genetics analysis, including variance component model identification [Hunter et al. (2021) <doi:10.1007/s10519-021-10055-x>], calculation of relatedness coefficients using path-tracing methods [Wright (1922) <doi:10.1086/279872>; McArdle & McDonald (1984) <doi:10.1111/j.2044-8317.1984.tb00802.x>], inference of relatedness, pedigree conversion, and simulation of multi-generational family data [Lyu et al. (2024) <doi:10.1101/2024.12.19.629449>]. For a full overview, see Garrison et al. (2024) <doi:10.21105/joss.06203>.

Maintained by S. Mason Garrison. Last updated 26 days ago.

behavior-genetics

38.3 match 1 stars 6.83 score 35 scripts

fabienlaporte

Relatedness:Maximum Likelihood Estimation of Relatedness using EM Algorithm

Inference of relatedness coefficients from a bi-allelic genotype matrix using a Maximum Likelihood estimation, Laporte, F., Charcosset, A. and Mary-Huard, T. (2017) <doi:10.1111/biom.12634>.

Maintained by Fabien Laporte. Last updated 7 years ago.

64.6 match 2.04 score 11 scripts

bioc

GENESIS:GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness

The GENESIS package provides methodology for estimating, inferring, and accounting for population and pedigree structure in genetic analyses. The current implementation provides functions to perform PC-AiR (Conomos et al., 2015, Gen Epi) and PC-Relate (Conomos et al., 2016, AJHG). PC-AiR performs a Principal Components Analysis on genome-wide SNP data for the detection of population structure in a sample that may contain known or cryptic relatedness. Unlike standard PCA, PC-AiR accounts for relatedness in the sample to provide accurate ancestry inference that is not confounded by family structure. PC-Relate uses ancestry representative principal components to adjust for population structure/ancestry and accurately estimate measures of recent genetic relatedness such as kinship coefficients, IBD sharing probabilities, and inbreeding coefficients. Additionally, functions are provided to perform efficient variance component estimation and mixed model association testing for both quantitative and binary phenotypes.

Maintained by Stephanie M. Gogarten. Last updated 2 months ago.

snp geneticvariability genetics statisticalmethod dimensionreduction principalcomponent genomewideassociation qualitycontrol biocviews

12.2 match 36 stars 10.44 score 342 scripts 1 dependents

magnusdv

ribd:Pedigree-based Relatedness Coefficients

Recursive algorithms for computing various relatedness coefficients, including pairwise kinship, kappa and identity coefficients. Both autosomal and X-linked coefficients are computed. Founders are allowed to be inbred, which enables construction of any given kappa coefficients, as described in Vigeland (2020) <doi:10.1007/s00285-020-01505-x>. In addition to the standard coefficients, 'ribd' also computes a range of lesser-known coefficients, including generalised kinship coefficients, multi-person coefficients and two-locus coefficients (Vigeland, 2023, <doi:10.1093/g3journal/jkac326>). Many features of 'ribd' are available through the online app 'QuickPed' at <https://magnusdv.shinyapps.io/quickped>; see Vigeland (2022) <doi:10.1186/s12859-022-04759-y>.

Maintained by Magnus Dehli Vigeland. Last updated 1 month ago.

inbreeding-coefficient kinship pedigree-analysis relatedness

19.4 match 6 stars 5.95 score 10 scripts 11 dependents
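
To make "recursive algorithms for pairwise kinship" concrete, here is a minimal Python sketch of the textbook kinship recursion. It is not the 'ribd' implementation or its API. The function name `kinship_matrix` and the dictionary-based pedigree format are hypothetical, and the sketch assumes autosomal loci, unrelated non-inbred founders, and a pedigree listed with parents before their children.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pedigree format: id -> (father, mother); founders map to (None, None).
Pedigree = Dict[str, Tuple[Optional[str], Optional[str]]]


def kinship_matrix(pedigree: Pedigree) -> Dict[Tuple[str, str], float]:
    """Return pairwise autosomal kinship coefficients for every pair of members.

    Assumes founders are unrelated and non-inbred, and that every parent
    appears in the dictionary before any of its children.
    """
    ids = list(pedigree)
    phi: Dict[Tuple[str, str], float] = {}
    for pos, a in enumerate(ids):
        father, mother = pedigree[a]
        if (father is None) != (mother is None):
            raise ValueError(f"{a}: give both parents or neither")
        if father is not None and not {father, mother} <= set(ids[:pos]):
            raise ValueError(f"{a}: parents must be listed before their children")
        # Kinship between `a` and each earlier individual `b` (never a descendant of `a`):
        # the average of the kinships between `b` and the two parents of `a`.
        for b in ids[:pos]:
            if father is None:
                value = 0.0  # founders are assumed unrelated to everyone listed earlier
            else:
                value = 0.5 * (phi[(father, b)] + phi[(mother, b)])
            phi[(a, b)] = phi[(b, a)] = value
        # Self-kinship: 1/2 for a non-inbred founder, otherwise (1 + kinship of parents) / 2.
        phi[(a, a)] = 0.5 if father is None else 0.5 * (1.0 + phi[(father, mother)])
    return phi


# Usage: two full siblings and their (inbred) child.
ped = {
    "fa": (None, None),
    "mo": (None, None),
    "s1": ("fa", "mo"),
    "s2": ("fa", "mo"),
    "c": ("s1", "s2"),
}
phi = kinship_matrix(ped)
print(phi[("s1", "s2")])  # 0.25 for full siblings
print(phi[("c", "c")])    # 0.625, i.e. an inbreeding coefficient of 0.25
```

The inbreeding coefficient of an individual equals the kinship of its parents, which is why the self-kinship of the sibling-mating child is (1 + 0.25) / 2.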

paballand

EconGeo:Computing Key Indicators of the Spatial Distribution of Economic Activities

Functions to compute a series of indices commonly used in the fields of economic geography, economic complexity, and evolutionary economics to describe the location, distribution, spatial organization, structure, and complexity of economic activities. Functions include basic spatial indicators such as the location quotient, the Krugman specialization index, the Herfindahl or the Shannon entropy indices but also more advanced functions to compute different forms of normalized relatedness between economic activities or network-based measures of economic complexity. Most of the functions use matrix calculus and are based on bipartite (incidence) matrices consisting of region - industry pairs.

Maintained by Pierre-Alexandre Balland. Last updated 2 years ago.

23.3 match 41 stars 4.96 score 44 scripts
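
As an illustration of one of the "basic spatial indicators" named above, the following Python sketch computes the location quotient from a region-by-industry matrix using its standard definition: a region's share of activity in an industry divided by that industry's share in the whole economy. It does not use 'EconGeo' or reproduce its interface. The function name `location_quotient` and the list-of-rows input are assumptions made for this example.

```python
from typing import List


def location_quotient(mat: List[List[float]]) -> List[List[float]]:
    """Location quotient for each (region, industry) cell.

    Rows are regions and columns are industries (e.g. employment counts).
    LQ[r][i] = (mat[r][i] / row_total[r]) / (col_total[i] / grand_total).
    """
    row_totals = [sum(row) for row in mat]
    col_totals = [sum(col) for col in zip(*mat)]
    grand_total = sum(row_totals)
    if any(t <= 0 for t in row_totals) or any(t <= 0 for t in col_totals):
        raise ValueError("every region and every industry needs a positive total")
    return [
        [
            (value / row_totals[r]) / (col_totals[i] / grand_total)
            for i, value in enumerate(row)
        ]
        for r, row in enumerate(mat)
    ]


# Usage: two regions, two industries.
employment = [
    [10.0, 0.0],   # region A works only in industry 1
    [10.0, 20.0],  # region B is spread over both industries
]
lq = location_quotient(employment)
print(lq[0][0])  # 2.0: region A is twice as specialised in industry 1 as the economy overall
```

Values above 1 mark a region as relatively specialised in that industry, and values below 1 mark it as under-represented.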

bioc

SNPRelate:Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data

Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package SNPRelate to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers the efficient operations specifically designed for integers with two bits, since a SNP could occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphism (indel) and structural variation calls in whole-genome and whole-exome variant data.

Maintained by Xiuwen Zheng. Last updated 5 months ago.

infrastructure genetics statisticalmethod principalcomponent bioinformatics gds-format pca simd snp openblas cpp

6.5 match 104 stars 12.69 score 1.6k scripts 18 dependents

magnusdv

ibdsim2:Simulation of Chromosomal Regions Shared by Family Members

Simulation of segments shared identical-by-descent (IBD) by pedigree members. Using sex specific recombination rates along the human genome (Halldorsson et al. (2019) <doi:10.1126/science.aau1043>), phased chromosomes are simulated for all pedigree members. Applications include calculation of realised relatedness coefficients and IBD segment distributions. 'ibdsim2' is part of the 'pedsuite' collection of packages for pedigree analysis. A detailed presentation of the 'pedsuite', including a separate chapter on 'ibdsim2', is available in the book 'Pedigree analysis in R' (Vigeland, 2021, ISBN:9780128244302). A 'Shiny' app for visualising and comparing IBD distributions is available at <https://magnusdv.shinyapps.io/ibdsim2-shiny/>.

Maintained by Magnus Dehli Vigeland. Last updated 15 days ago.

identity-by-descent relatedness simulation cpp

14.5 match 5 stars 5.00 score 19 scripts 1 dependents

eppicenter

dcifer:Genetic Relatedness Between Polyclonal Infections

An implementation of Dcifer (Distance for complex infections: fast estimation of relatedness), an identity by descent (IBD) based method to calculate genetic relatedness between polyclonal infections from biallelic and multiallelic data. The package includes functions that format and preprocess the data, implement the method, and visualize the results. Gerlovina et al. (2022) <doi:10.1093/genetics/iyac126>.

Maintained by Inna Gerlovina. Last updated 10 months ago.

13.5 match 5 stars 4.57 score 15 scripts

jiscah

sequoia:Pedigree Inference from SNPs

Multi-generational pedigree inference from incomplete data on hundreds of SNPs, including parentage assignment and sibship clustering. See Huisman (2017) (<DOI:10.1111/1755-0998.12665>) for more information.

Maintained by Jisca Huisman. Last updated 9 months ago.

pedigree pedigree-reconstruction pedigrees sequoia snp snp-data fortran

7.0 match 26 stars 7.40 score 79 scripts

magnusdv

verbalisr:Describe Pedigree Relationships in Words

Describe in words the genealogical relationship between two members of a given pedigree, using the algorithm in Vigeland (2022) <doi:10.1186/s12859-022-04759-y>. 'verbalisr' is part of the 'pedsuite' collection of packages for pedigree analysis. For a demonstration of 'verbalisr', see the online app 'QuickPed' at <https://magnusdv.shinyapps.io/quickped>.

Maintained by Magnus Dehli Vigeland. Last updated 3 days ago.

pedigree-analysis relatedness relationship-detection

10.0 match 1 stars 4.92 score 7 scripts 8 dependents

magnusdv

forrel:Forensic Pedigree Analysis and Relatedness Inference

Forensic applications of pedigree analysis, including likelihood ratios for relationship testing, general relatedness inference, marker simulation, and power analysis. 'forrel' is part of the 'pedsuite', a collection of packages for pedigree analysis, further described in the book 'Pedigree Analysis in R' (Vigeland, 2021, ISBN:9780128244302). Several functions deal specifically with power analysis in missing person cases, implementing methods described in Vigeland et al. (2020) <doi:10.1016/j.fsigen.2020.102376>. Data import from the 'Familias' software (Egeland et al. (2000) <doi:10.1016/S0379-0738(00)00147-X>) is supported through the 'pedFamilias' package.

Maintained by Magnus Dehli Vigeland. Last updated 6 days ago.

5.5 match 11 stars 6.98 score 63 scripts 7 dependents

matthewwolak

nadiv:(Non)Additive Genetic Relatedness Matrices

Constructs (non)additive genetic relationship matrices, and their inverses, from a pedigree to be used in linear mixed effect models (A.K.A. the 'animal model'). Also includes other functions to facilitate the use of animal models. Some functions have been created to be used in conjunction with the R package 'asreml' for the 'ASReml' software, which can be obtained upon purchase from 'VSN' international (<https://vsni.co.uk/software/asreml>).

Maintained by Matthew Wolak. Last updated 10 months ago.

cpp

4.7 match 20 stars 7.13 score 151 scripts 3 dependents

marsicofl

mispitools:Missing Person Identification Tools

An open source software package written in R statistical language. It consists of a set of decision-making tools to conduct missing person searches. Particularly, it allows computing optimal LR threshold for declaring potential matches in DNA-based database search. More recently 'mispitools' incorporates preliminary investigation data based LRs. Statistical weight of different traces of evidence such as biological sex, age and hair color are presented. For citing mispitools please use the following references: Marsico and Caridi, 2023 <doi:10.1016/j.fsigen.2023.102891> and Marsico, Vigeland et al. 2021 <doi:10.1016/j.fsigen.2021.102519>.

Maintained by Franco Marsico. Last updated 3 months ago.

4.8 match 35 stars 6.74 score 19 scripts 1 dependents

kenhanscombe

ukbtools:Manipulate and Explore UK Biobank Data

A set of tools to create a UK Biobank <http://www.ukbiobank.ac.uk/> dataset from a UKB fileset (.tab, .r, .html), visualize primary demographic data for a sample subset, query ICD diagnoses, retrieve genetic metadata, read and write standard file formats for genetic analyses.

Maintained by Ken Hanscombe. Last updated 2 years ago.

biobank kcl-sgu uk-biobank ukb

4.0 match 101 stars 6.78 score 118 scripts

mikldk

DNAtools:Tools for Analysing Forensic Genetic DNA Data

Computationally efficient tools for comparing all pairs of profiles in a DNA database. The expectation and covariance of the summary statistic are implemented for fast computing. Routines for estimating proportions of closely related individuals are available. The use of wildcards (also called F-designation) is implemented. Dedicated functions ease plotting the results. See Tvedebrink et al. (2012) <doi:10.1016/j.fsigen.2011.08.001>. Compute the distribution of the numbers of alleles in DNA mixtures. See Tvedebrink (2013) <doi:10.1016/j.fsigss.2013.10.142>.

Maintained by Mikkel Meyer Andersen. Last updated 2 years ago.

cpp

4.1 match 6.00 score 28 scripts

bowenwang7

rres:Realized Relatedness Estimation and Simulation

Functions for studying realized genetic relatedness between people. Users will be able to simulate inheritance patterns given pedigree structures, generate SNP marker data given inheritance patterns, and estimate realized relatedness between pairs of individuals using SNP marker data. See Wang (2017) <doi:10.1534/genetics.116.197004>. This work was supported by National Institutes of Health grants R37 GM-046255.

Maintained by Bowen Wang. Last updated 7 years ago.

cpp

7.9 match 2.95 score 18 scripts

cran

iCAMP:Infer Community Assembly Mechanisms by Phylogenetic-Bin-Based Null Model Analysis

To implement a general framework to quantitatively infer Community Assembly Mechanisms by Phylogenetic-bin-based null model analysis, abbreviated as 'iCAMP' (Ning et al 2020) <doi:10.1038/s41467-020-18560-z>. It can quantitatively assess the relative importance of different community assembly processes, such as selection, dispersal, and drift, for both communities and each phylogenetic group ('bin'). Each bin usually consists of different taxa from a family or an order. The package also provides functions to implement some other published methods, including neutral taxa percentage (Burns et al 2016) <doi:10.1038/ismej.2015.142> based on neutral theory model and quantifying assembly processes based on entire-community null models ('QPEN', Stegen et al 2013) <doi:10.1038/ismej.2013.93>. It also includes some handy functions, particularly for big datasets, such as phylogenetic and taxonomic null model analysis at both community and bin levels, between-taxa niche difference and phylogenetic distance calculation, phylogenetic signal test within phylogenetic groups, midpoint root of big trees, etc. Version 1.3.x mainly improved the function for 'QPEN' and added function 'icamp.cate()' to summarize 'iCAMP' results for different categories of taxa (e.g. core versus rare taxa).

Maintained by Daliang Ning. Last updated 3 years ago.

10.1 match 2 stars 2.26 score 1 dependents

eppicenter

moire:Multiplicity of Infection and Allele Frequency Recovery from Noisy Polyallelic Genetics Data

A Markov Chain Monte Carlo (MCMC) based approach to Bayesian estimation of individual level multiplicity of infection, within host relatedness, and population allele frequencies from polyallelic genetic data.

Maintained by Maxwell Murphy. Last updated 4 months ago.

genomics malaria mcmc cpp openmp

4.3 match 7 stars 5.14 score 22 scripts

piyalkarum

rCNV:Detect Copy Number Variants from SNPs Data

Functions in this package will import filtered variant call format (VCF) files of SNP data and generate data sets to detect copy number variants, visualize them and do downstream analyses with copy number variants (e.g. environmental association analyses).

Maintained by Piyal Karunarathne. Last updated 14 days ago.

cnv-analysis copy-number-variation gene-duplication genetics genomics landscape-genetics snps cpp

5.0 match 6 stars 4.26 score 4 scripts

highlanderlab

SIMplyBee:'AlphaSimR' Extension for Simulating Honeybee Populations and Breeding Programmes

An extension of the 'AlphaSimR' package (<https://cran.r-project.org/package=AlphaSimR>) for stochastic simulations of honeybee populations and breeding programmes. 'SIMplyBee' enables simulation of individual bees that form a colony, which includes a queen, fathers (drones the queen mated with), virgin queens, workers, and drones. Multiple colonies can be merged into a population of colonies, such as an apiary or a whole country of colonies. Functions enable operations on castes, colony, or colonies, to ease 'R' scripting of whole populations. All 'AlphaSimR' functionality with respect to genomes and genetic and phenotype values is available and further extended for honeybees, including haplo-diploidy, complementary sex determiner locus, colony events (swarming, supersedure, etc.), and colony phenotype values.

Maintained by Jana Obšteter. Last updated 6 months ago.

cpp openmp

3.3 match 2 stars 6.24 score 18 scripts

bioc

GWASTools:Tools for Genome Wide Association Studies

Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.

Maintained by Stephanie M. Gogarten. Last updated 5 months ago.

snp geneticvariability qualitycontrol microarray

1.9 match 17 stars 10.50 score 396 scripts 5 dependents

magnusdv

paramlink:Parametric Linkage and Other Pedigree Analysis in R

NOTE: 'PARAMLINK' HAS BEEN SUPERSEDED BY THE 'PED SUITE' PACKAGES (<https://magnusdv.github.io/pedsuite/>). 'PARAMLINK' IS MAINTAINED ONLY FOR LEGACY PURPOSES AND SHOULD NOT BE USED IN NEW PROJECTS. A suite of tools for analysing pedigrees with marker data, including parametric linkage analysis, forensic computations, relatedness analysis and marker simulations. The core of the package is an implementation of the Elston-Stewart algorithm for pedigree likelihoods, extended to allow mutations as well as complex inbreeding. Features for linkage analysis include singlepoint LOD scores, power analysis, and multipoint analysis (the latter through a wrapper to the 'MERLIN' software). Forensic applications include exclusion probabilities, genotype distributions and conditional simulations. Data from the 'Familias' software can be imported and analysed in 'paramlink'. Finally, 'paramlink' offers many utility functions for creating, manipulating and plotting pedigrees with or without marker data (the actual plotting is done by the 'kinship2' package).

Maintained by Magnus Dehli Vigeland. Last updated 3 years ago.

5.0 match 3.87 score 49 scripts 1 dependents

bioc

SeqArray:Data Management of Large-Scale Whole-Genome Sequence Variant Calls

Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.

Maintained by Xiuwen Zheng. Last updated 11 days ago.

infrastructure datarepresentation sequencing genetics bioinformatics gds-format snp snv wes wgs cpp

1.5 match 45 stars 12.08 score 1.1k scripts 9 dependents

jonotuke

BREADR:Estimates Degrees of Relatedness (Up to the Second Degree) for Extreme Low-Coverage Data

The goal of the package is to provide an easy-to-use method for estimating degrees of relatedness (up to the second degree) for extreme low-coverage data. The package also allows users to quantify and visualise the level of confidence in the estimated degrees of relatedness.

Maintained by Jono Tuke. Last updated 25 days ago.

3.6 match 8 stars 4.45 score 4 scripts

jarrodhadfield

MCMCglmm:MCMC Generalised Linear Mixed Models

Fits Multivariate Generalised Linear Mixed Models (and related models) using Markov chain Monte Carlo techniques (Hadfield 2010 J. Stat. Soft.).

Maintained by Jarrod Hadfield. Last updated 3 months ago.

cpp

1.8 match 2 stars 8.83 score 1.2k scripts 13 dependents

bioc

LymphoSeq:Analyze high-throughput sequencing of T and B cell receptors

This R package analyzes high-throughput sequencing of T and B cell receptor complementarity determining region 3 (CDR3) sequences generated by Adaptive Biotechnologies' ImmunoSEQ assay. Its input comes from tab-separated value (.tsv) files exported from the ImmunoSEQ analyzer.

Maintained by David Coffey. Last updated 5 months ago.

software technology sequencing targetedresequencing alignment multiplesequencealignment

3.6 match 4.00 score 4 scripts

wtkr

heritability:Marker-Based Estimation of Heritability Using Individual Plant or Plot Data

Implements marker-based estimation of heritability when observations on genetically identical replicates are available. These can be either observations on individual plants or plot-level data in a field trial. Heritability can then be estimated using a mixed model for the individual plant or plot data. For comparison, also mixed-model based estimation using genotypic means and estimation of repeatability with ANOVA are implemented. For illustration the package contains several datasets for the model species Arabidopsis thaliana.

Maintained by Willem Kruijer. Last updated 2 years ago.

6.8 match 2 stars 1.90 score 40 scripts

pbreheny

plmmr:Penalized Linear Mixed Models for Correlated Data

Fits penalized linear mixed models that correct for unobserved confounding factors. 'plmmr' infers and corrects for the presence of unobserved confounding effects such as population stratification and environmental heterogeneity. It then fits a linear model via penalized maximum likelihood. Originally designed for the multivariate analysis of single nucleotide polymorphisms (SNPs) measured in a genome-wide association study (GWAS), 'plmmr' eliminates the need for subpopulation-specific analyses and post-analysis p-value adjustments. Functions for the appropriate processing of 'PLINK' files are also supplied. For examples, see the package homepage. <https://pbreheny.github.io/plmmr/>.

Maintained by Patrick J. Breheny. Last updated 12 days ago.

cpp openmp

2.0 match 4 stars 6.31 score 10 scripts

bioc

SeqSQC:A bioconductor package for sample quality check with next generation sequencing data

The SeqSQC is designed to identify problematic samples in NGS data, including samples with gender mismatch, contamination, cryptic relatedness, and population outlier.

Maintained by Qian Liu. Last updated 5 months ago.

experiment data homo_sapiens_data sequencing data project1000genomes genome

2.3 match 5.38 score 2 scripts

bioc

minet:Mutual Information NETworks

This package implements various algorithms for inferring mutual information networks from data.

Maintained by Patrick E. Meyer. Last updated 5 months ago.

microarray graphandnetwork network networkinference cpp

1.9 match 6.15 score 114 scripts 16 dependents

evolandeco

evesim:Evolution Emulator: Species Diversification under an Evolutionary Relatedness Dependent Scenario

Evolutionary relatedness dependent diversification simulation powered by the 'Rcpp' back-end 'SimTable'.

Maintained by Tianjian Qin. Last updated 3 days ago.

cpp

3.4 match 3.40 score 5 scripts

sales-lab

parmigene:Parallel Mutual Information Estimation for Gene Network Reconstruction

Parallel estimation of the mutual information based on entropy estimates from k-nearest neighbors distances and algorithms for the reconstruction of gene regulatory networks (Sales et al, 2011 <doi:10.1093/bioinformatics/btr274>).

Maintained by Gabriele Sales. Last updated 5 months ago.

openmp

1.9 match 5 stars 6.06 score 38 scripts 4 dependents

bioc

GeneticsPed:Pedigree and genetic relationship functions

Classes and methods for handling pedigree data. It also includes functions to calculate genetic relationship measures, such as relationship and inbreeding coefficients, and other utilities. Note that the package is not yet stable. Use it with care!

Maintained by David Henderson. Last updated 5 months ago.

genetics fortran cpp

2.9 match 3.86 score 12 scripts

aursiber

LDcorSV:Linkage Disequilibrium Corrected by the Structure and the Relatedness

Four measures of linkage disequilibrium are provided: the usual r^2 measure, the r^2_S measure (r^2 corrected by the structure sample), the r^2_V (r^2 corrected by the relatedness of genotyped individuals), the r^2_VS measure (r^2 corrected by both the relatedness of genotyped individuals and the structure of the sample).

Maintained by Aurélie Siberchicot. Last updated 5 years ago.

3.8 match 2.81 score 13 scripts

rdinnager

slimr:Create, Run and Post-Process 'SLiM' Population Genetics Forward Simulations

Lets you write 'SLiM' scripts (population genomics simulation) using your favourite R IDE, using a syntax as close as possible to the original 'SLiM' language. It offers many tools to manipulate those scripts, run them in the 'SLiM' software from R, and capture and post-process their output, after or even during a simulation.

Maintained by Russell Dinnage. Last updated 4 months ago.

2.0 match 8 stars 4.70 score 42 scripts

jangraffelman

Jacquard:Estimation of Jacquard's Genetic Identity Coefficients

Contains procedures to estimate the nine condensed Jacquard genetic identity coefficients (Jacquard, 1974) <doi:10.1007/978-3-642-88415-3> by constrained least squares (Graffelman et al., 2024) <doi:10.1101/2024.03.25.586682> and by the method of moments (Csuros, 2014) <doi:10.1016/j.tpb.2013.11.001>. These procedures require previous estimation of the allele frequencies. Functions are supplied that estimate relationship parameters that derive from the Jacquard coefficients, such as individual inbreeding coefficients and kinship coefficients.

Maintained by Jan Graffelman. Last updated 6 months ago.

4.6 match 2.00 score

fboehm

gemma2:GEMMA Multivariate Linear Mixed Model

Fits a multivariate linear mixed effects model that uses a polygenic term, after Zhou & Stephens (2014) (<https://www.nature.com/articles/nmeth.2848>). Of particular interest is the estimation of variance components with restricted maximum likelihood (REML) methods. Genome-wide efficient mixed-model association (GEMMA), as implemented in the package 'gemma2', uses an expectation-maximization algorithm for variance components inference for use in quantitative trait locus studies.

Maintained by Frederick Boehm. Last updated 4 years ago.

em-algorithm genetics mixed-models

1.7 match 13 stars 5.29 score 10 scripts 1 dependents

green-striped-gecko

dartR.captive:Analysing 'SNP' Data to Support Captive Breeding

Functions are provided that facilitate the analysis of SNP (single nucleotide polymorphism) data to answer questions regarding captive breeding and relatedness between individuals. 'dartR.captive' is part of the 'dartRverse' suite of packages. Gruber et al. (2018) <doi:10.1111/1755-0998.12745>. Mijangos et al. (2022) <doi:10.1111/2041-210X.13918>.

Maintained by Bernd Gruber. Last updated 28 days ago.

4.5 match 1 stars 2.00 score 3 scripts

emilmip

LTFHPlus:Implementation of LT-FH++

Implementation of LT-FH++, an extension of the liability threshold family history (LT-FH) model. LT-FH++ uses a Gibbs sampler for sampling from the truncated multivariate normal distribution and allows for flexible family structures. LT-FH++ was first described in Pedersen, Emil M., et al. (2022) <https://pure.au.dk/ws/portalfiles/portal/353346245/> as an extension to LT-FH with more flexible family structures, and again as the age-dependent liability threshold (ADuLT) model Pedersen, Emil M., et al. (2023) <https://www.nature.com/articles/s41467-023-41210-z> as an alternative to traditional time-to-event genome-wide association studies, where family history was not considered.

Maintained by Emil Michael Pedersen. Last updated 9 months ago.

cpp

1.9 match 10 stars 4.66 score 23 scripts

wviechtb

metafor:Meta-Analysis Package for R

A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.

Maintained by Wolfgang Viechtbauer. Last updated 3 days ago.

meta-analysis mixed-effects multilevel-models multivariate

0.5 match 246 stars 16.30 score 4.9k scripts 92 dependents

bioc

MetNet:Inferring metabolic networks from untargeted high-resolution mass spectrometry data

MetNet contains functionality to infer metabolic network topologies from quantitative data and high-resolution mass/charge information. Using statistical models (including correlation, mutual information, regression and Bayes statistics) and quantitative data (intensity values of features) adjacency matrices are inferred that can be combined to a consensus matrix. Mass differences calculated between mass/charge values of features will be matched against a data frame of supplied mass/charge differences referring to transformations of enzymatic activities. In a third step, the two levels of information are combined to form an adjacency matrix inferred from both quantitative and structure information.

Maintained by Thomas Naake. Last updated 5 months ago.

immunooncology metabolomics massspectrometry network regression

1.7 match 4.70 score 1 scripts

evolecolgroup

tidypopgen:Tidy Population Genetics

We provide a tidy grammar of population genetics, facilitating the manipulation and analysis of data on biallelic single nucleotide polymorphisms (SNPs).

Maintained by Andrea Manica. Last updated 5 days ago.

openblas zlib cpp openmp

1.3 match 4 stars 5.83 score 8 scripts

juliengamartin

pedtricks:Visualize, Summarize and Simulate Data from Pedigrees

Sensitivity and power analysis, for calculating statistics describing pedigrees from wild populations, and for visualizing pedigrees. This is a reboot of the methods developed by Morrissey and Wilson (2010) <doi:10.1111/j.1755-0998.2009.02817.x>.

Maintained by Julien Martin. Last updated 6 months ago.

1.8 match 2 stars 4.08 score 1 scripts

cran

cluster.datasets:Cluster Analysis Data Sets

A collection of data sets for teaching cluster analysis.

Maintained by Frederick Novomestky. Last updated 11 years ago.

3.5 match 2.00 score

mhunter1

EasyMx:Easy Model-Builder Functions for 'OpenMx'

Utilities for building certain kinds of common matrices and models in the extended structural equation modeling package, 'OpenMx'.

Maintained by Michael D. Hunter. Last updated 2 years ago.

2.0 match 2.32 score 21 scripts

tpook92

MoBPS:Modular Breeding Program Simulator

Framework for the simulation of complex breeding programs and the comparison of their economic and genetic impact. The package is also used as the background simulator for our web-based interface <http:www.mobps.de>. Associated publication: Pook et al. (2020) <doi:10.1534/g3.120.401193>.

Maintained by Torsten Pook. Last updated 3 years ago.

1.9 match 2.35 score 45 scripts

gbradburd

conStruct:Models Spatially Continuous and Discrete Population Genetic Structure

A method for modeling genetic data as a combination of discrete layers, within each of which relatedness may decay continuously with geographic distance. This package contains code for running analyses (which are implemented in the modeling language 'rstan') and visualizing and interpreting output. See the paper for more details on the model and its utility.

Maintained by Gideon Bradburd. Last updated 1 year ago.

cpp

0.5 match 35 stars 8.39 score 70 scripts

hanchenphd

GMMAT:Generalized Linear Mixed Model Association Tests

Perform association tests using generalized linear mixed models (GLMMs) in genome-wide association studies (GWAS) and sequencing association studies. First, GMMAT fits a GLMM with covariate adjustment and random effects to account for population structure and familial or cryptic relatedness. For GWAS, GMMAT performs score tests for each genetic variant as proposed in Chen et al. (2016) <DOI:10.1016/j.ajhg.2016.02.012>. For candidate gene studies, GMMAT can also perform Wald tests to get the effect size estimate for each genetic variant. For rare variant analysis from sequencing association studies, GMMAT performs the variant Set Mixed Model Association Tests (SMMAT) as proposed in Chen et al. (2019) <DOI:10.1016/j.ajhg.2018.12.012>, including the burden test, the sequence kernel association test (SKAT), SKAT-O and an efficient hybrid test of the burden test and SKAT, based on user-defined variant sets.

Maintained by Han Chen. Last updated 1 year ago.

openblas zlib bzip2 libzstd libdeflate cpp

0.5 match 38 stars 8.34 score 96 scripts 2 dependents

ochoalab

simfam:Simulate and Model Family Pedigrees with Structured Founders

The focus is on simulating and modeling families with founders drawn from a structured population (for example, with different ancestries or other potentially non-family relatedness), in contrast to traditional pedigree analysis that treats all founders as equally unrelated. Main function simulates a random pedigree for many generations, avoiding close relatives, pairing closest individuals according to a 1D geography and their randomly-drawn sex, and with variable children sizes to result in a target population size per generation. Auxiliary functions calculate kinship matrices, admixture matrices, and draw random genotypes across arbitrary pedigree structures starting from the corresponding founder values. The code is built around the plink FAM table format for pedigrees. There are functions that simulate independent loci and also functions that use an explicit recombination model to simulate linkage disequilibrium (LD) in the pedigree, as well as population analogs resembling the Li-Stephens model. Described in Yao and Ochoa (2023) <doi:10.7554/eLife.79238>.

Maintained by Alejandro Ochoa. Last updated 2 months ago.

cpp

0.5 match 3 stars 5.12 score 11 scripts

kmkuesters

pooledpeaks:Genetic Analysis of Pooled Samples

Analyzing genetic data obtained from pooled samples. This package can read in Fragment Analysis output files, process the data, and score peaks, as well as facilitate various analyses, including cluster analysis, calculation of genetic distances and diversity indices, as well as bootstrap resampling for statistical inference. Specifically tailored to handle genetic data efficiently, researchers can explore population structure, genetic differentiation, and genetic relatedness among samples. We updated some functions from Covarrubias-Pazaran et al. (2016) <doi:10.1186/s12863-016-0365-6> to allow for the use of new file formats and referenced the following to write our genetic analysis functions: Long et al. (2022) <doi:10.1038/s41598-022-04776-0>, Jost (2008) <doi:10.1111/j.1365-294x.2008.03887.x>, Nei (1973) <doi:10.1073/pnas.70.12.3321>, Foulley et al. (2006) <doi:10.1016/j.livprodsci.2005.10.021>, Chao et al. (2008) <doi:10.1111/j.1541-0420.2008.01010.x>.

Maintained by Kathleen Kuesters. Last updated 3 days ago.

0.5 match 1 stars 4.85 score 3 scripts

bioc

goSTAG:A tool to use GO Subtrees to Tag and Annotate Genes within a set

Gene lists derived from the results of genomic analyses are rich in biological information. For instance, differentially expressed genes (DEGs) from a microarray or RNA-Seq analysis are related functionally in terms of their response to a treatment or condition. Gene lists can vary in size, up to several thousand genes, depending on the robustness of the perturbations or how widely different the conditions are biologically. Systematically associating biological relatedness among hundreds or thousands of genes is impractical when the annotation and function of each gene must be curated manually. Over-representation analysis (ORA) of genes was developed to identify biological themes. Given a Gene Ontology (GO) and an annotation of genes that indicates the categories each one fits into, the significance of the over-representation of the genes within the ontological categories is determined by a Fisher's exact test or by modeling according to a hypergeometric distribution. Comparing a small number of enriched biological categories for a few samples is manageable using Venn diagrams or other means of assessing overlaps. However, with hundreds of enriched categories and many samples, the comparisons are laborious. Furthermore, if there are enriched categories that are shared between samples, trying to represent a common theme across them is highly subjective. goSTAG uses GO subtrees to tag and annotate genes within a set. goSTAG visualizes the similarities between the over-representation of DEGs by clustering the p-values from the enrichment statistical tests and labels clusters with the GO term that has the most paths to the root within the subtree generated from all the GO terms in the cluster.

Maintained by Brian D. Bennett. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment clustering microarray mrnamicroarray rnaseq visualization go immunooncology

0.5 match 4.30 score 1 scripts

magnusdv

pedbuildr:Pedigree Reconstruction

Reconstruct pedigrees from genotype data, by optimising the likelihood over all possible pedigrees subject to given restrictions. Tailor-made plots facilitate evaluation of the output. This package is part of the 'pedsuite' ecosystem for pedigree analysis. In particular, it imports 'pedprobr' for calculating pedigree likelihoods and 'forrel' for estimating pairwise relatedness.

Maintained by Magnus Dehli Vigeland. Last updated 2 months ago.

0.5 match 2 stars 3.78 score 7 scripts 1 dependents

wenlongren

ScoreEB:Score Test Integrated with Empirical Bayes for Association Study

Performs association tests within a linear mixed model framework using a score test integrated with empirical Bayes for genome-wide association studies. First, a score test is conducted for each single nucleotide polymorphism (SNP) under the linear mixed model framework, taking into account genetic relatedness and population structure. Then all potentially associated SNPs are selected with a less stringent criterion. Finally, the selected SNPs are analyzed with empirical Bayes in a multi-locus model to identify the true quantitative trait nucleotides (QTNs).

Maintained by Wenlong Ren. Last updated 3 years ago.

0.5 match 2 stars 3.00 score 1 scripts

cran

PQLseq:Efficient Mixed Model Analysis of Count Data in Large-Scale Genomic Sequencing Studies

An efficient tool designed for differential analysis of large-scale RNA sequencing (RNAseq) data and Bisulfite sequencing (BSseq) data in the presence of individual relatedness and population structure. 'PQLseq' first fits a Generalized Linear Mixed Model (GLMM) with adjusted covariates, predictor of interest and random effects to account for population structure and individual relatedness, and then performs Wald tests for each gene in RNAseq or site in BSseq.

Maintained by Jiaqiang Zhu. Last updated 4 years ago.

openblas cpp openmp

0.8 match 1.60 score

kenstoyama

JNplots:Visualize Outputs from the 'Johnson-Neyman' Technique

Aids in the calculation and visualization of regions of non-significance using the 'Johnson-Neyman' technique and its extensions as described by Bauer and Curran (2005) <doi:10.1207/s15327906mbr4003_5> to assess the influence of categorical and continuous moderators. Allows correcting for phylogenetic relatedness.

Maintained by Ken Toyama. Last updated 1 year ago.

0.5 match 2.00 score 4 scripts

cran

QTLRel:Tools for Mapping of Quantitative Traits of Genetically Related Individuals and Calculating Identity Coefficients from Pedigrees

This software provides tools for quantitative trait mapping in populations such as advanced intercross lines where relatedness among individuals should not be ignored. It can estimate background genetic variance components, impute missing genotypes, simulate genotypes, perform a genome scan for putative quantitative trait loci (QTL), and plot mapping results. It also has functions to calculate identity coefficients from pedigrees, especially suitable for pedigrees that consist of a large number of generations, or estimate identity coefficients from genotypic data in certain circumstances.

Maintained by Riyan Cheng. Last updated 2 years ago.

fortran openblas

0.5 match 2.00 score

xiaoran831213

plinkFile:'PLINK' (and 'GCTA') File Helpers

Reads/writes binary genotype files compatible with 'PLINK' <https://www.cog-genomics.org/plink/1.9/input#bed> into/from an R matrix; traverses genotype data one window of variants at a time, like apply() or a for loop; reads/writes genotype relatedness/kinship matrices created by 'PLINK' <https://www.cog-genomics.org/plink/1.9/distance#make_rel> or 'GCTA' <https://cnsgenomics.com/software/gcta/#MakingaGRM> into/from an R square matrix. It is best used for bringing data produced by 'PLINK' and 'GCTA' into an R workflow.

Maintained by Xiaoran Tong. Last updated 3 years ago.

0.5 match 2.00 score 2 scripts