Showing 44 of 44 results
bioc
SNPRelate:Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data
Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package SNPRelate to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers efficient operations designed specifically for two-bit integers, since a SNP can be stored in only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphisms (indels) and structural variation calls in whole-genome and whole-exome variant data.
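As a rough illustration of why two bits per SNP are enough: a biallelic genotype takes one of four values (0, 1 or 2 copies of an allele, or missing), so four genotypes fit in one byte. The sketch below is plain Python and does not reproduce the GDS on-disk layout; the function names and the use of 3 as the missing code are assumptions made for this example.

```python
MISSING = 3  # assumed code for a missing genotype (not taken from the GDS spec)


def pack_genotypes(genotypes):
    """Pack genotype codes (0, 1, 2 or MISSING) four to a byte, low bits first."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if not 0 <= g <= 3:
            raise ValueError(f"genotype code out of range: {g}")
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from the packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


genotypes = [0, 1, 2, MISSING, 2, 2]
packed = pack_genotypes(genotypes)  # 6 genotypes stored in 2 bytes
```

Unpacking needs the genotype count `n` because the last byte may be only partly filled.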
Maintained by Xiuwen Zheng. Last updated 5 months ago.
infrastructure, genetics, statisticalmethod, principalcomponent, bioinformatics, gds-format, pca, simd, snp, openblas, cpp
14.4 match 104 stars 12.69 score 1.6k scripts 18 dependents
doer0
ibd:Incomplete Block Designs
A collection of several utility functions related to binary incomplete block designs. Contains functions to generate A- and D-efficient binary incomplete block designs with a given number of treatments, number of blocks and block size. Contains a function to generate an incomplete block design with a specified concurrence matrix. There are functions to generate balanced treatment incomplete block designs and incomplete block designs for test versus control treatment comparisons with a specified concurrence matrix. Allows performing analysis of variance of data and computing estimated marginal means of factors from experiments using a connected incomplete block design. Hypothesis tests of treatment contrasts in an incomplete block design setup are supported.
Maintained by B N Mandal. Last updated 1 year ago.
60.6 match 2.95 score 33 scripts 3 dependents
biometris
statgenIBD:Calculation of IBD Probabilities
For biparental, three-way and four-way crosses, Identity by Descent (IBD) probabilities can be calculated using Hidden Markov Models and inheritance vectors following Lander and Green (<https://www.jstor.org/stable/29713>) and Huang (<doi:10.1073/pnas.1100465108>). One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris.
Maintained by Bart-Jan van Rossum. Last updated 1 month ago.
33.8 match 5.26 score 8 scripts 1 dependents
bioc
hapFabia:hapFabia: Identification of very short segments of identity by descent (IBD) characterized by rare variants in large sequencing data
A package to identify very short IBD segments in large sequencing data by FABIA biclustering. Two haplotypes are identical by descent (IBD) if they share a segment that both inherited from a common ancestor. Current IBD methods reliably detect long IBD segments because many minor alleles in the segment are concordant between the two haplotypes. However, many cohort studies contain unrelated individuals who share only short IBD segments. This package provides software to identify short IBD segments in sequencing data. Knowledge of short IBD segments is relevant for phasing of genotyping data, association studies, and for population genetics, where they shed light on the evolutionary history of humans. The package supports VCF formats, is based on sparse matrix operations, and provides visualization of haplotype clusters in different formats.
Maintained by Andreas Mitterecker. Last updated 5 months ago.
genetics, geneticvariability, snp, sequencing, visualization, clustering, sequencematching, software
33.3 match 3.30 score 9 scripts
statdivlab
corncob:Count Regression for Correlated Observations with the Beta-Binomial
Statistical modeling for correlated count data using the beta-binomial distribution, described in Martin et al. (2020) <doi:10.1214/19-AOAS1283>. It allows for both mean and overdispersion covariates.
Maintained by Amy D Willis. Last updated 9 hours ago.
10.7 match 106 stars 9.82 score 248 scripts 1 dependents
magnusdv
ribd:Pedigree-based Relatedness Coefficients
Recursive algorithms for computing various relatedness coefficients, including pairwise kinship, kappa and identity coefficients. Both autosomal and X-linked coefficients are computed. Founders are allowed to be inbred, which enables construction of any given kappa coefficients, as described in Vigeland (2020) <doi:10.1007/s00285-020-01505-x>. In addition to the standard coefficients, 'ribd' also computes a range of lesser-known coefficients, including generalised kinship coefficients, multi-person coefficients and two-locus coefficients (Vigeland, 2023, <doi:10.1093/g3journal/jkac326>). Many features of 'ribd' are available through the online app 'QuickPed' at <https://magnusdv.shinyapps.io/quickped>; see Vigeland (2022) <doi:10.1186/s12859-022-04759-y>.
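To give a feel for what a recursive kinship computation looks like, here is a minimal Python sketch of the textbook pairwise kinship recursion. It is not the 'ribd' implementation: the pedigree, the names `PEDIGREE`, `ORDER` and `kinship`, and the assumption that founders are unrelated and non-inbred are all made up for this example (the package itself allows inbred founders).

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (father, mother); founders have (None, None).
# Individuals are listed so that parents appear before their children.
PEDIGREE = {
    "fa": (None, None),
    "mo": (None, None),
    "sib1": ("fa", "mo"),
    "sib2": ("fa", "mo"),
    "child": ("sib1", "sib2"),  # offspring of a full-sib mating
}
ORDER = {name: k for k, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient of a and b, assuming unrelated, non-inbred founders."""
    if a is None or b is None:
        return 0.0  # the unknown parents of founders contribute no shared ancestry
    if a == b:
        father, mother = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(father, mother))
    # Expand the individual listed later, so we never expand an ancestor of the other.
    if ORDER[a] < ORDER[b]:
        a, b = b, a
    father, mother = PEDIGREE[a]
    return 0.5 * (kinship(father, b) + kinship(mother, b))
```

Under these assumptions, `kinship("sib1", "sib2")` evaluates to 0.25 for the full sibs, and the inbreeding coefficient of `child` is the kinship of its parents, so `2 * kinship("child", "child") - 1` also gives 0.25.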
Maintained by Magnus Dehli Vigeland. Last updated 1 month ago.
inbreeding-coefficient, kinship, pedigree-analysis, relatedness
13.6 match 6 stars 5.95 score 10 scripts 11 dependents
bowenwang7
rres:Realized Relatedness Estimation and Simulation
Functions for studying realized genetic relatedness between people. Users will be able to simulate inheritance patterns given pedigree structures, generate SNP marker data given inheritance patterns, and estimate realized relatedness between pairs of individuals using SNP marker data. See Wang (2017) <doi:10.1534/genetics.116.197004>. This work was supported by National Institutes of Health grants R37 GM-046255.
Maintained by Bowen Wang. Last updated 7 years ago.
22.9 match 2.95 score 18 scripts
magnusdv
forrel:Forensic Pedigree Analysis and Relatedness Inference
Forensic applications of pedigree analysis, including likelihood ratios for relationship testing, general relatedness inference, marker simulation, and power analysis. 'forrel' is part of the 'pedsuite', a collection of packages for pedigree analysis, further described in the book 'Pedigree Analysis in R' (Vigeland, 2021, ISBN:9780128244302). Several functions deal specifically with power analysis in missing person cases, implementing methods described in Vigeland et al. (2020) <doi:10.1016/j.fsigen.2020.102376>. Data import from the 'Familias' software (Egeland et al. (2000) <doi:10.1016/S0379-0738(00)00147-X>) is supported through the 'pedFamilias' package.
Maintained by Magnus Dehli Vigeland. Last updated 6 days ago.
9.2 match 11 stars 6.98 score 63 scripts 7 dependents
magnusdv
ibdsim2:Simulation of Chromosomal Regions Shared by Family Members
Simulation of segments shared identical-by-descent (IBD) by pedigree members. Using sex specific recombination rates along the human genome (Halldorsson et al. (2019) <doi:10.1126/science.aau1043>), phased chromosomes are simulated for all pedigree members. Applications include calculation of realised relatedness coefficients and IBD segment distributions. 'ibdsim2' is part of the 'pedsuite' collection of packages for pedigree analysis. A detailed presentation of the 'pedsuite', including a separate chapter on 'ibdsim2', is available in the book 'Pedigree analysis in R' (Vigeland, 2021, ISBN:9780128244302). A 'Shiny' app for visualising and comparing IBD distributions is available at <https://magnusdv.shinyapps.io/ibdsim2-shiny/>.
Maintained by Magnus Dehli Vigeland. Last updated 15 days ago.
identity-by-descent, relatedness, simulation, cpp
10.6 match 5 stars 5.00 score 19 scripts 1 dependents
dahhamalsoud
phdcocktail:Enhance the Ease of R Experience as an Emerging Researcher
A toolkit of functions to help: i) effortlessly transform collected data into a publication ready format, ii) generate insightful visualizations from clinical data, iii) report summary statistics in a publication-ready format, iv) efficiently export, save and reload R objects within the framework of R projects.
Maintained by Dahham Alsoud. Last updated 1 year ago.
12.4 match 3.70 score 1 scripts
pbourkey
polyqtlR:QTL Analysis in Autopolyploid Bi-Parental F1 Populations
Quantitative trait loci (QTL) analysis and exploration of meiotic patterns in autopolyploid bi-parental F1 populations. For all ploidy levels, identity-by-descent (IBD) probabilities can be estimated. Functions for computing significance thresholds, exploring QTL allele effects and visualising results are provided. For more background and to reference the package see <doi:10.1093/bioinformatics/btab574>.
Maintained by Peter Bourke. Last updated 1 year ago.
17.3 match 2.30 score 2 scripts
biometris
statgenMPP:QTL Mapping for Multi Parent Populations
For Multi Parent Populations (MPP), Identity By Descent (IBD) probabilities are computed using Hidden Markov Models. These probabilities are then used in a mixed model approach for QTL mapping as described in Li et al. (<doi:10.1007/s00122-021-03919-7>).
Maintained by Bart-Jan van Rossum. Last updated 1 month ago.
6.1 match 4.94 score 11 scripts
vincentgarin
mppR:Multi-Parent Population QTL Analysis
Analysis of experimental multi-parent populations to detect regions of the genome (called quantitative trait loci, QTLs) influencing phenotypic traits measured in unique and multiple environments. The population must be composed of crosses between a set of at least three parents (e.g. factorial design, 'diallel', or nested association mapping). The functions cover data processing, QTL detection, and results visualization. The implemented methodology is described in Garin, Wimmer, Mezmouk, Malosetti and van Eeuwijk (2017) <doi:10.1007/s00122-017-2923-3>, in Garin, Malosetti and van Eeuwijk (2020) <doi:10.1007/s00122-020-03621-0>, and in Garin, Diallo, Tekete, Thera, ..., and Rami (2024) <doi:10.1093/genetics/iyae003>.
Maintained by Vincent Garin. Last updated 1 year ago.
5.6 match 2 stars 5.35 score 28 scripts
signaturescience
skater:Utilities for SNP-Based Kinship Analysis
Utilities for single nucleotide polymorphism (SNP) based kinship analysis testing and evaluation. The 'skater' package contains functions for importing, parsing, and analyzing pedigree data, performing relationship degree inference, benchmarking relationship degree classification, and summarizing identity by descent (IBD) segment data. Package functions and methods are described in Turner et al. (2021) "skater: An R package for SNP-based Kinship Analysis, Testing, and Evaluation" <doi:10.1101/2021.07.21.453083>.
Maintained by Stephen Turner. Last updated 2 years ago.
5.5 match 9 stars 5.26 score 7 scripts
soroushmdg
gwid:Genome-Wide Identity-by-Descent
Methods and tools for the analysis of Genome Wide Identity-by-Descent ('gwid') mapping data, focusing on testing whether there is a higher occurrence of Identity-By-Descent (IBD) segments around potential causal variants in cases compared to controls, which is crucial for identifying rare variants. To enhance its analytical power, 'gwid' incorporates a Sliding Window Approach, allowing for the detection and analysis of signals from multiple Single Nucleotide Polymorphisms (SNPs).
Maintained by Soroush Mahmoudiandehkordi. Last updated 6 months ago.
7.6 match 1 stars 3.60 score 4 scripts
meyer-lab-cshl
plinkQC:Genotype Quality Control with 'PLINK'
Genotyping arrays enable the direct measurement of an individual's genotype at thousands of markers. 'plinkQC' facilitates genotype quality control for genetic association studies as described by Anderson and colleagues (2010) <doi:10.1038/nprot.2010.116>. It makes 'PLINK' basic statistics (e.g. missing genotyping rates per individual, allele frequencies per genetic marker) and relationship functions accessible from 'R' and generates a per-individual and per-marker quality control report. Individuals and markers that fail the quality control can subsequently be removed to generate a new, clean dataset. Removal of individuals based on relationship status is optimised to retain as many individuals as possible in the study.
Maintained by Hannah Meyer. Last updated 3 years ago.
3.7 match 58 stars 6.75 score 49 scripts
bioc
cypress:Cell-Type-Specific Power Assessment
CYPRESS is a cell-type-specific power tool. This package aims to perform power analysis for cell-type-specific data. It calculates FDR, FDC, and power under various study design parameters, including but not limited to sample size and effect size. It takes as input a SummarizedExperiment (SE) object with observed mixture data (feature by sample matrix) and the cell-type mixture proportions (sample by cell-type matrix). It can solve the cell-type mixture proportions from the reference-free panel from TOAST and conduct tests to identify cell-type-specific differential expression (csDE) genes.
Maintained by Shilin Yu. Last updated 5 months ago.
software, geneexpression, dataimport, rnaseq, sequencing
6.6 match 1 stars 3.70 score 2 scripts
higgi13425
medicaldata:Data Package for Medical Datasets
Provides access to well-documented medical datasets for teaching. Features several from the Teaching of Statistics in the Health Sciences website <https://www.causeweb.org/tshs/category/dataset/>, a few reconstructed datasets of historical significance in medical research, some reformatted and extended from existing R packages, and some data donations.
Maintained by Peter Higgins. Last updated 2 years ago.
3.2 match 48 stars 7.43 score 317 scripts
david-barnett
microViz:Microbiome Data Analysis and Visualization
Microbiome data visualization and statistics tools built upon phyloseq.
Maintained by David Barnett. Last updated 3 months ago.
microbiome, microbiome-analysis, microbiota
3.6 match 114 stars 6.22 score 480 scripts
gaynorr
AlphaSimR:Breeding Program Simulations
The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) <doi:10.3835/plantgenome2016.02.0013>]. Used for stochastic simulations of breeding programs to the level of DNA sequence for every individual. Contained is a wide range of functions for modeling common tasks in a breeding program, such as selection and crossing. These functions allow for constructing simulations of highly complex plant and animal breeding programs via scripting in the R software environment. Such simulations can be used to evaluate overall breeding program performance and conduct research into breeding program design, such as implementation of genomic selection. Included is the 'Markovian Coalescent Simulator' ('MaCS') for fast simulation of biallelic sequences according to a population demographic history [Chen et al. (2009) <doi:10.1101/gr.083634.108>].
Maintained by Chris Gaynor. Last updated 5 months ago.
breeding, genomics, simulation, openblas, cpp, openmp
2.0 match 47 stars 10.22 score 534 scripts 2 dependents
petergreen5678
KinMixLite:Inference About Relationships from DNA Mixtures
Methods for inference about relationships between contributors to a DNA mixture and other individuals of known genotype: a basic example would be testing whether a contributor to a mixture is the father of a child of known genotype. This provides most of the functionality of the 'KinMix' package, but with some loss of efficiency and restriction on problem size, as the latter uses 'RHugin' as the Bayes net engine, while this package uses 'gRain'. The package implements the methods introduced in Green, P. J. and Mortera, J. (2017) <doi:10.1016/j.fsigen.2017.02.001> and Green, P. J. and Mortera, J. (2021) <doi:10.1111/rssc.12498>.
Maintained by Peter Green. Last updated 5 months ago.
18.4 match 1.00 score
bioc
SeqSQC:A bioconductor package for sample quality check with next generation sequencing data
SeqSQC is designed to identify problematic samples in NGS data, including samples with gender mismatch, contamination, cryptic relatedness, and population outliers.
Maintained by Qian Liu. Last updated 5 months ago.
experiment data, homo_sapiens_data, sequencing data, project1000genomes, genome
3.3 match 5.38 score 2 scripts
rqtl
qtl2:Quantitative Trait Locus Mapping in Experimental Crosses
Provides a set of tools to perform quantitative trait locus (QTL) analysis in experimental crosses. It is a reimplementation of the 'R/qtl' package to better handle high-dimensional data and complex cross designs. Broman et al. (2019) <doi:10.1534/genetics.118.301595>.
Maintained by Karl W Broman. Last updated 10 days ago.
1.8 match 34 stars 9.48 score 1.1k scripts 5 dependents
bodkan
slendr:A Simulation Framework for Spatiotemporal Population Genetics
A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can be therefore constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.
Maintained by Martin Petr. Last updated 14 days ago.
popgen, population-genetics, simulations, spatial-statistics
1.8 match 56 stars 9.15 score 88 scripts
bioc
Pedixplorer:Pedigree Functions
Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.
Maintained by Louis Le Nezet. Last updated 2 days ago.
software, datarepresentation, genetics, graphandnetwork, visualization, kinship, pedigree
2.3 match 2 stars 6.08 score 10 scripts
cran
bdsmatrix:Routines for Block Diagonal Symmetric Matrices
This is a special case of sparse matrices, used by coxme.
Maintained by Terry Therneau. Last updated 1 year ago.
2.3 match 1 stars 5.91 score 202 dependents
cran
IBDsim:Simulation of Chromosomal Regions Shared by Family Members
Simulation of segments shared identical-by-descent (IBD) by pedigree members. Using sex specific recombination rates along the human genome (Kong et al. (2010) <doi:10.1038/nature09525>), phased chromosomes are simulated for all pedigree members, either by unconditional gene dropping or conditional on a specified IBD pattern. Additional functions provide summaries and further analysis of the simulated genomes.
Maintained by Magnus Dehli Vigeland. Last updated 6 years ago.
6.7 match 1.78 score
highlanderlab
SIMplyBee:'AlphaSimR' Extension for Simulating Honeybee Populations and Breeding Programmes
An extension of the 'AlphaSimR' package (<https://cran.r-project.org/package=AlphaSimR>) for stochastic simulations of honeybee populations and breeding programmes. 'SIMplyBee' enables simulation of individual bees that form a colony, which includes a queen, fathers (drones the queen mated with), virgin queens, workers, and drones. Multiple colonies can be merged into a population of colonies, such as an apiary or a whole country of colonies. Functions enable operations on castes, colony, or colonies, to ease 'R' scripting of whole populations. All 'AlphaSimR' functionality with respect to genomes and genetic and phenotype values is available and further extended for honeybees, including haplo-diploidy, complementary sex determiner locus, colony events (swarming, supersedure, etc.), and colony phenotype values.
Maintained by Jana Obšteter. Last updated 6 months ago.
1.8 match 2 stars 6.24 score 18 scripts
biometris
LMMsolver:Linear Mixed Model Solver
An efficient and flexible system to solve sparse mixed model equations. Important applications are the use of splines to model spatial or temporal trends as described in Boer (2023) <doi:10.1177/1471082X231178591>.
Maintained by Bart-Jan van Rossum. Last updated 2 months ago.
1.3 match 11 stars 8.14 score 66 scripts 3 dependents
vjontiveros
island:Stochastic Island Biogeography Theory Made Easy
Develops stochastic models based on the Theory of Island Biogeography (TIB) of MacArthur and Wilson (1967) <doi:10.1023/A:1016393430551> and extensions. It implements methods to estimate colonization and extinction rates (including environmental variables) given presence-absence data, simulates community assembly, and performs model selection.
Maintained by Vicente Jimenez. Last updated 1 year ago.
3.0 match 3.10 score 42 scripts
f-rousset
genepop:Population Genetic Data Analysis Using Genepop
Makes the Genepop software available in R. This software implements a mixture of traditional population genetic methods and some more focused developments: it computes exact tests for Hardy-Weinberg equilibrium, for population differentiation and for genotypic disequilibrium among pairs of loci; it computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc.; and it performs analyses of isolation by distance from pairwise comparisons of individuals or population samples.
Maintained by François Rousset. Last updated 2 years ago.
3.3 match 1 stars 2.78 score 54 scripts
bioc
GENESIS:GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness
The GENESIS package provides methodology for estimating, inferring, and accounting for population and pedigree structure in genetic analyses. The current implementation provides functions to perform PC-AiR (Conomos et al., 2015, Gen Epi) and PC-Relate (Conomos et al., 2016, AJHG). PC-AiR performs a Principal Components Analysis on genome-wide SNP data for the detection of population structure in a sample that may contain known or cryptic relatedness. Unlike standard PCA, PC-AiR accounts for relatedness in the sample to provide accurate ancestry inference that is not confounded by family structure. PC-Relate uses ancestry representative principal components to adjust for population structure/ancestry and accurately estimate measures of recent genetic relatedness such as kinship coefficients, IBD sharing probabilities, and inbreeding coefficients. Additionally, functions are provided to perform efficient variance component estimation and mixed model association testing for both quantitative and binary phenotypes.
Maintained by Stephanie M. Gogarten. Last updated 2 months ago.
snp, geneticvariability, genetics, statisticalmethod, dimensionreduction, principalcomponent, genomewideassociation, qualitycontrol, biocviews
0.5 match 36 stars 10.44 score 342 scripts 1 dependents
green-striped-gecko
dartR.spatial:Applying Landscape Genomic Methods on 'SNP' and 'Silicodart' Data
Provides landscape genomic functions to analyse 'SNP' (single nucleotide polymorphism) data, such as least cost path analysis and isolation by distance. Each sample therefore needs to have coordinate data attached (lat/lon) to be able to run most of the functions. 'dartR.spatial' is a package that belongs to the 'dartRverse' suite of packages and depends on 'dartR.base' and 'dartR.data'.
Maintained by Bernd Gruber. Last updated 1 year ago.
2.3 match 2.00 score
cran
sim1000G:Genotype Simulations for Rare or Common Variants Using Haplotypes from 1000 Genomes
Generates realistic simulated genetic data in families or unrelated individuals.
Maintained by Apostolos Dimitromanolakis. Last updated 6 years ago.
1.3 match 1 stars 2.78 score
cran
GENLIB:Genealogical Data Analysis
Genealogical data analysis including descriptive statistics (e.g., kinship and inbreeding coefficients) and gene-dropping simulations. See: "GENLIB: an R package for the analysis of genealogical data" Gauvin et al. (2015) <doi:10.1186/s12859-015-0581-5>.
Maintained by Marie-Helene Roy-Gagnon. Last updated 1 year ago.
1.8 match 1.78 score 1 dependents
ashutoshdalal97
SudokuDesigns:Sudoku as an Experimental Design
Sudoku designs (Bailey et al., 2008 <doi:10.1080/00029890.2008.11920542>) can be used as experimental designs that account for one more source of variation than conventional Latin square designs. Sudoku designs are similar to Latin square designs; the only addition is the region concept. Some important functions related to row-column designs and block designs, along with basic functions, are included in this package.
Maintained by Ashutosh Dalal. Last updated 4 months ago.
1.8 match 1.68 score 24 scripts
eppicenter
dcifer:Genetic Relatedness Between Polyclonal Infections
An implementation of Dcifer (Distance for complex infections: fast estimation of relatedness), an identity by descent (IBD) based method to calculate genetic relatedness between polyclonal infections from biallelic and multiallelic data. The package includes functions that format and preprocess the data, implement the method, and visualize the results. Gerlovina et al. (2022) <doi:10.1093/genetics/iyac126>.
Maintained by Inna Gerlovina. Last updated 10 months ago.
0.5 match 5 stars 4.57 score 15 scripts
ashutoshdalal97
HadIBDs:Incomplete Block Designs using Hadamard Matrix (HadIBDs)
Statistical designs based on Hadamard matrices are of great importance because the resulting designs have various desirable properties. The construction of Partially Balanced Incomplete Block Designs (PBIBDs) using the Kronecker product of incidence matrices of Balanced Incomplete Block (BIB) and Partially Balanced Incomplete Block (PBIB) designs is well documented in the literature. Here, we have constructed Incomplete Block Designs (IBDs) based on Hadamard matrices and the Kronecker product of Hadamard matrices.
Maintained by Ashutosh Dalal. Last updated 7 months ago.
0.5 match 1.00 score 3 scripts