Showing 35 of 35 results
bioc
DECIPHER:Tools for curating, analyzing, and manipulating biological sequences
A toolset for deciphering and managing biological sequences.
Maintained by Erik Wright. Last updated 19 days ago.
clusteringgeneticssequencingdataimportvisualizationmicroarrayqualitycontrolqpcralignmentwholegenomemicrobiomeimmunooncologygenepredictionopenmp
10.55 score 1.1k scripts 14 dependents
bioc
rGREAT:GREAT Analysis - Functional Enrichment on Genomic Regions
GREAT (Genomic Regions Enrichment of Annotations Tool) is a type of functional enrichment analysis performed directly on genomic regions. This package implements the GREAT algorithm (the local GREAT analysis) and also supports direct interaction with the GREAT web service (the online GREAT analysis). Both analyses can be viewed in a Shiny application. rGREAT by default supports more than 600 organisms and a large number of gene set collections, as well as user-provided gene sets and organisms. Additionally, it implements a general method for dealing with background regions.
Maintained by Zuguang Gu. Last updated 17 days ago.
genesetenrichmentgopathwayssoftwaresequencingwholegenomegenomeannotationcoveragecpp
86 stars 9.96 score 320 scripts 1 dependents
bioc
TitanCNA:Subclonal copy number and LOH prediction from whole genome sequencing of tumours
Hidden Markov model to segment and predict regions of subclonal copy number alterations (CNA) and loss of heterozygosity (LOH), and estimate cellular prevalence of clonal clusters in tumour whole genome sequencing data.
Maintained by Gavin Ha. Last updated 5 months ago.
sequencingwholegenomednaseqexomeseqstatisticalmethodcopynumbervariationhiddenmarkovmodelgeneticsgenomicvariationimmunooncology10x-genomicscopy-number-variationgenome-sequencinghmmtumor-heterogeneity
97 stars 8.47 score 68 scripts
bioc
AneuFinder:Analysis of Copy Number Variation in Single-Cell-Sequencing Data
AneuFinder implements functions for copy-number detection, breakpoint detection, and karyotype and heterogeneity analysis in single-cell whole genome sequencing and strand-seq data.
Maintained by Aaron Taudt. Last updated 4 days ago.
immunooncologysoftwaresequencingsinglecellcopynumbervariationgenomicvariationhiddenmarkovmodelwholegenomecpp
18 stars 7.90 score 37 scripts
bioc
ACE:Absolute Copy Number Estimation from Low-coverage Whole Genome Sequencing
Uses segmented copy number data to estimate tumor cell percentage and produce copy number plots displaying absolute copy numbers.
Maintained by Jos B Poell. Last updated 5 months ago.
copynumbervariationdnaseqcoveragewholegenomevisualizationsequencing
15 stars 7.03 score 18 scripts
bioc
syntenet:Inference And Analysis Of Synteny Networks
syntenet can be used to infer synteny networks from whole-genome protein sequences and analyze them. Anchor pairs are detected with the MCScanX algorithm, which was ported to this package with the Rcpp framework for R and C++ integration. Anchor pairs from synteny analyses are treated as an undirected unweighted graph (i.e., a synteny network), and users can perform: i. network clustering; ii. phylogenomic profiling (by identifying which species contain which clusters) and; iii. microsynteny-based phylogeny reconstruction with maximum likelihood.
Maintained by Fabrício Almeida-Silva. Last updated 3 months ago.
softwarenetworkinferencefunctionalgenomicscomparativegenomicsphylogeneticssystemsbiologygraphandnetworkwholegenomenetworkcomparative-genomicsevolutionary-genomicsnetwork-sciencephylogenomicssyntenysynteny-networkcpp
28 stars 6.70 score 12 scripts 1 dependents
bioc
doubletrouble:Identification and classification of duplicated genes
doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.
Maintained by Fabrício Almeida-Silva. Last updated 17 days ago.
softwarewholegenomecomparativegenomicsfunctionalgenomicsphylogeneticsnetworkclassificationbioinformaticscomparative-genomicsgene-duplicationmolecular-evolutionwhole-genome-duplication
23 stars 6.44 score 17 scripts
bioc
dmrseq:Detection and inference of differentially methylated regions from Whole Genome Bisulfite Sequencing
This package implements an approach for scanning the genome to detect and perform accurate inference on differentially methylated regions from Whole Genome Bisulfite Sequencing data. The method is based on comparing detected regions to a pooled null distribution, which can be implemented even when as few as two samples per population are available. Region-level statistics are obtained by fitting a generalized least squares (GLS) regression model with a nested autoregressive correlated error structure for the effect of interest on transformed methylation proportions.
Maintained by Keegan Korthauer. Last updated 5 months ago.
immunooncologydnamethylationepigeneticsmultiplecomparisonsoftwaresequencingdifferentialmethylationwholegenomeregressionfunctionalgenomics
6.39 score 59 scripts 1 dependents
bioc
RAIDS:Accurate Inference of Genetic Ancestry from Cancer Sequences
This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.
Maintained by Pascal Belleau. Last updated 5 months ago.
geneticssoftwaresequencingwholegenomeprincipalcomponentgeneticvariabilitydimensionreductionbiocviewsancestrycancer-genomicsexome-sequencinggenomicsinferencer-languagerna-seqrna-sequencingwhole-genome-sequencing
5 stars 6.23 score 19 scripts
bioc
MungeSumstats:Standardise summary statistics from GWAS
The *MungeSumstats* package is designed to facilitate the standardisation of GWAS summary statistics. It reformats inputted summary statistics to include SNP, CHR, BP and can look up these values if any are missing. It also performs dozens of QC and filtering steps to ensure high data quality and minimise inter-study differences.
Maintained by Alan Murphy. Last updated 3 months ago.
snpwholegenomegeneticscomparativegenomicsgenomewideassociationgenomicvariationpreprocessing
3 stars 6.23 score 91 scripts
bioc
cfDNAPro:cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA
cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size and fragment end motif. Analyzing and visualizing fragment size metrics, as well as other biological features, in a curated, standardized, scalable, well-documented, and reproducible way might be time intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.
Maintained by Haichao Wang. Last updated 5 months ago.
visualizationsequencingwholegenomebioinformaticscancer-genomicscancer-researchcell-free-dnaearly-detectiongenomics-visualizationliquid-biopsyswgswhole-genome-sequencing
29 stars 6.18 score 13 scripts
bioc
SCOPE:A normalization and copy number estimation method for single-cell DNA sequencing
Whole genome single-cell DNA sequencing (scDNA-seq) enables characterization of copy number profiles at the cellular level. This circumvents the averaging effects associated with bulk-tissue sequencing and has increased resolution yet decreased ambiguity in deconvolving cancer subclones and elucidating cancer evolutionary history. ScDNA-seq data is, however, sparse, noisy, and highly variable even within a homogeneous cell population, due to the biases and artifacts that are introduced during the library preparation and sequencing procedure. Here, we propose SCOPE, a normalization and copy number estimation method for scDNA-seq data. The distinguishing features of SCOPE include: (i) utilization of cell-specific Gini coefficients for quality controls and for identification of normal/diploid cells, which are further used as negative control samples in a Poisson latent factor model for normalization; (ii) modeling of GC content bias using an expectation-maximization algorithm embedded in the Poisson generalized linear models, which accounts for the different copy number states along the genome; (iii) a cross-sample iterative segmentation procedure to identify breakpoints that are shared across cells from the same genetic background.
Maintained by Rujin Wang. Last updated 5 months ago.
singlecellnormalizationcopynumbervariationsequencingwholegenomecoveragealignmentqualitycontroldataimportdnaseq
5.92 score 84 scripts
bioc
TVTB:TVTB: The VCF Tool Box
The package provides S4 classes and methods to filter, summarise and visualise genetic variation data stored in VCF files. In particular, the package extends the FilterRules class (S4Vectors package) to define new classes of filter rules applicable to the various slots of VCF objects. Functionalities are integrated and demonstrated in a Shiny web-application, the Shiny Variant Explorer (tSVE).
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
softwaregeneticsgeneticvariabilitygenomicvariationdatarepresentationguidnaseqwholegenomevisualizationmultiplecomparisondataimportvariantannotationsequencingcoveragealignmentsequencematching
2 stars 5.76 score 16 scripts
bioc
globalSeq:Global Test for Counts
The method may be conceptualised as a test of overall significance in regression analysis, where the response variable is overdispersed and the number of explanatory variables exceeds the sample size. Useful for testing for association between RNA-Seq and high-dimensional data.
Maintained by Armin Rauschenberger. Last updated 5 months ago.
geneexpressionexonarraydifferentialexpressiongenomewideassociationtranscriptomicsdimensionreductionregressionsequencingwholegenomernaseqexomeseqmirnamultiplecomparison
1 stars 5.32 score 4 scripts
bioc
cfdnakit:Fragment-length analysis package from high-throughput sequencing of cell-free DNA (cfDNA)
This package provides basic functions for analyzing shallow whole-genome sequencing (~0.3X or more) of cell-free DNA (cfDNA). The package extracts the length of cfDNA fragments and aids the visualization of fragment-length information. The package also extracts fragment-length information per non-overlapping fixed-sized bin and uses it for calculating the ctDNA estimation score (CES).
Maintained by Pitithat Puranachot. Last updated 5 months ago.
copynumbervariationsequencingwholegenome
8 stars 5.20 score 8 scripts
bioc
methylCC:Estimate the cell composition of whole blood in DNA methylation samples
A tool to estimate the cell composition of DNA methylation whole blood samples measured on any platform technology (microarray and sequencing).
Maintained by Stephanie C. Hicks. Last updated 5 months ago.
microarraysequencingdnamethylationmethylationarraymethylseqwholegenome
19 stars 5.18 score 8 scripts
bioc
DMRScan:Detection of Differentially Methylated Regions
This package detects significant differentially methylated regions (for both qualitative and quantitative traits), using a scan statistic with underlying Poisson heuristics. The scan statistic will depend on a sequence of window sizes (# of CpGs within each window) and on a threshold for each window size. This threshold can be calculated by three different means: i) analytically using the Siegmund et al. (2012) solution (preferred), ii) importance sampling as suggested by Zhang (2008), and iii) a full MCMC modeling of the data, choosing between a number of different options for modeling the dependency between each CpG.
Maintained by Christian M Page. Last updated 5 months ago.
softwaretechnologysequencingwholegenome
2 stars 5.15 score 3 scripts
bioc
podkat:Position-Dependent Kernel Association Test
This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.
Maintained by Ulrich Bodenhofer. Last updated 5 months ago.
geneticswholegenomeannotationvariantannotationsequencingdataimportcurlbzip2xz-utilszlibcpp
5.02 score 6 scripts
bioc
katdetectr:Detection, Characterization and Visualization of Kataegis in Sequencing Data
Kataegis refers to the occurrence of regional hypermutation and is a phenomenon observed in a wide range of malignancies. Using changepoint detection, katdetectr aims to identify putative kataegis foci from common data formats housing genomic variants. Katdetectr has been shown to be a robust package for the detection, characterization and visualization of kataegis.
Maintained by Daan Hazelaar. Last updated 5 months ago.
wholegenomesoftwaresnpsequencingclassificationvariantannotation
5 stars 5.00 score 4 scripts
bioc
RVS:Computes estimates of the probability of related individuals sharing a rare variant
Rare Variant Sharing (RVS) implements tests of association and linkage between rare genetic variant genotypes and a dichotomous phenotype, e.g. a disease status, in family samples. The tests are based on probabilities of rare variant sharing by relatives under the null hypothesis of absence of linkage and association between the rare variants and the phenotype and apply to single variants or multiple variants in a region (e.g. gene-based test).
Maintained by Alexandre Bureau. Last updated 5 months ago.
immunooncologygeneticsgenomewideassociationvariantdetectionexomeseqwholegenome
4.78 score 9 scripts
bioc
supersigs:Supervised mutational signatures
Generate SuperSigs (supervised mutational signatures) from single nucleotide variants in the cancer genome. Functions included in the package allow the user to learn supervised mutational signatures from their data and apply them to new data. The methodology is based on the one described in Afsari (2021, ELife).
Maintained by Albert Kuo. Last updated 5 months ago.
featureextractionclassificationregressionsequencingwholegenomesomaticmutation
3 stars 4.78 score 3 scripts
bioc
Summix:Summix2: A suite of methods to estimate, adjust, and leverage substructure in genetic summary data
This package contains the Summix2 method for estimating and adjusting for substructure in genetic summary allele frequency data. The function summix() estimates reference group proportions using a mixture model. The adjAF() function produces adjusted allele frequencies for an observed group with reference group proportions matching a target individual or sample. The summix_local() function estimates local ancestry mixture proportions and performs selection scans in genetic summary data.
Maintained by Audrey Hendricks. Last updated 5 months ago.
statisticalmethodwholegenomegenetics
4.62 score 14 scripts
bioc
GeneBreak:Gene Break Detection
Recurrent breakpoint gene detection on copy number aberration profiles.
Maintained by Evert van den Broek. Last updated 5 months ago.
acghcopynumbervariationdnaseqgeneticssequencingwholegenomevisualization
2 stars 4.60 score 6 scripts
bioc
methInheritSim:Simulating Whole-Genome Inherited Bisulphite Sequencing Data
Simulate a multigeneration methylation case versus control experiment with inheritance relation using a real control dataset.
Maintained by Pascal Belleau. Last updated 5 months ago.
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencingbisulphite-sequencinginheritancemethylationsimulation
1 stars 4.60 score 1 scripts
bioc
methylInheritance:Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect
Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements, between several generations, is associated to an effect inherited from a treatment and that stochastic effect can be dismissed.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencinganalysisbioconductorbioinformaticscpgdifferentially-methylated-elementsinheritancemonte-carlo-samplingpermutation
4.60 score 1 scripts
bioc
iCNV:Integrated Copy Number Variation detection
Integrative copy number variation (CNV) detection from multiple platforms and experimental designs.
Maintained by Zilu Zhou. Last updated 5 months ago.
immunooncologyexomeseqwholegenomesnpcopynumbervariationhiddenmarkovmodel
4.30 score 5 scripts
bioc
Uniquorn:Identification of cancer cell lines based on their weighted mutational/ variational fingerprint
'Uniquorn' enables users to identify cancer cell lines. Cancer cell line misidentification and cross-contamination represent a significant challenge for cancer researchers. Identification is vital and, in this package, is based on the locations/loci of somatic and germline mutations/variations. The input format is vcf/vcf.gz and the files have to contain a single cancer cell line sample (i.e. a single member/genotype/gt column in the vcf file).
Maintained by Raik Otto. Last updated 5 months ago.
immunooncologystatisticalmethodwholegenomeexomeseq
4.30 score
bioc
CNAnorm:A normalization method for Copy Number Aberration in cancer samples
Performs ratio, GC content correction and normalization of data obtained using low coverage (one read every 100-10,000 bp) high-throughput sequencing. It performs a "discrete" normalization looking for the ploidy of the genome. It will also provide tumour content if at least two ploidy states can be found.
Maintained by Stefano Berri. Last updated 5 months ago.
copynumbervariationsequencingcoveragenormalizationwholegenomednaseqgenomicvariationfortran
4.30 score 6 scripts
bioc
GenomAutomorphism:Compute the automorphisms between DNA's Abelian group representations
This is an R package to compute the automorphisms between pairwise aligned DNA sequences represented as elements from a Genomic Abelian group. In a general scenario, anything from genomic regions up to whole genomes from a given population (from any species or closely related species) can be algebraically represented as a direct sum of cyclic groups or, more specifically, Abelian p-groups. Basically, we propose the representation of multiple sequence alignments of length N bp as elements of a finite Abelian group created by the direct sum of homocyclic Abelian groups of prime-power order.
Maintained by Robersy Sanchez. Last updated 3 months ago.
mathematicalbiologycomparativegenomicsfunctionalgenomicsmultiplesequencealignmentwholegenomegenetic-codegenetic-code-algebragenomegenome-algebra
4.30 score 9 scripts
bioc
compSPOT:compSPOT: Tool for identifying and comparing significantly mutated genomic hotspots
Clonal cell groups share common mutations within cancer, precancer, and even clinically normal appearing tissues. The frequency and location of these mutations may predict prognosis and cancer risk. It has also been well established that certain genomic regions have increased sensitivity to acquiring mutations. Mutation-sensitive genomic regions may therefore serve as markers for predicting cancer risk. This package contains multiple functions to establish significantly mutated hotspots, compare hotspot mutation burden between samples, and perform exploratory data analysis of the correlation between hotspot mutation burden and personal risk factors for cancer, such as age, gender, and history of carcinogen exposure. This package allows users to identify robust genomic markers to help establish cancer risk.
Maintained by Sydney Grant. Last updated 5 months ago.
softwaretechnologysequencingdnaseqwholegenomeclassificationsinglecellsurvivalmultiplecomparison
4.00 score 3 scripts
bioc
seq.hotSPOT:Targeted sequencing panel design based on mutation hotspots
seq.hotSPOT provides a resource for designing effective sequencing panels to help improve mutation capture efficacy for ultradeep sequencing projects. Using SNV datasets, this package designs custom panels for any tissue of interest and identifies the genomic regions likely to contain the most mutations. Establishing efficient targeted sequencing panels can allow researchers to study mutation burden in tissues at high depth without the economic burden of whole-exome or whole-genome sequencing. This tool was developed to make high-depth sequencing panels to study low-frequency clonal mutations in clinically normal and cancerous tissues.
Maintained by Sydney Grant. Last updated 5 months ago.
softwaretechnologysequencingdnaseqwholegenome
4.00 score 3 scripts
bioc
mbQTL:mbQTL: A package for SNP-Taxa mGWAS analysis
mbQTL is a statistical R package for simultaneous 16S rRNA, 16S rDNA (microbial) and variant, SNP, SNV (host) relationship, correlation, and regression studies. We apply linear, logistic and correlation-based statistics to identify the relationships of taxa, genus, species and variant, SNP, SNV in the infected host. We produce various statistical significance measures such as P values, FDR, BC and probability estimation to show the significance of these relationships. Further, we provide various visualization functions for ease and clarification of the results of these analyses. The package is compatible with dataframe, MRexperiment and text formats.
Maintained by Mercedeh Movassagh. Last updated 5 months ago.
snpmicrobiomewholegenomemetagenomicsstatisticalmethodregression
1 stars 4.00 score 3 scripts
bioc
RareVariantVis:A suite for analysis of rare genomic variants in whole genome sequencing data
The second version of the RareVariantVis package aims to provide comprehensive information about rare variants in your genome data. It annotates, filters and presents genomic variants (especially rare ones) in a global, per-chromosome way. For discovered rare variants, CRISPR guide RNAs are designed so the user can plan further functional studies. Large structural variants, including copy number variants, are also supported. The package accepts variants directly from a variant caller, for example GATK or Speedseq. The output of the package is lists of variants together with adequate visualization. Visualization of variants is performed in two ways: standard, which outputs png figures, and interactive, which uses the JavaScript d3 package. Interactive visualization allows analysis of trio/family data, for example in search of causative variants in rare Mendelian diseases, in a point-and-click interface. The package includes a homozygous region caller and allows analysis of whole human genomes in less than 30 minutes on a desktop computer. RareVariantVis disclosed novel causes of several rare monogenic disorders, including one with a non-coding causative variant: keratolytic winter erythema.
Maintained by Tomasz Stokowy. Last updated 5 months ago.
genomicvariationsequencingwholegenome
3.90 score 1 scripts
bioc
BadRegionFinder:BadRegionFinder: an R/Bioconductor package for identifying regions with bad coverage
BadRegionFinder is a package for identifying regions with bad, acceptable and good coverage in sequence alignment data available as bam files. The whole genome may be considered, as well as a set of target regions. Various visual and textual types of output are available.
Maintained by Sarah Sandmann. Last updated 2 months ago.
coveragesequencingalignmentwholegenomeclassification
3.60 score 1 scripts
bioc
sscu:Strength of Selected Codon Usage
The package calculates indexes for selective strength in codon usage in bacterial species. (1) The package can calculate the strength of selected codon usage bias (sscu, also named s_index) based on Paul Sharp's method. The method takes into account the background mutation rate and focuses only on four pairs of codons with universal translational advantages in all bacterial species. Thus the sscu index is comparable among different species. (2) The package can detect the strength of translational accuracy selection by Akashi's test. The test tabulates all codons into four categories by conserved/variable amino acids and optimal/non-optimal codons. (3) Optimal codon lists (selected codons) can be calculated by either the op_highly function (which compares the highly expressed genes with all genes to identify optimal codons) or the op_corre_CodonW/op_corre_NCprime functions (which use the correlative method developed by Hershberg & Petrov). Users will have a list of optimal codons for further analysis, such as input to Akashi's test. (4) Detailed codon usage information, such as the RSCU value, the number of optimal codons in the highly/all gene set, and the genomic gc3 value, can be calculated by the optimal_codon_statistics and genomic_gc3 functions. (5) Furthermore, we added one test function, low_frequency_op, to the package. The function tries to find the low-frequency optimal codons among all the optimal codons identified by the op_highly function.
Maintained by Yu Sun. Last updated 5 months ago.
geneticsgeneexpressionwholegenome
2.30 score 1 scripts