R-universe search: needs:BSgenome

bioc

Gviz:Plotting data and annotation information along genomic coordinates

Genomic data analysis requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates the results into, e.g., gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.

Maintained by Robert Ivanek. Last updated 5 months ago.

visualization microarray sequencing

79 stars 13.05 score 1.4k scripts 46 dependents

bioc

TFBSTools:Software Package for Transcription Factor Binding Site (TFBS) Analysis

TFBSTools is a package for the analysis and manipulation of transcription factor binding sites. It includes matrix conversion between Position Frequency Matrix (PFM), Position Weight Matrix (PWM) and Information Content Matrix (ICM). It can also scan sequences/alignments for putative TFBS, query the JASPAR database, and it provides a wrapper for de novo motif discovery software.

Maintained by Ge Tan. Last updated 21 days ago.

motifannotation generegulation motifdiscovery transcription alignment

28 stars 12.36 score 1.1k scripts 18 dependents

bioc

bsseq:Analyze, manage and store whole-genome methylation data

A collection of tools for analyzing and visualizing whole-genome methylation data from sequencing. This includes whole-genome bisulfite sequencing and Oxford nanopore data.

Maintained by Kasper Daniel Hansen. Last updated 4 months ago.

dnamethylation cpp

37 stars 12.26 score 676 scripts 15 dependents

bioc

ggbio:Visualization tools for genomic data

The ggbio package extends and specializes the grammar of graphics for biological data. The graphics are designed to answer common scientific questions, in particular those often asked of high throughput genomics data. All core Bioconductor data structures are supported, where appropriate. The package supports detailed views of particular genomic regions, as well as genome-wide overviews. Supported overviews include ideograms and grand linear views. High-level plots include sequence fragment length, edge-linked interval to data view, mismatch pileup, and several splicing summaries.

Maintained by Michael Lawrence. Last updated 5 months ago.

infrastructure visualization

111 stars 12.23 score 734 scripts 16 dependents

bioc

VariantAnnotation:Annotation of Genetic Variants

Annotate variants, compute amino acid coding changes, predict coding outcomes.

Maintained by Bioconductor Package Maintainer. Last updated 3 months ago.

dataimport sequencing snp annotation genetics variantannotation curl bzip2 xz-utils zlib

11.39 score 1.9k scripts 152 dependents

bioc

karyoploteR:Plot customizable linear genomes displaying arbitrary data

karyoploteR creates karyotype plots of arbitrary genomes and offers a complete set of functions to plot arbitrary data on them. It mimics many R base graphics functions, coupling them with a coordinate change function that automatically maps the chromosome and data coordinates into the plot coordinates. In addition to the provided data plotting functions, it is easy to add new ones.

Maintained by Bernat Gel. Last updated 5 months ago.

visualization copynumbervariation sequencing coverage dnaseq chipseq methylseq dataimport onechannel bioconductor bioinformatics data-visualization genome genomics-visualization plotting-in-r

307 stars 11.25 score 656 scripts 4 dependents

bioc

genomation:Summary, annotation and visualization of genomic data

A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.

Maintained by Altuna Akalin. Last updated 5 months ago.

annotation sequencing visualization cpgisland cpp

76 stars 11.13 score 738 scripts 5 dependents

bioc

ORFik:Open Reading Frames in Genomics

R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed; as the name hints, it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package supports reassigning transcript starts using CAGE-Seq data, automatic shifting of Ribo-Seq reads, finding Open Reading Frames across whole genomes, and much more.

Maintained by Haakon Tjeldnes. Last updated 1 month ago.

immunooncology software sequencing riboseq rnaseq functionalgenomics coverage alignment dataimport cpp

33 stars 10.56 score 115 scripts 2 dependents

bioc

derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach

This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bound ChIP-seq peaks.

Maintained by Leonardo Collado-Torres. Last updated 4 months ago.

differentialexpression sequencing rnaseq chipseq differentialpeakcalling software immunooncology coverage annotation-agnostic bioconductor derfinder

42 stars 10.03 score 78 scripts 6 dependents

bioc

PureCN:Copy number calling and SNV classification using targeted short read sequencing

This package estimates tumor purity, copy number, and loss of heterozygosity (LOH), and classifies single nucleotide variants (SNVs) by somatic status and clonality. PureCN is designed for targeted short read sequencing data, integrates well with standard somatic variant detection and copy number pipelines, and has support for tumor samples without matching normal samples.

Maintained by Markus Riester. Last updated 3 days ago.

copynumbervariation software sequencing variantannotation variantdetection coverage immunooncology bioconductor-package cell-free-dna copy-number loh tumor-heterogeneity tumor-mutational-burden tumor-purity

132 stars 9.88 score 40 scripts

bioc

GenVisR:Genomic Visualizations in R

Produce highly customizable publication quality graphics for genomic data primarily at the cohort level.

Maintained by Zachary Skidmore. Last updated 5 months ago.

infrastructure datarepresentation classification dnaseq

217 stars 9.87 score 76 scripts

bioc

annotatr:Annotation of Genomic Regions to Genomic Annotations

Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers. The annotatr package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations.

Maintained by Raymond G. Cavalcante. Last updated 5 months ago.

software annotation genomeannotation functionalgenomics visualization genome-annotation

26 stars 9.76 score 246 scripts 5 dependents

bioc

recount:Explore and download data from the recount project

Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.

Maintained by Leonardo Collado-Torres. Last updated 4 months ago.

coverage differentialexpression geneexpression rnaseq sequencing software dataimport immunooncology annotation-agnostic bioconductor count derfinder deseq2 exon gene human illumina junction recount

41 stars 9.57 score 498 scripts 3 dependents

bioc

GenomicInteractions:Utilities for handling genomic interaction data

Utilities for handling genomic interaction data such as ChIA-PET or Hi-C, annotating genomic features with interaction information, and producing plots and summary statistics.

Maintained by Liz Ing-Simmons. Last updated 5 months ago.

software infrastructure dataimport datarepresentation hic

7 stars 9.31 score 162 scripts 5 dependents

bioc

IsoformSwitchAnalyzeR:Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data

Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains) from quantification of all types of RNA-seq data by tools such as Kallisto, Salmon, StringTie, and Cufflinks/Cuffdiff.

Maintained by Kristoffer Vitting-Seerup. Last updated 5 months ago.

geneexpression transcription alternativesplicing differentialexpression differentialsplicing visualization statisticalmethod transcriptomevariant biomedicalinformatics functionalgenomics systemsbiology transcriptomics rnaseq annotation functionalprediction geneprediction dataimport multiplecomparison batcheffect immunooncology

108 stars 9.26 score 125 scripts

bioc

bambu:Context-Aware Transcript Quantification from Long Read RNA-Seq data

bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can be used directly for visualisation and downstream analysis such as differential gene expression or transcript usage.

Maintained by Ying Chen. Last updated 2 months ago.

alignment coverage differentialexpression featureextraction geneexpression genomeannotation genomeassembly immunooncology longread multiplecomparison normalization rnaseq regression sequencing software transcription transcriptomics bambu bioconductor long-reads nanopore nanopore-sequencing rna-seq rna-seq-analysis transcript-quantification transcript-reconstruction cpp

203 stars 9.04 score 91 scripts 1 dependent

bioc

regioneR:Association analysis of genomic regions based on permutation tests

regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other genomic features.

Maintained by Bernat Gel. Last updated 5 months ago.

genetics chipseq dnaseq methylseq copynumbervariation

9.01 score 2.7k scripts 21 dependents

bioc

motifbreakR:A Package For Predicting The Disruptiveness Of Single Nucleotide Polymorphisms On Transcription Factor Binding Sites

We introduce motifbreakR, which allows the biologist to judge, first, whether the sequence surrounding the polymorphism is a good match and, second, how much information is gained or lost in one allele of the polymorphism relative to another. MotifbreakR is both more flexible and more extensible than previous offerings, giving users a choice of three algorithms for interrogating genomes with motifs from public sources: 1) a weighted-sum probability matrix, 2) log-probabilities, and 3) weighting by relative entropy. MotifbreakR can predict effects for novel or previously described variants in public databases, making it suitable for tasks beyond the scope of its original design. Lastly, it can be used to interrogate any genome curated within Bioconductor (currently there are 32 species, a total of 109 versions).

Maintained by Simon Gert Coetzee. Last updated 5 months ago.

chipseq visualization motifannotation transcription

28 stars 8.89 score 103 scripts

bioc

ChIPpeakAnno:Batch annotation of the peaks identified from either ChIP-seq, ChIP-chip experiments, or any experiments that result in large number of genomic interval data

The package encompasses a range of functions for identifying the closest gene, exon, miRNA, or custom features—such as highly conserved elements and user-supplied transcription factor binding sites. Additionally, users can retrieve sequences around the peaks and obtain enriched Gene Ontology (GO) or Pathway terms. In version 2.0.5 and beyond, new functionalities have been introduced. These include features for identifying peaks associated with bi-directional promoters along with summary statistics (peaksNearBDP), summarizing motif occurrences in peaks (summarizePatternInPeaks), and associating additional identifiers with annotated peaks or enrichedGO (addGeneIDs). The package integrates with various other packages such as biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest, and stat to enhance its analytical capabilities.

Maintained by Jianhong Ou. Last updated 3 months ago.

annotation chipseq chipchip

8.75 score 584 scripts 6 dependents

bioc

trackViewer:An R/Bioconductor package with web interface for drawing elegant interactive tracks or lollipop plots to facilitate integrated analysis of multi-omics data

Visualize mapped reads along with annotation as track layers for NGS datasets such as ChIP-seq, RNA-seq, miRNA-seq, DNA-seq, SNPs and methylation data.

Maintained by Jianhong Ou. Last updated 6 days ago.

visualization

8.68 score 145 scripts 2 dependents

bioc

QuasR:Quantify and Annotate Short Reads in R

This package provides a framework for the quantification and analysis of Short Reads. It covers a complete workflow starting from raw sequence reads, over creation of alignments and quality control plots, to the quantification of genomic regions of interest. Read alignments are either generated through Rbowtie (data from DNA/ChIP/ATAC/Bis-seq experiments) or Rhisat2 (data from RNA-seq experiments that require spliced alignments), or can be provided in the form of bam files.

Maintained by Michael Stadler. Last updated 1 month ago.

genetics preprocessing sequencing chipseq rnaseq methylseq coverage alignment qualitycontrol immunooncology curl bzip2 xz-utils zlib cpp

6 stars 8.63 score 79 scripts 1 dependent

bioc

FRASER:Find RAre Splicing Events in RNA-Seq Data

Detection of rare aberrant splicing events in transcriptome profiles. Read count ratio expectations are modeled by an autoencoder to control for confounding factors in the data. Given these expectations, the ratios are assumed to follow a beta-binomial distribution with a junction specific dispersion. Outlier events are then identified as read-count ratios that deviate significantly from this distribution. FRASER is able to detect alternative splicing, but also intron retention. The package aims to support diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.

Maintained by Christian Mertes. Last updated 5 months ago.

rnaseq alternativesplicing sequencing software genetics coverage aberrant-splicing diagnostics outlier-detection rare-disease rna-seq splicing openblas cpp

44 stars 8.53 score 155 scripts

bioc

TitanCNA:Subclonal copy number and LOH prediction from whole genome sequencing of tumours

Hidden Markov model to segment and predict regions of subclonal copy number alterations (CNA) and loss of heterozygosity (LOH), and estimate cellular prevalence of clonal clusters in tumour whole genome sequencing data.

Maintained by Gavin Ha. Last updated 5 months ago.

sequencing wholegenome dnaseq exomeseq statisticalmethod copynumbervariation hiddenmarkovmodel genetics genomicvariation immunooncology 10x-genomics copy-number-variation genome-sequencing hmm tumor-heterogeneity

97 stars 8.47 score 68 scripts

bioc

igvR:igvR: integrative genomics viewer

Access to igv.js, the Integrative Genomics Viewer running in a web browser.

Maintained by Arkadiusz Gladki. Last updated 5 months ago.

visualization thirdpartyclient genomebrowsers

45 stars 8.33 score 118 scripts

bioc

crisprDesign:Comprehensive design of CRISPR gRNAs for nucleases and base editors

Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.

Maintained by Jean-Philippe Fortin. Last updated 28 days ago.

crispr functionalgenomics genetarget bioconductor bioconductor-package crispr-cas9 crispr-design crispr-target genomics-analysis grna grna-sequence grna-sequences sgrna sgrna-design

22 stars 8.28 score 80 scripts 3 dependents

bioc

motifmatchr:Fast Motif Matching in R

Quickly find motif matches for many motifs and many sequences. Wraps C++ code from the MOODS motif calling library, which was developed by Pasi Rastas, Janne Korhonen, and Petri Martinmäki.

Maintained by Alicia Schep. Last updated 5 months ago.

motifannotation cpp

8.11 score 722 scripts 5 dependents

bioc

monaLisa:Binned Motif Enrichment Analysis and Visualization

Useful functions to work with sequence motifs in the analysis of genomics data. These include methods to annotate genomic regions or sequences with predicted motif hits and to identify motifs that drive observed changes in accessibility or expression. Functions to produce informative visualizations of the obtained results are also provided.

Maintained by Michael Stadler. Last updated 9 days ago.

motifannotation visualization featureextraction epigenetics

40 stars 8.10 score 53 scripts

bioc

FLAMES:FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data

Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. FLAMES provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.

Maintained by Changqing Wang. Last updated 2 days ago.

rnaseq singlecell transcriptomics dataimport differentialsplicing alternativesplicing geneexpression longread zlib curl bzip2 xz-utils cpp

33 stars 8.04 score 12 scripts

bioc

biovizBase:Basic graphic utilities for visualization of genomic data.

The biovizBase package is designed to provide a set of utilities, color schemes and conventions for genomic data. It serves as the base for various high-level packages for biological data visualization. This saves development effort and encourages consistency.

Maintained by Michael Lawrence. Last updated 5 months ago.

infrastructure visualization preprocessing

8.03 score 273 scripts 74 dependents

bioc

motifStack:Plot stacked logos for single or multiple DNA, RNA and amino acid sequences

The motifStack package is designed for graphic representation of multiple motifs with different similarity scores. It works with both DNA/RNA sequence motifs and amino acid sequence motifs. In addition, it provides the flexibility for users to customize the graphic parameters such as the font type and symbol colors.

Maintained by Jianhong Ou. Last updated 3 months ago.

sequencematching visualization sequencing microarray alignment chipchip chipseq motifannotation dataimport

7.93 score 188 scripts 6 dependents

bioc

signeR:Empirical Bayesian approach to mutational signature discovery

The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.

Maintained by Renan Valieris. Last updated 5 months ago.

genomicvariation somaticmutation statisticalmethod visualization bioconductor bioinformatics openblas cpp

13 stars 7.67 score 22 scripts

bioc

MIRA:Methylation-Based Inference of Regulatory Activity

DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for a given region set with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for the region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of region sets to be leveraged for novel insight into the regulatory state of DNA methylation datasets.

Maintained by John Lawson. Last updated 5 months ago.

immunooncology dnamethylation generegulation genomeannotation systemsbiology functionalgenomics chipseq methylseq sequencing epigenetics coverage

12 stars 7.56 score 7 scripts 1 dependent

bioc

methrix:Fast and efficient summarization of generic bedGraph files from Bisulfite sequencing

Bedgraph files generated by bisulfite pipelines often come in various flavors. A critical downstream step requires summarization of these files into methylation/coverage matrices. Methrix performs this data aggregation step and also provides many other useful downstream functions.

Maintained by Anand Mayakonda. Last updated 5 months ago.

dnamethylation sequencing coverage bedgraph bioinformatics dna-methylation

32 stars 7.53 score 39 scripts 1 dependent

bioc

EpiCompare:Comparison, Benchmarking & QC of Epigenomic Datasets

EpiCompare is used to compare and analyse epigenetic datasets for quality control and benchmarking purposes. The package outputs an HTML report consisting of three sections: (1) General metrics: metrics on peaks (percentage of blacklisted and non-standard peaks, and peak widths) and fragments (duplication rate) of samples; (2) Peak overlap: percentage and statistical significance of overlapping and non-overlapping peaks, including an upset plot; and (3) Functional annotation: functional annotation (ChromHMM, ChIPseeker and enrichment analysis) of peaks, including peak enrichment around TSS.

Maintained by Hiranyamaya Dash. Last updated 2 months ago.

epigenetics genetics qualitycontrol chipseq multiplecomparison functionalgenomics atacseq dnaseseq benchmark benchmarking bioconductor bioconductor-package comparison html interactive-reporting

15 stars 7.49 score 46 scripts

bioc

CAGEfightR:Analysis of Cap Analysis of Gene Expression (CAGE) data using Bioconductor

CAGE is a widely used high throughput assay for measuring transcription start site (TSS) activity. CAGEfightR is an R/Bioconductor package for performing a wide range of common data analysis tasks for CAGE and 5'-end data in general. Core functionality includes: import of CAGE TSSs (CTSSs), tag (or unidirectional) clustering for TSS identification, bidirectional clustering for enhancer identification, annotation with transcript and gene models, correlation of TSS and enhancer expression, calculation of TSS shapes, quantification of CAGE expression as expression matrices and genome browser visualization.

Maintained by Malte Thodberg. Last updated 5 months ago.

software transcription coverage geneexpression generegulation peakdetection dataimport datarepresentation transcriptomics sequencing annotation genomebrowsers normalization preprocessing visualization

8 stars 7.46 score 67 scripts 1 dependent

bioc

ELMER:Inferring Regulatory Element Landscapes and Transcription Factor Networks Using Cancer Methylomes

ELMER is designed to use DNA methylation and gene expression from a large number of samples to infer regulatory element landscapes and transcription factor networks in primary tissue.

Maintained by Tiago Chedraoui Silva. Last updated 5 months ago.

dnamethylation geneexpression motifannotation software generegulation transcription network

7.42 score 176 scripts

bioc

methylSig:MethylSig: Differential Methylation Testing for WGBS and RRBS Data

MethylSig is a package for testing for differentially methylated cytosines (DMCs) or regions (DMRs) in whole-genome bisulfite sequencing (WGBS) or reduced representation bisulfite sequencing (RRBS) experiments. MethylSig uses a beta binomial model to test for significant differences between groups of samples. Several options exist for either site-specific or sliding window tests, and variance estimation.

Maintained by Raymond G. Cavalcante. Last updated 5 months ago.

dnamethylation differentialmethylation epigenetics regression methylseq differential-methylation dna-methylation

18 stars 7.40 score 23 scripts

bioc

chromVAR:Chromatin Variation Across Regions

Determine variation in chromatin accessibility across sets of annotations or peaks. Designed primarily for single-cell or sparse chromatin accessibility data, e.g. from scATAC-seq or sparse bulk ATAC or DNAse-seq experiments.

Maintained by Alicia Schep. Last updated 5 months ago.

singlecell sequencing generegulation immunooncology cpp

7.31 score 772 scripts

bioc

regionReport:Generate HTML or PDF reports for a set of genomic regions or DESeq2/edgeR results

Generate HTML or PDF reports to explore a set of regions such as the results from annotation-agnostic expression analysis of RNA-seq data at base-pair resolution performed by derfinder. You can also create reports for DESeq2 or edgeR results.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq software visualization transcription coverage reportwriting differentialmethylation differentialpeakcalling immunooncology qualitycontrol bioconductor derfinder deseq2 edger regionreport rmarkdown

9 stars 7.22 score 46 scripts

bioc

CRISPRseek:Design of guide RNAs in CRISPR genome-editing systems

The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.

Maintained by Lihua Julie Zhu. Last updated 23 days ago.

immunooncology generegulation sequencematching crispr

7.18 score 51 scripts 2 dependents

bioc

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

7.18 score 251 scripts 1 dependent

bioc

DiffBind:Differential Binding Analysis of ChIP-Seq Peak Data

Compute differentially bound sites from multiple ChIP-seq experiments using affinity (quantitative) data. Also enables occupancy (overlap) analysis and plotting functions.

Maintained by Rory Stark. Last updated 2 months ago.

sequencing chipseq atacseq dnaseseq methylseq ripseq differentialpeakcalling differentialmethylation generegulation histonemodification peakdetection biomedicalinformatics cellbiology multiplecomparison normalization reportwriting epigenetics functionalgenomics curl bzip2 xz-utils zlib cpp

7.13 score 512 scripts 2 dependents

bioc

ATACseqQC:ATAC-seq Quality Control

ATAC-seq, an assay for Transposase-Accessible Chromatin using sequencing, is a rapid and sensitive method for chromatin accessibility analysis. It was developed as an alternative method to MNase-seq, FAIRE-seq and DNAse-seq. Compared to the other methods, ATAC-seq requires less biological sample and less processing time. In the process of analyzing several ATAC-seq datasets produced in our labs, we learned some of the unique aspects of the quality assessment for ATAC-seq data. To help users quickly assess whether their ATAC-seq experiment is successful, we developed the ATACseqQC package, partially following the guideline published in Nature Methods 2013 (Greenleaf et al.), including diagnostic plots of fragment size distribution, proportion of mitochondria reads, nucleosome positioning pattern, and CTCF or other Transcription Factor footprints.

Maintained by Jianhong Ou. Last updated 3 months ago.

sequencing dnaseq atacseq generegulation qualitycontrol coverage nucleosomepositioning immunooncology

7.12 score 146 scripts 1 dependent

bioc

RiboCrypt:Interactive visualization in genomics

R package for interactive visualization and browsing of NGS data. It contains a browser for both transcript and genomic coordinate views. In addition, QC and general metaplots are included, among them differential translation plots and gene expression plots. The package is still under development.

Maintained by Michal Swirski. Last updated 5 days ago.

software sequencing riboseq rnaseq

5 stars 7.08 score 22 scripts

bioc

cardelino:Clone Identification from Single Cell Data

Methods to infer clonal tree configuration for a population of cells using single-cell RNA-seq data (scRNA-seq), and possibly other data modalities. Methods are also provided to assign cells to inferred clones and explore differences in gene expression between clones. These methods can flexibly integrate information from imperfect clonal trees inferred based on bulk exome-seq data, and sparse variant alleles expressed in scRNA-seq data. A flexible beta-binomial error model that accounts for stochastic dropout events as well as systematic allelic imbalance is used.

Maintained by Davis McCarthy. Last updated 5 months ago.

singlecell rnaseq visualization transcriptomics geneexpression sequencing software exomeseq clonal-clustering gibbs-sampling scrna-seq single-cell somatic-mutations

61 stars 7.05 score 62 scripts

bioc

DSS:Dispersion shrinkage for sequencing data

DSS is an R library performing differential analysis for count-based sequencing data. It detects differentially expressed genes (DEGs) from RNA-seq, and differentially methylated loci or regions (DML/DMRs) from bisulfite sequencing (BS-seq). The core of DSS is a new dispersion shrinkage method for estimating the dispersion parameter from Gamma-Poisson or Beta-Binomial distributions.

Maintained by Hao Wu. Last updated 5 months ago.

sequencing rnaseq dnamethylation geneexpression differentialexpression differentialmethylation

7.02 score 248 scripts 5 dependents

bioc

COCOA:Coordinate Covariation Analysis

COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.

Maintained by John Lawson. Last updated 5 months ago.

epigenetics dnamethylation atacseq dnaseseq methylseq methylationarray principalcomponent genomicvariation generegulation genomeannotation systemsbiology functionalgenomics chipseq sequencing immunooncology dna-methylation pca

10 stars 7.02 score 21 scripts

bioc

musicatk:Mutational Signature Comprehensive Analysis Toolkit

Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features, along with discovery from unique and proprietary genomic features associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis, including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.

Maintained by Joshua D. Campbell. Last updated 5 months ago.

software biologicalquestion somaticmutation variantannotation

13 stars 6.97 score 20 scripts

bioc

NanoMethViz:Visualise methylation data from Oxford Nanopore sequencing

NanoMethViz is a toolkit for visualising methylation data from Oxford Nanopore sequencing. It can be used to explore methylation patterns from reads derived from Oxford Nanopore direct DNA sequencing with methylation called by callers including nanopolish, f5c and megalodon. The plots in this package allow the visualisation of methylation profiles aggregated over experimental groups and across classes of genomic features.

Maintained by Shian Su. Last updated 23 days ago.

software longread visualization differentialmethylation dnamethylation epigenetics dataimport zlib cpp

26 stars 6.95 score 11 scripts

bioc

psichomics:Graphical Interface for Alternative Splicing Quantification, Analysis and Visualisation

Interactive R package with an intuitive Shiny-based graphical interface for alternative splicing quantification and integrative analyses of alternative splicing and gene expression based on The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression project (GTEx), Sequence Read Archive (SRA) and user-provided data. The tool interactively performs survival, dimensionality reduction and median- and variance-based differential splicing and gene expression analyses that benefit from the incorporation of clinical and molecular sample-associated features (such as tumour stage or survival). Interactive visual access to genomic mapping and functional annotation of selected alternative splicing events is also included.

Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.

sequencing rnaseq alternativesplicing differentialsplicing transcription gui principalcomponent survival biomedicalinformatics transcriptomics immunooncology visualization multiplecomparison geneexpression differentialexpression alternative-splicing bioconductor data-analyses differential-gene-expression differential-splicing-analysis gene-expression gtex recount2 rna-seq-data splicing-quantification sra tcga vast-tools cpp

36 stars 6.95 score 31 scripts

bioc

GenomicFiles:Distributed computing by file or by range

This package provides infrastructure for parallel computations distributed 'by file' or 'by range'. User-defined MAPPER and REDUCER functions provide added flexibility for data combination and manipulation.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

genetics infrastructure dataimport sequencing coverage

6.86 score 89 scripts 16 dependents

bioc

SomaticSignatures:Somatic Signatures

The SomaticSignatures package identifies mutational signatures of single nucleotide variants (SNVs). It provides an infrastructure related to the methodology described in Nik-Zainal (2012, Cell), with flexibility in the matrix decomposition algorithms.

Maintained by Julian Gehring. Last updated 5 months ago.

sequencing somaticmutation visualization clustering genomicvariation statisticalmethod

22 stars 6.85 score 54 scripts 1 dependents

bioc

maser:Mapping Alternative Splicing Events to pRoteins

This package provides functionalities for downstream analysis, annotation and visualization of alternative splicing events generated by rMATS.

Maintained by Diogo F.T. Veiga. Last updated 5 months ago.

alternativesplicing transcriptomics visualization

17 stars 6.74 score 18 scripts

bioc

chimeraviz:Visualization tools for gene fusions

chimeraviz manages data from fusion gene finders and provides useful visualization tools.

Maintained by Stian Lågstad. Last updated 5 months ago.

infrastructure alignment

37 stars 6.71 score 14 scripts

bioc

epiregulon:Gene regulatory network inference from single cell epigenomic data

Gene regulatory networks model the underlying gene regulation hierarchies that drive gene expression and observed phenotypes. Epiregulon infers TF activity in single cells by constructing a gene regulatory network (regulons). This is achieved through integration of scATAC-seq and scRNA-seq data and incorporation of public bulk TF ChIP-seq data. Links between regulatory elements and their target genes are established by computing correlations between chromatin accessibility and gene expressions.

Maintained by Xiaosai Yao. Last updated 23 days ago.

singlecell generegulation networkinference network geneexpression transcription genetarget cpp

14 stars 6.67 score 17 scripts

bioc

deepSNV:Detection of subclonal SNVs in deep sequencing data.

This package provides quantitative variant callers for detecting subclonal mutations in ultra-deep (>=100x coverage) sequencing experiments. The deepSNV algorithm is used for a comparative setup with a control experiment of the same loci and uses a beta-binomial model and a likelihood ratio test to discriminate sequencing errors and subclonal SNVs. The shearwater algorithm computes a Bayes classifier based on a beta-binomial model for variant calling with multiple samples for precisely estimating model parameters - such as local error rates and dispersion - and prior knowledge, e.g. from variation databases such as COSMIC.

Maintained by Moritz Gerstung. Last updated 5 months ago.

geneticvariability snp sequencing genetics dataimport curl bzip2 xz-utils zlib cpp

6.53 score 38 scripts 1 dependents

bioc

ChAMP:Chip Analysis Methylation Pipeline for Illumina HumanMethylation450 and EPIC

The package includes quality control metrics, a selection of normalization methods and novel methods to identify differentially methylated regions and to highlight copy number alterations.

Maintained by Yuan Tian. Last updated 5 months ago.

microarray methylationarray normalization twochannel copynumber dnamethylation

6.50 score 278 scripts

bioc

SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data

SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic analysis of SMF data with minimal coding effort. Starting from an aligned BAM file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kinds of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, and plot SMF information at a given genomic location.

Maintained by Guido Barzaghi. Last updated 8 days ago.

dnamethylation coverage nucleosomepositioning datarepresentation epigenetics methylseq qualitycontrol sequencing

2 stars 6.46 score 27 scripts

bioc

gwasurvivr:gwasurvivr: an R package for genome wide survival analysis

gwasurvivr is a package to perform survival analysis using Cox proportional hazard models on imputed genetic data.

Maintained by Abbas Rizvi. Last updated 5 months ago.

genomewideassociation survival regression genetics snp geneticvariability pharmacogenomics biomedicalinformatics

12 stars 6.43 score 75 scripts

bioc

SparseSignatures:SparseSignatures

Point mutations occurring in a genome can be divided into 96 categories based on the base being mutated, the base it is mutated into and its two flanking bases. Therefore, for any patient, it is possible to represent all the point mutations occurring in that patient's tumor as a vector of length 96, where each element represents the count of mutations for a given category in the patient. A mutational signature represents the pattern of mutations produced by a mutagen or mutagenic process inside the cell. Each signature can also be represented by a vector of length 96, where each element represents the probability that this particular mutagenic process generates a mutation of the 96 above mentioned categories. In this R package, we provide a set of functions to extract and visualize the mutational signatures that best explain the mutation counts of a large number of patients.

Maintained by Luca De Sano. Last updated 8 days ago.

biomedicalinformatics somaticmutation

11 stars 6.42 score 4 scripts
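
The 96-category representation described in the SparseSignatures entry above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the common convention of reporting each substitution from the pyrimidine (C or T) strand, which is how six substitution types times sixteen flanking contexts give 96 categories; the function names `mutation_categories` and `count_vector` and the `A[C>T]G` label format are hypothetical and are not part of the SparseSignatures API.

```python
from itertools import product

BASES = "ACGT"

# Assumption: substitutions are reported from the pyrimidine strand,
# so the mutated base is C or T, giving 2 * 3 = 6 substitution types.
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]


def mutation_categories():
    """Return the 96 categories as labels such as 'A[C>T]G'.

    Each label encodes the 5' flanking base, the substitution, and the
    3' flanking base: 6 substitutions * 4 * 4 contexts = 96 labels.
    """
    return [
        f"{five}[{ref}>{alt}]{three}"
        for (ref, alt), five, three in product(SUBSTITUTIONS, BASES, BASES)
    ]


def count_vector(mutations):
    """Count a patient's mutations (given as labels) into a length-96 vector."""
    index = {cat: i for i, cat in enumerate(mutation_categories())}
    counts = [0] * len(index)
    for cat in mutations:
        counts[index[cat]] += 1
    return counts


categories = mutation_categories()
# Hypothetical toy input: three point mutations from one patient.
example = count_vector(["A[C>T]G", "A[C>T]G", "T[T>G]C"])
```

A signature, in the same representation, would be a length-96 vector of probabilities over `categories` rather than counts.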

bioc

YAPSA:Yet Another Package for Signature Analysis

This package provides functions and routines for supervised analyses of mutational signatures (i.e., the signatures have to be known, cf. L. Alexandrov et al., Nature 2013 and L. Alexandrov et al., bioRxiv 2018). In particular, the family of functions LCD (LCD = linear combination decomposition) can use optimal signature-specific cutoffs which take care of the different detectability of the different signatures. Moreover, the package provides different sets of mutational signatures, including the COSMIC and PCAWG SNV signatures and the PCAWG Indel signatures; the latter implies that, with YAPSA, the concept of supervised analysis of mutational signatures is extended to Indel signatures. YAPSA also provides confidence intervals as computed by profile likelihoods and can perform signature analysis on a stratified mutational catalogue (SMC = stratify mutational catalogue) in order to analyze enrichment and depletion patterns for the signatures in different strata.

Maintained by Zuguang Gu. Last updated 5 months ago.

sequencing dnaseq somaticmutation visualization clustering genomicvariation statisticalmethod biologicalquestion

6.41 score 57 scripts

bioc

SpliceWiz:interactive analysis and visualization of alternative splicing in R

The analysis and visualization of alternative splicing (AS) events from RNA sequencing data remains challenging. SpliceWiz is a user-friendly and performance-optimized R package for AS analysis that processes alignment BAM files to quantify read counts across splice junctions, performs IRFinder-based intron retention quantitation, and supports novel splicing event identification. We introduce a novel visualization for AS using normalized coverage, thereby allowing visualization of differential AS across conditions. SpliceWiz features a shiny-based GUI facilitating interactive data exploration of results including gene ontology enrichment. It is performance optimized with multi-threaded processing of BAM files and a new COV file format for fast recall of sequencing coverage. Overall, SpliceWiz streamlines AS analysis, enabling reliable identification of functionally relevant AS events for further characterization.

Maintained by Alex Chit Hei Wong. Last updated 20 days ago.

software transcriptomics rnaseq alternativesplicing coverage differentialsplicing differentialexpression gui sequencing cpp openmp

16 stars 6.41 score 8 scripts

bioc

dmrseq:Detection and inference of differentially methylated regions from Whole Genome Bisulfite Sequencing

This package implements an approach for scanning the genome to detect and perform accurate inference on differentially methylated regions from Whole Genome Bisulfite Sequencing data. The method is based on comparing detected regions to a pooled null distribution, which can be implemented even when as few as two samples per population are available. Region-level statistics are obtained by fitting a generalized least squares (GLS) regression model with a nested autoregressive correlated error structure for the effect of interest on transformed methylation proportions.

Maintained by Keegan Korthauer. Last updated 5 months ago.

immunooncology dnamethylation epigenetics multiplecomparison software sequencing differentialmethylation wholegenome regression functionalgenomics

6.39 score 59 scripts 1 dependents

bioc

RNAmodR:Detection of post-transcriptional modifications in high throughput sequencing data

RNAmodR provides classes and workflows for loading and aggregating data from high-throughput sequencing aimed at detecting post-transcriptional modifications through analysis of specific patterns. In addition, utilities are provided to validate and visualize the results. The RNAmodR package provides a core functionality from which specific analysis strategies can be easily implemented as a separate package.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

software infrastructure workflowstep visualization sequencing alkanilineseq bioconductor modifications ribomethseq rna rnamodr

3 stars 6.39 score 9 scripts 3 dependents

bioc

gwascat:representing and modeling data in the EMBL-EBI GWAS catalog

Represent and model data in the EMBL-EBI GWAS catalog.

Maintained by VJ Carey. Last updated 6 days ago.

genetics

6.35 score 110 scripts 2 dependents

bioc

StructuralVariantAnnotation:Variant annotations for structural variants

StructuralVariantAnnotation provides a framework for analysis of structural variants within the Bioconductor ecosystem. This package contains useful helper functions for dealing with structural variants in VCF format. The package contains functions for parsing VCFs from a number of popular callers as well as functions for dealing with breakpoints involving two separate genomic loci encoded as GRanges objects.

Maintained by Daniel Cameron. Last updated 5 months ago.

dataimport sequencing annotation genetics variantannotation

6.26 score 102 scripts 2 dependents

bioc

CopyNumberPlots:Create Copy-Number Plots using karyoploteR functionality

CopyNumberPlots has a set of functions extending karyoploteR's functionality to create beautiful, customizable and flexible plots of copy-number related data.

Maintained by Bernat Gel. Last updated 5 months ago.

visualization copynumbervariation coverage onechannel dataimport sequencing dnaseq bioconductor bioconductor-package bioinformatics copy-number-variation genomics genomics-visualization

6 stars 6.24 score 16 scripts 2 dependents

bioc

RAIDS:Accurate Inference of Genetic Ancestry from Cancer Sequences

This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.

Maintained by Pascal Belleau. Last updated 5 months ago.

genetics software sequencing wholegenome principalcomponent geneticvariability dimensionreduction biocviews ancestry cancer-genomics exome-sequencing genomics inference r-language rna-seq rna-sequencing whole-genome-sequencing

5 stars 6.23 score 19 scripts

bioc

ReportingTools:Tools for making reports in various formats

The ReportingTools software package enables users to easily display reports of analysis results generated from sources such as microarray and sequencing data. The package allows users to create HTML pages that may be viewed on a web browser such as Safari, or in other formats readable by programs such as Excel. Users can generate tables with sortable and filterable columns, make and display plots, and link table entries to other data sources such as NCBI or larger plots within the HTML page. Using the package, users can also produce a table of contents page to link various reports together for a particular project that can be viewed in a web browser. For more examples, please visit our site: http://research-pub.gene.com/ReportingTools.

Maintained by Jason A. Hackney. Last updated 5 months ago.

immunooncology software visualization microarray rnaseq go datarepresentation genesetenrichment

6.23 score 93 scripts 1 dependents

bioc

MungeSumstats:Standardise summary statistics from GWAS

The *MungeSumstats* package is designed to facilitate the standardisation of GWAS summary statistics. It reformats inputted summary statistics to include SNP, CHR, BP and can look up these values if any are missing. It also performs dozens of QC and filtering steps to ensure high data quality and minimise inter-study differences.

Maintained by Alan Murphy. Last updated 4 months ago.

snp wholegenome genetics comparativegenomics genomewideassociation genomicvariation preprocessing

3 stars 6.23 score 91 scripts

bioc

VariantFiltering:Filtering of coding and non-coding genetic variants

Filter genetic variants using different criteria such as inheritance model, amino acid change consequence, minor allele frequencies across human populations, splice site strength, conservation, etc.

Maintained by Robert Castelo. Last updated 2 months ago.

genetics homo_sapiens annotation snp sequencing highthroughputsequencing

4 stars 6.23 score 21 scripts

bioc

scruff:Single Cell RNA-Seq UMI Filtering Facilitator (scruff)

A pipeline which processes single cell RNA-seq (scRNA-seq) reads from CEL-seq and CEL-seq2 protocols. It demultiplexes scRNA-seq FASTQ files, aligns reads to a reference genome using Rsubread, and generates a UMI-filtered count matrix. It also provides visualizations of read alignments and pre- and post-alignment QC metrics.

Maintained by Zhe Wang. Last updated 5 months ago.

software technology sequencing alignment rnaseq singlecell workflowstep preprocessing qualitycontrol visualization immunooncology bioinformatics scrna-seq single-cell umi

8 stars 6.20 score 22 scripts

bioc

cfDNAPro:cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA

cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size, and fragment end motif etc. Analyzing and visualizing fragment size metrics, as well as other biological features in a curated, standardized, scalable, well-documented, and reproducible way might be time intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.

Maintained by Haichao Wang. Last updated 5 months ago.

visualization sequencing wholegenome bioinformatics cancer-genomics cancer-research cell-free-dna early-detection genomics-visualization liquid-biopsy swgs whole-genome-sequencing

29 stars 6.18 score 13 scripts

bioc

crisprViz:Visualization Functions for CRISPR gRNAs

Provides functionalities to visualize and contextualize CRISPR guide RNAs (gRNAs) on genomic tracks across nucleases and applications. Works in conjunction with the crisprBase and crisprDesign Bioconductor packages. Plots are produced using the Gviz framework.

Maintained by Jean-Philippe Fortin. Last updated 5 months ago.

crispr functionalgenomics genetarget bioconductor bioconductor-package crispr-analysis crispr-design grna grna-sequence grna-sequences sgrna sgrna-design visualization

8 stars 6.16 score 6 scripts 2 dependents

bioc

RCAS:RNA Centric Annotation System

RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.

Maintained by Bora Uyar. Last updated 5 months ago.

software genetarget motifannotation motifdiscovery go transcriptomics genomeannotation genesetenrichment coverage

6.14 score 29 scripts 1 dependents

bioc

CAGEr:Analysis of CAGE (Cap Analysis of Gene Expression) sequencing data for precise mapping of transcription start sites and promoterome mining

The _CAGEr_ package identifies transcription start sites (TSS) and their usage frequency from CAGE (Cap Analysis Gene Expression) sequencing data. It normalises raw CAGE tag count, clusters TSSs into tag clusters (TC) and aggregates them across multiple CAGE experiments to construct consensus clusters (CC) representing the promoterome. CAGEr provides functions to profile expression levels of these clusters by cumulative expression and rarefaction analysis, and outputs the plots in ggplot2 format for further facetting and customisation. After clustering, CAGEr performs analyses of promoter width and detects differential usage of TSSs (promoter shifting) between samples. CAGEr also exports its data as genome browser tracks, and as R objects for downstream expression analysis by other Bioconductor packages such as DESeq2, CAGEfightR, or seqArchR.

Maintained by Charles Plessy. Last updated 5 months ago.

preprocessing sequencing normalization functionalgenomics transcription geneexpression clustering visualization

6.12 score 73 scripts

bioc

esATAC:An Easy-to-use Systematic pipeline for ATACseq data analysis

This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq reads. It covers preprocessing of raw sequencing reads (FASTQ files), read alignment (Rbowtie2), operations on aligned read files (SAM, BAM, and BED files), peak calling (F-seq), genome annotations (Motif, GO, SNP analysis) and a quality control report. The package is managed by a dataflow graph, which makes it easy for users to pass variables seamlessly between processes and to understand the workflow. Users can process FASTQ files through the end-to-end preset pipeline, which produces an HTML report of quality control and preliminary statistical results, or customize the workflow starting from any intermediate stage with esATAC functions.

Maintained by Zheng Wei. Last updated 5 months ago.

immunooncology sequencing dnaseq qualitycontrol alignment preprocessing coverage atacseq dnaseseq atac-seq bioconductor pipeline cpp openjdk

23 stars 6.11 score 3 scripts

bioc

BSgenomeForge:Forge your own BSgenome data package

A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure datarepresentation genomeassembly annotation genomeannotation sequencing alignment dataimport sequencematching bioconductor-package core-package

4 stars 6.08 score 6 scripts

bioc

affycoretools:Functions useful for those doing repetitive analyses with Affymetrix GeneChips

Various wrapper functions that have been written to streamline the more common analyses that a core Biostatistician might see.

Maintained by James W. MacDonald. Last updated 5 months ago.

reportwriting microarray onechannel geneexpression

6.07 score 117 scripts

bioc

metaseqR2:An R package for the analysis and result reporting of RNA-Seq data by combining multiple statistical algorithms

Provides an interface to several normalization and statistical testing packages for RNA-Seq gene expression data. Additionally, it creates several diagnostic plots, performs meta-analysis by combining the results of several statistical tests and reports the results in an interactive way.

Maintained by Panagiotis Moulos. Last updated 21 days ago.

software geneexpression differentialexpression workflowstep preprocessing qualitycontrol normalization reportwriting rnaseq transcription sequencing transcriptomics bayesian clustering cellbiology biomedicalinformatics functionalgenomics systemsbiology immunooncology alternativesplicing differentialsplicing multiplecomparison timecourse dataimport atacseq epigenetics regression proprietaryplatforms genesetenrichment batcheffect chipseq

7 stars 6.05 score 3 scripts

bioc

Rqc:Quality Control Tool for High-Throughput Sequencing Data

Rqc is an optimised tool designed for quality control and assessment of high-throughput sequencing data. It performs parallel processing of entire files and produces a report which contains a set of high-resolution graphics.

Maintained by Welliton Souza. Last updated 5 months ago.

sequencing qualitycontrol dataimport cpp

6.00 score 67 scripts

bioc

EventPointer:An effective identification of alternative splicing events using junction arrays and RNA-Seq data

EventPointer is an R package to identify alternative splicing events that involve either simple (case-control experiment) or complex experimental designs such as time course experiments and studies including paired samples. The algorithm can be used to analyze data from either junction arrays (Affymetrix Arrays) or sequencing data (RNA-Seq). The software returns a data.frame with the detected alternative splicing events: gene name, type of event (cassette, alternative 3', etc.), genomic position, statistical significance and increment of the percent spliced in (Delta PSI) for all the events. The algorithm can generate a series of files to visualize the detected alternative splicing events in IGV. This eases the interpretation of results and the design of primers for standard PCR validation.

Maintained by Juan A. Ferrer-Bonsoms. Last updated 5 months ago.

alternativesplicing differentialsplicing mrnamicroarray rnaseq transcription sequencing timecourse immunooncology

4 stars 6.00 score 6 scripts

bioc

biscuiteer:Convenience Functions for Biscuit

A test harness for bsseq loading of Biscuit output, summarization of WGBS data over defined regions and in mappable samples, with or without imputation, dropping of mostly-NA rows, age estimates, etc.

Maintained by Jacob Morrison. Last updated 5 months ago.

dataimport methylseq dnamethylation

6 stars 5.98 score 16 scripts

bioc

kissDE:Retrieves Condition-Specific Variants in RNA-Seq Data

Retrieves condition-specific variants in RNA-seq data (SNVs, alternative splicing events, indels). It has been developed as a post-treatment of 'KisSplice' but can also be used with the user's own data.

Maintained by Aurélie Siberchicot. Last updated 5 months ago.

alternativesplicing differentialsplicing experimentaldesign genomicvariation rnaseq transcriptomics

3 stars 5.98 score 7 scripts

bioc

REMP:Repetitive Element Methylation Prediction

Machine learning-based tools to predict DNA methylation of locus-specific repetitive elements (RE) by learning surrounding genetic and epigenetic information. These tools provide genome-wide and single-base resolution of DNA methylation prediction on RE that are difficult to measure using array-based or sequencing-based platforms, which enables epigenome-wide association study (EWAS) and differentially methylated region (DMR) analysis on RE.

Maintained by Yinan Zheng. Last updated 5 months ago.

dnamethylation microarray methylationarray sequencing genomewideassociation epigenetics preprocessing multichannel twochannel differentialmethylation qualitycontrol dataimport

2 stars 5.94 score 18 scripts

bioc

SCOPE:A normalization and copy number estimation method for single-cell DNA sequencing

Whole genome single-cell DNA sequencing (scDNA-seq) enables characterization of copy number profiles at the cellular level. This circumvents the averaging effects associated with bulk-tissue sequencing and has increased resolution yet decreased ambiguity in deconvolving cancer subclones and elucidating cancer evolutionary history. ScDNA-seq data is, however, sparse, noisy, and highly variable even within a homogeneous cell population, due to the biases and artifacts that are introduced during the library preparation and sequencing procedure. Here, we propose SCOPE, a normalization and copy number estimation method for scDNA-seq data. The distinguishing features of SCOPE include: (i) utilization of cell-specific Gini coefficients for quality controls and for identification of normal/diploid cells, which are further used as negative control samples in a Poisson latent factor model for normalization; (ii) modeling of GC content bias using an expectation-maximization algorithm embedded in the Poisson generalized linear models, which accounts for the different copy number states along the genome; (iii) a cross-sample iterative segmentation procedure to identify breakpoints that are shared across cells from the same genetic background.

Maintained by Rujin Wang. Last updated 5 months ago.

singlecell normalization copynumbervariation sequencing wholegenome coverage alignment qualitycontrol dataimport dnaseq

5.92 score 84 scripts

bioc

cummeRbund:Analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data.

Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.

Maintained by Loyal A. Goff. Last updated 5 months ago.

highthroughputsequencing highthroughputsequencingdata rnaseq rnaseqdata geneexpression differentialexpression infrastructure dataimport datarepresentation visualization bioinformatics clustering multiplecomparisons qualitycontrol

5.92 score 209 scripts

bioc

BUSpaRse:kallisto | bustools R utilities

The kallisto | bustools pipeline is a fast and modular set of tools to convert single cell RNA-seq reads in fastq files into gene count or transcript compatibility counts (TCC) matrices for downstream analysis. Central to this pipeline is the barcode, UMI, and set (BUS) file format. This package serves the following purposes: First, this package allows users to manipulate BUS format files as data frames in R and then convert them into gene count or TCC matrices. Furthermore, since R and Rcpp code is easier to handle than pure C++ code, users are encouraged to tweak the source code of this package to experiment with new uses of BUS format and different ways to convert the BUS file into gene count matrix. Second, this package can conveniently generate files required to generate gene count matrices for spliced and unspliced transcripts for RNA velocity. Here biotypes can be filtered and scaffolds and haplotypes can be removed, and the filtered transcriptome can be extracted and written to disk. Third, this package implements utility functions to get transcripts and associated genes required to convert BUS files to gene count matrices, to write the transcript to gene information in the format required by bustools, and to read output of bustools into R as sparse matrices.

Maintained by Lambda Moses. Last updated 5 months ago.

singlecell rnaseq workflowstep cpp

9 stars 5.87 score 165 scripts

bioc

APAlyzer:A toolkit for APA analysis using RNA-seq data

Perform 3'UTR APA, Intronic APA and gene expression analysis using RNA-seq data.

Maintained by Ruijia Wang. Last updated 5 months ago.

sequencing rnaseq differentialexpression geneexpression generegulation annotation dataimport software ative-polyadenylation bioinformatics-tool rna-seq

9 stars 5.86 score 9 scripts

bioc

crisprBowtie:Bowtie-based alignment of CRISPR gRNA spacer sequences

Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bowtie. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Both DNA- and RNA-targeting nucleases are supported.

Maintained by Jean-Philippe Fortin. Last updated 5 months ago.

crispr functionalgenomics alignment aligner bioconductor bioconductor-package bowtie crispr-analysis crispr-cas9 crispr-design crispr-target grna grna-sequence grna-sequences sgrna sgrna-design

3 stars 5.86 score 7 scripts 4 dependents

bioc

raer:RNA editing tools in R

Toolkit for identification and statistical testing of RNA editing signals from within R. Provides support for identifying sites from bulk-RNA and single cell RNA-seq datasets, and general methods for extraction of allelic read counts from alignment files. Facilitates annotation and exploratory analysis of editing signals using Bioconductor packages and resources.

Maintained by Kent Riemondy. Last updated 5 months ago.

multiplecomparison rnaseq singlecell sequencing coverage epitranscriptomics featureextraction annotation alignment bioconductor-package rna-seq-analysis single-cell-analysis single-cell-rna-seq curl bzip2 xz-utils zlib

8 stars 5.81 score 6 scripts

bioc

cicero:Predict cis-co-accessibility from single-cell chromatin accessibility data

Cicero computes putative cis-regulatory maps from single-cell chromatin accessibility data. It also extends monocle 2 for use in chromatin accessibility data.

Maintained by Hannah Pliner. Last updated 5 months ago.

sequencing clustering cellbasedassays immunooncology generegulation genetarget epigenetics atacseq singlecell

5.80 score 312 scripts

bioc

GenomicPlot:Plot profiles of next generation sequencing data in genomic features

Visualization of next generation sequencing (NGS) data is essential for interpreting high-throughput genomics experiment results. 'GenomicPlot' facilitates plotting of NGS data in various formats (bam, bed, wig and bigwig); both coverage and enrichment over input can be computed and displayed with respect to genomic features (such as UTR, CDS, enhancer), and user defined genomic loci or regions. Statistical tests on signal intensity within user defined regions of interest can be performed and represented as boxplots or bar graphs. Parallel processing is used to speed up computation on multicore platforms. In addition to genomic plots, which are suitable for displaying coverage of genomic DNA (such as ChIPseq data), metagenomic (without introns) plots can also be made for RNAseq or CLIPseq data.

Maintained by Shuye Pu. Last updated 2 months ago.

alternativesplicing chipseq coverage geneexpression rnaseq sequencing software transcription visualization annotation

5 stars 5.78 score 4 scripts

bioc

circRNAprofiler:circRNAprofiler: An R-Based Computational Framework for the Downstream Analysis of Circular RNAs

R-based computational framework for a comprehensive in silico analysis of circRNAs. This computational framework allows users to combine and analyze circRNAs previously detected by multiple publicly available annotation-based circRNA detection tools. It covers different aspects of circRNA analysis, from differential expression analysis, evolutionary conservation, and biogenesis to functional analysis.

Maintained by Simona Aufiero. Last updated 5 months ago.

annotation structuralprediction functionalprediction geneprediction genomeassembly differentialexpression

10 stars 5.78 score 5 scripts

bioc

TVTB:TVTB: The VCF Tool Box

The package provides S4 classes and methods to filter, summarise and visualise genetic variation data stored in VCF files. In particular, the package extends the FilterRules class (S4Vectors package) to define new classes of filter rules applicable to the various slots of VCF objects. Functionalities are integrated and demonstrated in a Shiny web-application, the Shiny Variant Explorer (tSVE).

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software genetics geneticvariability genomicvariation datarepresentation gui dnaseq wholegenome visualization multiplecomparison dataimport variantannotation sequencing coverage alignment sequencematching

2 stars 5.76 score 16 scripts

bioc

atSNP:Affinity test for identifying regulatory SNPs

atSNP performs affinity tests of motif matches with the SNP or the reference genomes and SNP-led changes in motif matches.

Maintained by Sunyoung Shin. Last updated 5 months ago.

software chipseq genomeannotation motifannotation visualization cpp

1 stars 5.73 score 36 scripts

bioc

Repitools:Epigenomic tools

Tools for the analysis of enrichment-based epigenomic data. Features include summarization and visualization of epigenomic data across promoters according to gene expression context, finding regions of differential methylation/binding, BayMeth for quantifying methylation etc.

Maintained by Mark Robinson. Last updated 5 months ago.

dnamethylation geneexpression methylseq

5.73 score 267 scripts

bioc

DAMEfinder:Finds DAMEs - Differential Allelicly MEthylated regions

'DAMEfinder' offers functionality for taking methtuple or bismark outputs to calculate ASM scores and compute DAMEs. It also offers nice visualization of methyl-circle plots.

Maintained by Stephany Orjuela. Last updated 5 months ago.

dnamethylation differentialmethylation coverage

10 stars 5.70 score 9 scripts

mrcieu

gwasvcf:Tools for Dealing with GWAS Summary Data in VCF Format

Tools for dealing with GWAS summary data in VCF format. Includes reading, querying, writing, as well as helper functions such as LD proxy searches.

Maintained by Gibran Hemani. Last updated 2 years ago.

77 stars 5.65 score 129 scripts 1 dependents

jamesdalg

CNVScope:A Versatile Toolkit for Copy Number Variation Relationship Data Analysis and Visualization

Provides the ability to create interaction maps, discover CNV map domains (edges), gene annotate interactions, and create interactive visualizations of these CNV interaction maps.

Maintained by James Dalgleish. Last updated 3 years ago.

8 stars 5.58 score 24 scripts

bioc

diffHic:Differential Analysis of Hi-C Data

Detects differential interactions across biological conditions in a Hi-C experiment. Methods are provided for read alignment and data pre-processing into interaction counts. Statistical analysis is based on edgeR and supports normalization and filtering. Several visualization options are also available.

Maintained by Aaron Lun. Last updated 4 months ago.

multiplecomparison preprocessing sequencing coverage alignment normalization clustering hic curl bzip2 xz-utils zlib cpp

5.58 score 38 scripts

bioc

HiLDA:Conducting statistical inference on comparing the mutational exposures of mutational signatures by using hierarchical latent Dirichlet allocation

A package built under the Bayesian framework of applying hierarchical latent Dirichlet allocation. It statistically tests whether the mutational exposures of mutational signatures (Shiraishi-model signatures) are different between two groups. The package also provides inference and visualization.

Maintained by Zhi Yang. Last updated 5 months ago.

software somaticmutation sequencing statisticalmethod bayesian mutational-signatures rjags somatic-mutations cpp jags

3 stars 5.56 score 7 scripts 1 dependents

bioc

multicrispr:Multi-locus multi-purpose Crispr/Cas design

This package is for designing Crispr/Cas9 and Prime Editing experiments. It contains functions to (1) define and transform genomic targets, (2) find spacers, (3) count offtarget (mis)matches, and (4) compute Doench2016/2014 targeting efficiency. Care has been taken for multicrispr to scale well towards large target sets, enabling the design of large Crispr/Cas9 libraries.

Maintained by Aditya Bhagwat. Last updated 4 months ago.

crispr software

5.56 score 2 scripts

bioc

demuxSNP:scRNAseq demultiplexing using cell hashing and SNPs

This package assists in demultiplexing scRNAseq data using both cell hashing and SNPs data. The SNP profile of each group is learned using high confidence assignments from the cell hashing data. Cells which cannot be assigned with high confidence from the cell hashing data are assigned to their most similar group based on their SNPs. We also provide some helper functions to optimise SNP selection, create training data and merge SNP data into the SingleCellExperiment framework.

Maintained by Michael Lynch. Last updated 5 months ago.

classification singlecell

6 stars 5.52 score 22 scripts

bioc

MotifPeeker:Benchmarking Epigenomic Profiling Methods Using Motif Enrichment

MotifPeeker is used to compare and analyse datasets from epigenomic profiling methods with motif enrichment as the key benchmark. The package outputs an HTML report consisting of three sections: (1. General Metrics) Overview of peaks-related general metrics for the datasets (FRiP scores, peak widths and motif-summit distances). (2. Known Motif Enrichment Analysis) Statistics for the frequency of user-provided motifs enriched in the datasets. (3. De-Novo Motif Enrichment Analysis) Statistics for the frequency of de-novo discovered motifs enriched in the datasets and compared with known motifs.

Maintained by Hiranyamaya Dash. Last updated 3 months ago.

epigenetics genetics qualitycontrol chipseq multiplecomparison functionalgenomics motifdiscovery sequencematching software alignment bioconductor bioconductor-package chip-seq epigenomics interactive-report motif-enrichment-analysis

2 stars 5.48 score 6 scripts

bioc

MethReg:Assessing the regulatory potential of DNA methylation regions or sites on gene transcription

Epigenome-wide association studies (EWAS) detect a large number of DNA methylation differences, often hundreds of differentially methylated regions and thousands of CpGs, that are significantly associated with a disease, many of which are located in non-coding regions. Therefore, there is a critical need to better understand the functional impact of these CpG methylations and to further prioritize the significant changes. MethReg is an R package for integrative modeling of DNA methylation, target gene expression and transcription factor binding sites data, to systematically identify and rank functional CpG methylations. MethReg evaluates, prioritizes and annotates CpG sites with high regulatory potential using matched methylation and gene expression data, along with external TF-target interaction databases based on manual curation, ChIP-seq experiments or gene regulatory network analysis.

Maintained by Tiago Silva. Last updated 5 months ago.

methylationarray regression geneexpression epigenetics genetarget transcription

5 stars 5.45 score 19 scripts

bioc

ChIPQC:Quality metrics for ChIPseq data

Quality metrics for ChIPseq data.

Maintained by Tom Carroll. Last updated 5 months ago.

sequencing chipseq qualitycontrol reportwriting

5.45 score 140 scripts

bioc

R3CPET:3CPET: Finding Co-factor Complexes in Chia-PET experiment using a Hierarchical Dirichlet Process

The package provides a method to infer the set of proteins that are most likely to work together to maintain chromatin interactions, given the results of a ChIA-PET experiment.

Maintained by Mohamed Nadhir Djekidel. Last updated 5 months ago.

networkinference geneprediction bayesian graphandnetwork network geneexpression hic chia-pet chromatin-interaction dirichlet-process-mixtures transcription-facto cpp

4 stars 5.45 score 5 scripts

bioc

CleanUpRNAseq:Detect and Correct Genomic DNA Contamination in RNA-seq Data

RNA-seq data generated by some library preparation methods, such as the rRNA-depletion-based method and the SMART-seq method, might be contaminated by genomic DNA (gDNA) if DNase I digestion is not performed properly during RNA preparation. CleanUpRNAseq is developed to check whether RNA-seq data suffers from gDNA contamination. If so, it can perform correction for gDNA contamination and reduce the false discovery rate of differentially expressed genes.

Maintained by Haibo Liu. Last updated 4 months ago.

qualitycontrol sequencing geneexpression

5 stars 5.44 score 4 scripts

steverozen

ICAMS:In-Depth Characterization and Analysis of Mutational Signatures ('ICAMS')

Analysis and visualization of experimentally elucidated mutational signatures -- the kind of analysis and visualization in Boot et al., "In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors", Genome Research 2018, <doi:10.1101/gr.230219.117> and "Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types", Genome Research 2020 <doi:10.1101/gr.255620.119>. 'ICAMS' stands for In-depth Characterization and Analysis of Mutational Signatures. 'ICAMS' has functions to read in variant call files (VCFs) and to collate the corresponding catalogs of mutational spectra and to analyze and plot catalogs of mutational spectra and signatures. Handles both "counts-based" and "density-based" (i.e. representation as mutations per megabase) mutational spectra or signatures.

Maintained by Steve Rozen. Last updated 3 years ago.

8 stars 5.41 score 128 scripts

bioc

UMI4Cats:UMI4Cats: Processing, analysis and visualization of UMI-4C chromatin contact data

UMI-4C is a technique that allows characterization of 3D chromatin interactions with a bait of interest, taking advantage of a sonication step to produce unique molecular identifiers (UMIs) that help remove duplication bias, thus allowing a better differential comparison of chromatin interactions between conditions. This package allows processing of UMI-4C data, starting from FastQ files provided by the sequencing facility. It provides two statistical methods for detecting differential contacts and includes a visualization function to plot integrated information from a UMI-4C assay.

Maintained by Mireia Ramos-Rodriguez. Last updated 5 months ago.

qualitycontrol preprocessing alignment normalization visualization sequencing coverage chromatin chromatin-interaction genomics umi4c

5 stars 5.40 score 7 scripts

bioc

TAPseq:Targeted scRNA-seq primer design for TAP-seq

Design primers for targeted single-cell RNA-seq used by TAP-seq. Create sequence templates for target gene panels and design gene-specific primers using Primer3. Potential off-targets can be estimated with BLAST. Requires working installations of Primer3 and BLASTn.

Maintained by Andreas R. Gschwind. Last updated 5 months ago.

singlecell sequencing technology crispr pooledscreens

4 stars 5.38 score 9 scripts

bioc

VanillaICE:A Hidden Markov Model for high throughput genotyping arrays

Hidden Markov Models for characterizing chromosomal alteration in high throughput SNP arrays.

Maintained by Robert Scharpf. Last updated 5 months ago.

copynumbervariation

5.36 score 63 scripts 1 dependents

bioc

svaNUMT:NUMT detection from structural variant calls

svaNUMT contains functions for detecting NUMT events from structural variant calls. It takes structural variant calls in GRanges of breakend notation and identifies NUMTs by nuclear-mitochondrial breakend junctions. The main function reports candidate NUMTs if there is a pair of valid insertion sites found on the nuclear genome within a certain distance threshold. The candidate NUMTs are reported by events.

Maintained by Ruining Dong. Last updated 5 months ago.

dataimport sequencing annotation genetics variantannotation

3 stars 5.35 score 6 scripts

bioc

ProteoDisco:Generation of customized protein variant databases from genomic variants, splice-junctions and manual sequences

ProteoDisco is an R package to facilitate proteogenomics studies. It houses functions to create customized (variant) protein databases based on user-submitted genomic variants, splice-junctions, fusion genes and manual transcript sequences. The flexible workflow can be adopted to suit a myriad of research and experimental settings.

Maintained by Job van Riet. Last updated 5 months ago.

software proteomics rnaseq snp sequencing variantannotation dataimport

5 stars 5.30 score 4 scripts

bioc

RnaSeqSampleSize:RnaSeqSampleSize

The RnaSeqSampleSize package provides a sample size calculation method based on the negative binomial model and the exact test for assessing differential expression analysis of RNA-seq data. It controls FDR for multiple testing and utilizes the average read count and dispersion distributions from real data to estimate a more reliable sample size. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.

Maintained by Shilin Zhao Developer. Last updated 5 months ago.

immunooncology experimentaldesign sequencing rnaseq geneexpression differentialexpression cpp

5.30 score 20 scripts

bioc

CODEX:A Normalization and Copy Number Variation Detection Method for Whole Exome Sequencing

A normalization and copy number variation calling procedure for whole exome DNA sequencing data. CODEX relies on the availability of multiple samples processed using the same sequencing pipeline for normalization, and does not require matched controls. The normalization model in CODEX includes terms that specifically remove biases due to GC content, exon length and targeting and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data.

Maintained by Yuchao Jiang. Last updated 5 months ago.

immunooncology exomeseq normalization qualitycontrol copynumbervariation

5.30 score 33 scripts 1 dependents

bioc

DeMixT:Cell type-specific deconvolution of heterogeneous tumor samples with two or three components using expression data from RNAseq or microarray platforms

DeMixT is a software package that performs deconvolution on transcriptome data from a mixture of two or three components.

Maintained by Ruonan Li. Last updated 5 months ago.

software statisticalmethod classification geneexpression sequencing microarray tissuemicroarray coverage cpp openmp

5.27 score 25 scripts

bioc

periodicDNA:Set of tools to identify periodic occurrences of k-mers in DNA sequences

This R package helps the user identify k-mers (e.g. di- or tri-nucleotides) present periodically in a set of genomic loci (typically regulatory elements). The functions of this package provide a straightforward approach to find periodic occurrences of k-mers in DNA sequences, such as regulatory elements. It is not aimed at identifying motifs separated by a conserved distance; for this type of analysis, please visit the MEME website.

Maintained by Jacques Serizay. Last updated 5 months ago.

sequencematching motifdiscovery motifannotation sequencing coverage alignment dataimport

6 stars 5.26 score 5 scripts

bioc

ASpli:Analysis of Alternative Splicing Using RNA-Seq

Integrative pipeline for the analysis of alternative splicing using RNAseq.

Maintained by Ariel Chernomoretz. Last updated 5 months ago.

immunooncology geneexpression transcription alternativesplicing coverage differentialexpression differentialsplicing timecourse rnaseq genomeannotation sequencing alignment

5.21 score 45 scripts 1 dependents

bioc

Damsel:Damsel: an end to end analysis of DamID

Damsel provides an end to end analysis of DamID data. Damsel takes bam files from Dam-only control and fusion samples and counts the reads matching to each GATC region. edgeR is utilised to identify regions of enrichment in the fusion relative to the control. Enriched regions are combined into peaks, and are associated with nearby genes. Damsel allows for IGV style plots to be built as the results build, inspired by ggcoverage, and using the functionality and layering ability of ggplot2. Damsel also conducts gene ontology testing with bias correction through goseq, and future versions of Damsel will also incorporate motif enrichment analysis. Overall, Damsel is the first package allowing for an end to end analysis with visual capabilities. The goal of Damsel was to bring all the analysis into one place, and allow for exploratory analysis within R.

Maintained by Caitlin Page. Last updated 5 months ago.

differentialmethylation peakdetection geneprediction genesetenrichment

5.20 score 20 scripts

bioc

regutools:regutools: an R package for data extraction from RegulonDB

RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.

Maintained by Joselyn Chavez. Last updated 4 months ago.

generegulation geneexpression systemsbiology network networkinference visualization transcription bioconductor cdsb regulondb

4 stars 5.20 score 6 scripts

bioc

RNAAgeCalc:A multi-tissue transcriptional age calculator

It has been shown that both DNA methylation and RNA transcription are linked to chronological age and age related diseases. Several estimators have been developed to predict human aging at the DNA level and the RNA level. Most of the human transcriptional age predictors are based on microarray data and limited to only a few tissues. To date, transcriptional studies on aging using RNASeq data from different human tissues are limited. The aim of this package is to provide a tool for across-tissue and tissue-specific transcriptional age calculation based on GTEx RNASeq data.

Maintained by Xu Ren. Last updated 5 months ago.

rnaseq geneexpression biological-age elastic-net gene-expression genotype-tissue-expression prediction regularized-regression rna-seq

8 stars 5.20 score 10 scripts

bioc

methylCC:Estimate the cell composition of whole blood in DNA methylation samples

A tool to estimate the cell composition of DNA methylation whole blood samples measured on any platform technology (microarray and sequencing).

Maintained by Stephanie C. Hicks. Last updated 5 months ago.

microarray sequencing dnamethylation methylationarray methylseq wholegenome

19 stars 5.18 score 8 scripts

bioc

CNVfilteR:Identifies false positives of CNV calling tools by using SNV calls

CNVfilteR identifies those CNVs that can be discarded by using the single nucleotide variant (SNV) calls that are usually obtained in common NGS pipelines.

Maintained by Jose Marcos Moreno-Cabrera. Last updated 5 months ago.

copynumbervariation sequencing dnaseq visualization dataimport

5 stars 5.18 score 1 scripts

bioc

MEDIPS:DNA IP-seq data analysis

MEDIPS was developed for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, MEDIPS provides functionalities for the analysis of any kind of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential coverage between groups of samples and saturation and correlation analysis.

Maintained by Lukas Chavez. Last updated 5 months ago.

dnamethylation cpgisland differentialexpression sequencing chipseq preprocessing qualitycontrol visualization microarray genetics coverage genomeannotation copynumbervariation sequencematching

5.17 score 74 scripts

bioc

DuplexDiscovereR:Analysis of the data from RNA duplex probing experiments

DuplexDiscovereR is a package designed for analyzing data from RNA cross-linking and proximity ligation protocols such as SPLASH, PARIS, LIGR-seq, and others. DuplexDiscovereR accepts input in the form of chimerically or split-aligned reads. It includes procedures for alignment classification, filtering, and efficient clustering of individual chimeric reads into duplex groups (DGs). Once DGs are identified, the package predicts RNA duplex formation and their hybridization energies. Additional metrics, such as p-values for random ligation hypothesis or mean DG alignment scores, can be calculated to rank final set of RNA duplexes. Data from multiple experiments or replicates can be processed separately and further compared to check the reproducibility of the experimental method.

Maintained by Egor Semenchenko. Last updated 9 days ago.

sequencing transcriptomics structuralprediction clustering splicedalignment

2 stars 5.15 score 5 scripts

bioc

lineagespot:Detection of SARS-CoV-2 lineages in wastewater samples using next-generation sequencing

Lineagespot is a framework written in R that aims to identify SARS-CoV-2 related mutations based on a single variant file or a list of variant files (i.e., variant call format). The method can facilitate the detection of SARS-CoV-2 lineages in wastewater samples using next generation sequencing, and attempts to infer the potential distribution of the SARS-CoV-2 lineages.

Maintained by Nikolaos Pechlivanis. Last updated 5 months ago.

variantdetection variantannotation sequencing

2 stars 5.15 score 4 scripts

bioc

crisprVerse:Easily install and load the crisprVerse ecosystem for CRISPR gRNA design

The crisprVerse is a modular ecosystem of R packages developed for the design and manipulation of CRISPR guide RNAs (gRNAs). All packages share a common language and design principles. This package is designed to make it easy to install and load the crisprVerse packages in a single step. To learn more about the crisprVerse, visit <https://www.github.com/crisprVerse>.

Maintained by Jean-Philippe Fortin. Last updated 5 months ago.

crispr functionalgenomics genetarget crispr-analysis crispr-design crispr-target grna grna-sequence grna-sequences

13 stars 5.11 score 8 scripts

bioc

geomeTriD:An R/Bioconductor package for interactive 3D plot of epigenetic data or single cell data

geomeTriD (Three Dimensional Geometry Package) creates interactive 3D plots using the GL library with the 'three.js' visualization library (https://threejs.org) or the rgl library. In addition to creating interactive 3D plots, the application also generates simplified models in 2D. These 2D models provide a more straightforward visual representation, making it easier to analyze and interpret the data quickly. This functionality ensures that users have access to both detailed three-dimensional visualizations and more accessible two-dimensional views, catering to various analytical needs.

Maintained by Jianhong Ou. Last updated 2 months ago.

visualization

1 stars 5.10 score 7 scripts

bioc

icetea:Integrating Cap Enrichment with Transcript Expression Analysis

icetea (Integrating Cap Enrichment with Transcript Expression Analysis) provides functions for end-to-end analysis of multiple 5'-profiling methods such as CAGE, RAMPAGE and MAPCap, beginning from raw reads to detection of transcription start sites using replicates. It also allows performing differential TSS detection between groups of samples, thereby integrating the mRNA cap enrichment information with transcript expression analysis.

Maintained by Vivek Bhardwaj. Last updated 5 months ago.

immunooncology transcription geneexpression sequencing rnaseq transcriptomics differentialexpression cage expression rna-seq

2 stars 5.08 score 7 scripts

bioc

AllelicImbalance:Investigates Allele Specific Expression

Provides a framework for allele-specific expression investigation using RNA-seq data.

Maintained by Jesper R Gadin. Last updated 5 months ago.

genetics infrastructure sequencing

5.08 score 7 scripts

bioc

podkat:Position-Dependent Kernel Association Test

This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.

Maintained by Ulrich Bodenhofer. Last updated 5 months ago.

genetics wholegenome annotation variantannotation sequencing dataimport curl bzip2 xz-utils zlib cpp

5.02 score 6 scripts

bioc

katdetectr:Detection, Characterization and Visualization of Kataegis in Sequencing Data

Kataegis refers to the occurrence of regional hypermutation and is a phenomenon observed in a wide range of malignancies. Using changepoint detection, katdetectr aims to identify putative kataegis foci from common data-formats housing genomic variants. Katdetectr has been shown to be a robust package for the detection, characterization and visualization of kataegis.

Maintained by Daan Hazelaar. Last updated 5 months ago.

wholegenome software snp sequencing classification variantannotation

5 stars 5.00 score 4 scripts

bioc

derfinderPlot:Plotting functions for derfinder

This package provides plotting functions for results from the derfinder package. This helps separate the graphical dependencies required for making these plots from the core functionality of derfinder.

Maintained by Leonardo Collado-Torres. Last updated 4 months ago.

differentialexpression sequencing rnaseq software visualization immunooncology bioconductor derfinder

2 stars 5.00 score 5 scripts

bioc

gcapc:GC Aware Peak Caller

Peak calling for ChIP-seq data with consideration of potential GC bias in sequencing reads. GC bias is first estimated with generalized linear mixture models using an effective GC strategy, then applied to peak significance estimation.

Maintained by Mingxiang Teng. Last updated 5 months ago.

sequencing chipseq batcheffect peakdetection

9 stars 4.95 score 7 scripts

bioc

tadar:Transcriptome Analysis of Differential Allelic Representation

This package provides functions to standardise the analysis of Differential Allelic Representation (DAR). DAR compromises the integrity of Differential Expression analysis results as it can bias expression, influencing the classification of genes (or transcripts) as being differentially expressed. DAR analysis results in an easy-to-interpret value between 0 and 1 for each genetic feature of interest, where 0 represents identical allelic representation and 1 represents complete diversity. This metric can be used to identify features prone to false-positive calls in Differential Expression analysis, and can be leveraged with statistical methods to alleviate the impact of such artefacts on RNA-seq data.

Maintained by Lachlan Baer. Last updated 2 months ago.

sequencing rnaseq snp genomicvariation variantannotation differentialexpression

1 stars 4.95 score 4 scripts

bioc

tRNAscanImport:Importing a tRNAscan-SE result file as GRanges object

The package imports the result of tRNAscan-SE as a GRanges object.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

software dataimport workflowstep preprocessing visualization bioconductor sequences structures trna trnascan trnascan-se

2 stars 4.95 score 3 scripts

bioc

rGADEM:de novo motif discovery

rGADEM is an efficient de novo motif discovery tool for large-scale genomic sequence data. It is an open-source R package, which is based on the GADEM software.

Maintained by Arnaud Droit. Last updated 5 months ago.

microarray chipchip sequencing chipseq motifdiscovery openmp

4.95 score 56 scripts

bioc

epigraHMM:Epigenomic R-based analysis with hidden Markov models

epigraHMM provides a set of tools for the analysis of epigenomic data based on hidden Markov Models. It contains two separate peak callers, one for consensus peaks from biological or technical replicates, and one for differential peaks from multi-replicate multi-condition experiments. In differential peak calling, epigraHMM provides window-specific posterior probabilities associated with every possible combinatorial pattern of read enrichment across conditions.

Maintained by Pedro Baldoni. Last updated 5 months ago.

chipseq atacseq dnaseseq hiddenmarkovmodel epigenetics zlib openblas cpp openmp

4.94 score 88 scripts

bioc

GreyListChIP:Grey Lists -- Mask Artefact Regions Based on ChIP Inputs

Identify regions of ChIP experiments with high signal in the input, that lead to spurious peaks during peak calling. Remove reads aligning to these regions prior to peak calling, for cleaner ChIP analysis.

Maintained by Matt Eldridge. Last updated 5 months ago.

chipseq alignment preprocessing differentialpeakcalling sequencing genomeannotation coverage

4.93 score 10 scripts 4 dependents

bioc

CNVrd2:CNVrd2: a read depth-based method to detect and genotype complex common copy number variants from next generation sequencing data.

CNVrd2 uses next-generation sequencing data to measure human gene copy number for multiple samples, identify SNPs tagging copy number variants and detect copy number polymorphic genomic regions.

Maintained by Hoang Tan Nguyen. Last updated 5 months ago.

copynumbervariation snp sequencing software coverage linkagedisequilibrium clustering.jags cpp

3 stars 4.92 score

bioc

myvariant:Accesses MyVariant.info variant query and annotation services

MyVariant.info is a comprehensive aggregation of variant annotation resources. myvariant is a wrapper for querying MyVariant.info services.

Maintained by Adam Mark, Chunlei Wu. Last updated 5 months ago.

variantannotation annotation genomicvariation

4.92 score 28 scripts

bioc

spiky:Spike-in calibration for cell-free MeDIP

spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.

Maintained by Tim Triche. Last updated 5 months ago.

differentialmethylation dnamethylation normalization preprocessing qualitycontrol sequencing

2 stars 4.90 score 3 scripts

bioc

gDNAx:Diagnostics for assessing genomic DNA contamination in RNA-seq data

Provides diagnostics for assessing genomic DNA contamination in RNA-seq data, as well as plots representing these diagnostics. Moreover, the package can be used to get an insight into the strand library protocol used and, in case of strand-specific libraries, the strandedness of the data. Furthermore, it provides functionality to filter out reads of potential gDNA origin.

Maintained by Robert Castelo. Last updated 2 months ago.

transcription transcriptomics rnaseq sequencing preprocessing software geneexpression coverage differentialexpression functionalgenomics splicedalignment alignment

1 stars 4.90 score 3 scripts

bioc

ribosomeProfilingQC:Ribosome Profiling Quality Control

Ribo-Seq (also named ribosome profiling or footprinting) measures the translatome (unlike RNA-Seq, which sequences the transcriptome) by direct quantification of the ribosome-protected fragments (RPFs). This package provides the tools for quality assessment of ribosome profiling. In addition, it can preprocess Ribo-Seq data for subsequent differential analysis.

Maintained by Jianhong Ou. Last updated 2 months ago.

riboseq sequencing generegulation qualitycontrol visualization coverage

4.88 score 17 scripts

bioc

MethylSeekR:Segmentation of Bis-seq data

This is a package for the discovery of regulatory regions from Bis-seq data

Maintained by Lukas Burger. Last updated 5 months ago.

sequencing methylseq dnamethylation

4.83 score 34 scripts

bioc

TFutils:TFutils

This package helps users to work with TF metadata from various sources. Significant catalogs of TFs and classifications thereof are made available. Tools for working with motif scans are also provided.

Maintained by Vincent Carey. Last updated 5 months ago.

transcriptomics

4.80 score 21 scripts

mrcieu

gwasglue:GWAS summary data sources connected to analytical tools

Many tools exist that use GWAS summary data for colocalisation, fine mapping, Mendelian randomization, visualisation, etc. This package is a conduit that connects R packages that can retrieve GWAS summary data to various tools for analysing those data.

Maintained by Gibran Hemani. Last updated 3 years ago.

135 stars 4.79 score 91 scripts

bioc

decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting

Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.

Maintained by Rosario M. Piro. Last updated 5 months ago.

software snp sequencing dnaseq genomicvariation somaticmutation biomedicalinformatics genetics biologicalquestion statisticalmethod

1 stars 4.78 score 10 scripts 1 dependents

bioc

transmogR:Modify a set of reference sequences using a set of variants

transmogR provides the tools needed to create a new reference genome or reference transcriptome, using a set of variants. Variants can be any combination of SNPs, Insertions and Deletions. The intended use-case is to enable creation of variant-modified reference transcriptomes for incorporation into transcriptomic pseudo-alignment workflows, such as salmon.

Maintained by Stevie Pederson. Last updated 17 days ago.

alignment genomicvariation sequencing transcriptomevariant variantannotation zlib

4.74 score 2 scripts

bioc

methylPipe:Base resolution DNA methylation data analysis

Memory efficient analysis of base resolution DNA methylation data in both the CpG and non-CpG sequence context. Integration of DNA methylation data derived from any methodology providing base- or low-resolution data.

Maintained by Mattia Furlan. Last updated 5 months ago.

methylseq dnamethylation coverage sequencing

4.73 score 1 scripts 1 dependents

bioc

scmeth:Functions to conduct quality control analysis in methylation data

Functions to analyze methylation data can be found here. Some functions are relevant for single cell methylation data but most other functions can be used for any methylation data. The highlight of this workflow is the comprehensive quality control report.

Maintained by Divy Kangeyan. Last updated 5 months ago.

dnamethylation qualitycontrol preprocessing singlecell immunooncology bioconductor-package methylation single-cell-methylation

4.70 score 5 scripts

bioc

hiAnnotator:Functions for annotating GRanges objects

hiAnnotator contains a set of functions which allow users to annotate a GRanges object with a custom set of annotations. The basic philosophy of this package is to take two GRanges objects (query & subject) with a common set of seqnames (i.e. chromosomes) and return associated annotation per seqnames and rows from the query matching seqnames and rows from the subject (e.g. genes or CpG islands). The package comes with three types of annotation functions which calculate whether a position from the query is within a feature or near a feature, or count features in defined window sizes. Moreover, each function is equipped with a parallel backend to utilize the foreach package. In addition, the package is equipped with wrapper functions, which find the appropriate columns needed to make a GRanges object from a common data frame.

Maintained by Nirav V Malani. Last updated 5 months ago.

software annotation

4.65 score 15 scripts 1 dependents
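
The three annotation types described for hiAnnotator (within a feature, near a feature, and feature counts in a window) can be illustrated with a minimal Python sketch. This is not hiAnnotator's API: the function name `annotate_positions`, the tuple-based intervals, and the `window` parameter are assumptions made for illustration, and a single chromosome is assumed.

```python
def annotate_positions(query_positions, features, window=1000):
    """Annotate query positions against features on one chromosome.

    features is a list of (start, end) intervals with inclusive coordinates.
    All names here are hypothetical; this only mirrors the three annotation
    ideas (within, nearest, count in window) in plain Python.
    """
    results = []
    for pos in query_positions:
        # 1. Is the position inside any feature?
        within = any(start <= pos <= end for start, end in features)
        # 2. Distance to the nearest feature (0 when the position is inside one).
        nearest = min(max(start - pos, pos - end, 0) for start, end in features)
        # 3. Number of features overlapping [pos - window, pos + window].
        count = sum(
            1 for start, end in features
            if start <= pos + window and end >= pos - window
        )
        results.append(
            {"pos": pos, "within": within,
             "nearest_distance": nearest, "count_in_window": count}
        )
    return results


genes = [(100, 200), (500, 600)]
annotations = annotate_positions([150, 300, 2000], genes, window=100)
```

A real implementation would work per seqname and use an interval index rather than a linear scan over all features.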

bioc

alabaster.vcf:Save and Load Variant Data to/from File

Save variant calling SummarizedExperiment to file and load them back as VCF objects. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

4.65 score 6 scripts 1 dependents

bioc

gmapR:An R interface to the GMAP/GSNAP/GSTRUCT suite

GSNAP and GMAP are a pair of tools to align short-read data written by Tom Wu. This package provides convenience methods to work with GMAP and GSNAP from within R. In addition, it provides methods to tally alignment results on a per-nucleotide basis using the bam_tally tool.

Maintained by Michael Lawrence. Last updated 14 days ago.

alignment zlib

4.65 score 45 scripts

bioc

methodical:Discovering genomic regions where methylation is strongly associated with transcriptional activity

DNA methylation is generally considered to be associated with transcriptional silencing. However, comprehensive, genome-wide investigation of this relationship requires the evaluation of potentially millions of correlation values between the methylation of individual genomic loci and expression of associated transcripts in a relatively large number of samples. Methodical makes this process quick and easy while keeping a low memory footprint. It also provides a novel method for identifying regions where a number of methylation sites are consistently strongly associated with transcriptional expression. In addition, Methodical enables housing DNA methylation data from diverse sources (e.g. WGBS, RRBS and methylation arrays) with a common framework, lifting over DNA methylation data between different genome builds and creating base-resolution plots of the association between DNA methylation and transcriptional activity at transcriptional start sites.

Maintained by Richard Heery. Last updated 2 months ago.

dnamethylation methylationarray transcription genomewideassociation software openjdk

4.65 score 14 scripts

bioc

branchpointer:Prediction of intronic splicing branchpoints

Predicts branchpoint probability for sites in intronic branchpoint windows. Queries can be supplied as intronic regions or, to evaluate the effects of mutations, as SNPs.

Maintained by Beth Signal. Last updated 5 months ago.

software genomeannotation genomicvariation motifannotation

4.62 score 21 scripts

bioc

GA4GHshiny:Shiny application for interacting with GA4GH-based data servers

GA4GHshiny package provides an easy way to interact with data servers based on Global Alliance for Genomics and Health (GA4GH) genomics API through a Shiny application. It also integrates with Beacon Network.

Maintained by Welliton Souza. Last updated 5 months ago.

gui

2 stars 4.60 score 3 scripts

bioc

svaRetro:Retrotransposed transcript detection from structural variants

svaRetro contains functions for detecting retrotransposed transcripts (RTs) from structural variant calls. It takes structural variant calls in GRanges of breakend notation and identifies RTs by exon-exon junctions and insertion sites. The candidate RTs are reported by events and annotated with information of the inserted transcripts.

Maintained by Ruining Dong. Last updated 5 months ago.

dataimport sequencing annotation genetics variantannotation coverage variantdetection

4.60 score 4 scripts

bioc

OGRE:Calculate, visualize and analyse overlap between genomic regions

OGRE calculates overlap between user-defined genomic region datasets. Any regions can be supplied, e.g. genes, SNPs, or reads from sequencing experiments. Key numbers help analyse the extent of overlaps, which can also be visualized at a genomic level.

Maintained by Sven Berres. Last updated 5 months ago.

software workflowstep biologicalquestion annotation metagenomics visualization sequencing

2 stars 4.60 score 4 scripts

bioc

RESOLVE:RESOLVE: An R package for the efficient analysis of mutational signatures from cancer genomes

Cancer is a genetic disease caused by somatic mutations in genes controlling key biological functions such as cellular growth and division. Such mutations may arise both through cell-intrinsic and exogenous processes, generating characteristic mutational patterns over the genome named mutational signatures. The study of mutational signatures has become a standard component of modern genomics studies, since it can reveal which (environmental and endogenous) mutagenic processes are active in a tumor, and may highlight markers for therapeutic response. The computational analysis of mutational signatures presents many pitfalls. First, the task of determining the number of signatures is very complex and depends on heuristics. Second, several signatures have no clear etiology, raising the possibility that they are computational artifacts rather than the result of mutagenic processes. Last, approaches for signature assignment are greatly influenced by the set of signatures used for the analysis. To overcome these limitations, we developed RESOLVE (Robust EStimation Of mutationaL signatures Via rEgularization), a framework that allows the efficient extraction and assignment of mutational signatures. RESOLVE implements a novel algorithm that enables (i) the efficient extraction, (ii) exposure estimation, and (iii) confidence assessment during the computational inference of mutational signatures.

Maintained by Luca De Sano. Last updated 8 days ago.

biomedicalinformatics somaticmutation

1 stars 4.60 score 3 scripts

bioc

UPDhmm:Detecting Uniparental Disomy through NGS trio data

Uniparental disomy (UPD) is a genetic condition where an individual inherits both copies of a chromosome or part of it from one parent, rather than one copy from each parent. This package contains a HMM for detecting UPDs through HTS (High Throughput Sequencing) data from trio assays. By analyzing the genotypes in the trio, the model infers a hidden state (normal, father isodisomy, mother isodisomy, father heterodisomy and mother heterodisomy).

Maintained by Marta Sevilla. Last updated 5 months ago.

software hiddenmarkovmodel genetics

1 stars 4.54 score 3 scripts

bioc

customProDB:Generate customized protein database from NGS data, with a focus on RNA-Seq data, for proteomics search

Database search is the most widely used approach for peptide and protein identification in mass spectrometry-based proteomics studies. Our previous study showed that sample-specific protein databases derived from RNA-Seq data can better approximate the real protein pools in the samples and thus improve protein identification. More importantly, single nucleotide variations, short insertion and deletions and novel junctions identified from RNA-Seq data make protein database more complete and sample-specific. Here, we report an R package customProDB that enables the easy generation of customized databases from RNA-Seq data for proteomics search. This work bridges genomics and proteomics studies and facilitates cross-omics data integration.

Maintained by Xiaojing Wang. Last updated 5 months ago.

immunooncology sequencing massspectrometry proteomics snp rnaseq software transcription alternativesplicing functionalgenomics

4.50 score 15 scripts

bioc

ChIPComp:Quantitative comparison of multiple ChIP-seq datasets

ChIPComp detects differentially bound sharp binding sites across multiple conditions considering matching control.

Maintained by Li Chen. Last updated 5 months ago.

chipseq sequencing transcription genetics coverage multiplecomparison dataimport

4.49 score 51 scripts

bioc

primirTSS:Prediction of pri-miRNA Transcription Start Site

A fast, convenient tool to identify the TSSs of miRNAs by integrating the data of H3K4me3 and Pol II as well as combining the conservation level and sequence feature, provided within both command-line and graphical interfaces, which achieves a better performance than the previous non-cell-specific methods on miRNA TSSs.

Maintained by Pumin Li. Last updated 5 months ago.

immunooncology sequencing rnaseq genetics preprocessing transcription generegulation

4.48 score 2 scripts

bioc

mitoClone2:Clonal Population Identification in Single-Cell RNA-Seq Data using Mitochondrial and Somatic Mutations

This package primarily identifies variants in mitochondrial genomes from BAM alignment files. It filters these variants to remove RNA editing events, then estimates their evolutionary relationship (i.e. their phylogenetic tree) and groups single cells into clones. It also visualizes the mutations and provides additional genomic context.

Maintained by Benjamin Story. Last updated 5 months ago.

annotation dataimport genetics snp software singlecell alignment curl bzip2 xz-utils zlib cpp

1 stars 4.48 score 9 scripts

bioc

intansv:Integrative analysis of structural variations

This package provides efficient tools to read and integrate structural variations predicted by popular software. Annotation and visualization of structural variations are also implemented in the package.

Maintained by Wen Yao. Last updated 5 months ago.

genetics annotation sequencing software

4.48 score 2 scripts

bioc

tLOH:Assessment of evidence for LOH in spatial transcriptomics pre-processed data using Bayes factor calculations

tLOH, or transcriptomicsLOH, assesses evidence for loss of heterozygosity (LOH) in pre-processed spatial transcriptomics data. This tool requires spatial transcriptomics cluster and allele count information at likely heterozygous single-nucleotide polymorphism (SNP) positions in VCF format. Bayes factors are calculated at each SNP to determine the likelihood of a potential loss of heterozygosity event. Two plotting functions are included to visualize allele fraction and aggregated Bayes factor per chromosome. Data generated with the 10X Genomics Visium Spatial Gene Expression platform must be pre-processed to obtain an individual sample VCF with columns for each cluster. Required fields are allele depth (AD) with counts for reference/alternative alleles and read depth (DP).

Maintained by Michelle Webb. Last updated 5 months ago.

copynumbervariation transcription snp geneexpression transcriptomics

3 stars 4.48 score 4 scripts

bioc

GA4GHclient:A Bioconductor package for accessing GA4GH API data servers

GA4GHclient provides an easy way to access public data servers through Global Alliance for Genomics and Health (GA4GH) genomics API. It provides low-level access to GA4GH API and translates response data into Bioconductor-based class objects.

Maintained by Welliton Souza. Last updated 5 months ago.

datarepresentation thirdpartyclient

1 stars 4.48 score 3 scripts 1 dependents

bioc

uncoverappLib:Interactive graphical application for clinical assessment of sequence coverage at the base-pair level

A Shiny application containing a suite of graphical and statistical tools to support clinical assessment of low coverage regions. It displays three web pages, each providing a different analysis module: Coverage analysis, calculate AF by allele frequency app and binomial distribution. uncoverAPP provides a statistical summary of coverage given a target file or gene names.

Maintained by Emanuela Iovino. Last updated 5 months ago.

software visualization annotation coverage

3 stars 4.48 score 4 scripts

bioc

msgbsR:msgbsR: methylation sensitive genotyping by sequencing (MS-GBS) R functions

Pipeline for the analysis of an MS-GBS experiment.

Maintained by Benjamin Mayne. Last updated 5 months ago.

immunooncology differentialmethylation dataimport epigenetics methylseq

4.48 score 1 scripts

bioc

comapr:Crossover analysis and genetic map construction

comapr detects crossover intervals for single gametes from their haplotype state sequences and stores the crossovers in a GRanges object. The genetic distances can then be calculated via the mapping functions using estimated crossover rates for marker intervals. Visualisation functions for plotting interval-based genetic maps or cumulative genetic distances are implemented, which help reveal the variation of crossover landscapes across the genome and across individuals.

Maintained by Ruqian Lyu. Last updated 3 months ago.

software singlecell visualization genetics

4.48 score 4 scripts
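
As an illustration of what a mapping function does in the comapr description, the following Python sketch converts per-interval recombination fractions into genetic distances with the standard Haldane and Kosambi formulas, then accumulates them along a chromosome. This is not comapr's API: the function names and the example rates are assumptions, and whether comapr uses exactly these two functions should be checked against its documentation.

```python
import math
from itertools import accumulate


def haldane_cm(r):
    """Haldane map distance in centiMorgans for recombination fraction r."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in centiMorgans for recombination fraction r."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical crossover rates for three consecutive marker intervals.
rates = [0.01, 0.10, 0.05]
interval_cm = [kosambi_cm(r) for r in rates]
cumulative_cm = list(accumulate(interval_cm))
```

For small rates both functions are close to 100 * r; they diverge as r grows because Kosambi accounts for crossover interference while Haldane assumes none.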

bioc

ORFhunteR:Predict open reading frames in nucleotide sequences

The ORFhunteR package is an R and C++ library for the automatic determination and annotation of open reading frames (ORF) in a large set of RNA molecules. It efficiently implements the machine learning model based on vectorization of nucleotide sequences and the random forest classification algorithm. The ORFhunteR package consists of a set of functions written in the R language in conjunction with C++. The efficiency of the package was confirmed by the examples of the analysis of RNA molecules from the NCBI RefSeq and Ensembl databases. The package can be used in basic and applied biomedical research related to the study of the transcriptome of normal as well as altered (for example, cancer) human cells.

Maintained by Vasily V. Grinev. Last updated 5 months ago.

technology statisticalmethod sequencing rnaseq classification featureextraction cpp

1 stars 4.48 score

bioc

dagLogo:dagLogo: a Bioconductor package for visualizing conserved amino acid sequence pattern in groups based on probability theory

Visualize significant conserved amino acid sequence pattern in groups based on probability theory.

Maintained by Jianhong Ou. Last updated 3 months ago.

sequencematching visualization

4.48 score 9 scripts

bioc

crisprShiny:Exploring curated CRISPR gRNAs via Shiny

Provides means to interactively visualize guide RNAs (gRNAs) in GuideSet objects via a Shiny application. This GUI can be self-contained or used as a module within a larger Shiny app. The content of the app reflects the annotations present in the passed GuideSet object, and includes intuitive tools to examine, filter, and export gRNAs, thereby making gRNA design more user-friendly.

Maintained by Jean-Philippe Fortin. Last updated 5 months ago.

crispr functionalgenomics genetarget gui crispr-analysis crispr-design shiny

2 stars 4.48 score 8 scripts

bioc

Pviz:Peptide Annotation and Data Visualization using Gviz

Pviz adapts the Gviz package for protein sequences and data.

Maintained by Renan Sauteraud. Last updated 5 months ago.

visualization proteomics microarray

4.48 score 4 scripts

bioc

ChIPanalyser:ChIPanalyser: Predicting Transcription Factor Binding Sites

ChIPanalyser is a package to predict and understand TF binding by utilizing a statistical thermodynamic model. The model incorporates 4 main factors thought to drive TF binding: Chromatin State, Binding energy, Number of bound molecules and a scaling factor modulating TF binding affinity. Taken together, ChIPanalyser produces ChIP-like profiles that closely mimic the patterns seen in real ChIP-seq data.

Maintained by Patrick C.N. Martin. Last updated 5 months ago.

software biologicalquestion workflowstep transcription sequencing chiponchip coverage alignment chipseq sequencematching dataimport peakdetection

4.38 score 12 scripts

bioc

scoreInvHap:Get inversion status in predefined regions

scoreInvHap can get the samples' inversion status of known inversions. scoreInvHap uses SNP data as input and requires the following information about the inversion: genotype frequencies in the different haplotypes, R2 between the region SNPs and inversion status and heterozygote genotypes in the reference. The package includes these data for 21 inversions.

Maintained by Dolors Pelegri-Siso. Last updated 5 months ago.

snp genetics genomicvariation

4.34 score 11 scripts

bioc

GeneStructureTools:Tools for spliced gene structure manipulation and analysis

GeneStructureTools can be used to create in silico alternative splicing events, and analyse potential effects this has on functional gene products.

Maintained by Beth Signal. Last updated 5 months ago.

immunooncology software differentialsplicing functionalprediction transcriptomics alternativesplicing rnaseq

4.32 score 21 scripts

bioc

SPLINTER:Splice Interpreter of Transcripts

Provides tools to analyze alternative splicing sites, interpret outcomes based on sequence information, select and design primers for site validation and give a visual representation of the event to guide downstream experiments.

Maintained by Diana Low. Last updated 5 months ago.

immunooncology geneexpression rnaseq visualization alternativesplicing

4.30 score 1 scripts

bioc

RiboProfiling:Ribosome Profiling Data Analysis: from BAM to Data Representation and Interpretation

Starting with a BAM file, this package provides the necessary functions for quality assessment, read start position recalibration, the counting of reads on CDS, 3'UTR, and 5'UTR, plotting of count data: pairs, log fold-change, codon frequency and coverage assessment, principal component analysis on codon coverage.

Maintained by A. Popa. Last updated 5 months ago.

riboseq sequencing coverage alignment qualitycontrol software principalcomponent

4.30 score 10 scripts

bioc

Uniquorn:Identification of cancer cell lines based on their weighted mutational/ variational fingerprint

'Uniquorn' enables users to identify cancer cell lines. Cancer cell line misidentification and cross-contamination represent a significant challenge for cancer researchers. Identification is vital and, in this package, is based on the locations/loci of somatic and germline mutations/variations. The input format is vcf/vcf.gz and the files have to contain a single cancer cell line sample (i.e. a single member/genotype/gt column in the vcf file).

Maintained by Raik Otto. Last updated 5 months ago.

immunooncology statisticalmethod wholegenome exomeseq

4.30 score

bioc

m6Aboost:m6Aboost

This package helps users run the m6Aboost model on their own miCLIP2 data. The package includes functions to assign the read counts and get the features to run the m6Aboost model. The miCLIP2 data should be stored in a GRanges object. More details can be found in the vignette.

Maintained by You Zhou. Last updated 5 months ago.

sequencing epigenetics genetics experimenthubsoftware

2 stars 4.30 score 5 scripts

bioc

CAFE:Chromosomal Aberrations Finder in Expression data

Detection and visualizations of gross chromosomal aberrations using Affymetrix expression microarrays as input

Maintained by Sander Bollen. Last updated 5 months ago.

geneexpression microarray onechannel genesetenrichment

4.30 score 2 scripts

bioc

saseR:Scalable Aberrant Splicing and Expression Retrieval

saseR is a highly performant and fast framework for aberrant expression and splicing analyses. The main functions are: BamtoAspliCounts (process BAM files to ASpli counts); convertASpli (get gene, bin or junction counts from an ASpli SummarizedExperiment); calculateOffsets (create an offsets assay for aberrant expression or splicing analysis); saseRfindEncodingDim (estimate the optimal number of latent factors to include when estimating the mean expression); saseRfit (estimate the parameters of the negative binomial distribution and compute p-values for aberrant expression and splicing). For information on how to use these functions, check out our vignette at https://github.com/statOmics/saseR/blob/main/vignettes/Vignette.Rmd and the saseR paper: Segers, A. et al. (2023). Juggling offsets unlocks RNA-seq tools for fast scalable differential usage, aberrant splicing and expression analyses. bioRxiv. https://doi.org/10.1101/2023.06.29.547014.

Maintained by Alexandre Segers. Last updated 5 months ago.

differentialexpression differentialsplicing regression geneexpression alternativesplicing rnaseq sequencing software

1 stars 4.30 score 1 scripts

bioc

SigsPack:Mutational Signature Estimation for Single Samples

Single sample estimation of exposure to mutational signatures. Exposures to known mutational signatures are estimated for single samples, based on quadratic programming algorithms. Bootstrapping the input mutational catalogues provides estimations on the stability of these exposures. The effect of the sequence composition of mutational context can be taken into account by normalising the catalogues.

Maintained by Franziska Schumann. Last updated 5 months ago.

somaticmutation snp variantannotation biomedicalinformatics dnaseq

2 stars 4.30 score 4 scripts

bioc

compEpiTools:Tools for computational epigenomics

Tools for computational epigenomics developed for the analysis, integration and simultaneous visualization of various (epi)genomics data types across multiple genomic regions in multiple samples.

Maintained by Mattia Furlan. Last updated 5 months ago.

geneexpression sequencing visualization genomeannotation coverage

4.30 score 6 scripts

bioc

Ularcirc:Shiny app for canonical and back splicing analysis (i.e. circular and mRNA analysis)

Ularcirc reads in STAR aligned splice junction files and provides visualisation and analysis tools for splicing analysis. Users can assess backsplice junctions and forward canonical junctions.

Maintained by David Humphreys. Last updated 5 months ago.

datarepresentation visualization genetics sequencing annotation coverage alternativesplicing differentialsplicing

4.30 score 4 scripts

bioc

ChIPexoQual:ChIPexoQual

Package with a quality control pipeline for ChIP-exo/nexus data.

Maintained by Rene Welch. Last updated 5 months ago.

chipseq sequencing transcription visualization qualitycontrol coverage alignment

1 stars 4.30 score 5 scripts

bioc

BEAT:BEAT - BS-Seq Epimutation Analysis Toolkit

Model-based analysis of single-cell methylation data

Maintained by Kemal Akman. Last updated 5 months ago.

immunooncology genetics methylseq software dnamethylation epigenetics

4.30 score 3 scripts

bioc

cageminer:Candidate Gene Miner

This package aims to integrate GWAS-derived SNPs and coexpression networks to mine candidate genes associated with a particular phenotype. For that, users must define a set of guide genes, which are known genes involved in the studied phenotype. Additionally, the mined candidates can be given a score that favors candidates that are hubs and/or transcription factors. The scores can then be used to rank and select the top n most promising genes for downstream experiments.

Maintained by Fabrício Almeida-Silva. Last updated 5 months ago.

software snp functionalprediction genomewideassociation geneexpression networkenrichment variantannotation functionalgenomics network

1 stars 4.30 score 5 scripts

bioc

crisprBwa:BWA-based alignment of CRISPR gRNA spacer sequences

Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bwa. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Currently not supported on Windows machines.

Maintained by Jean-Philippe Fortin. Last updated 5 months ago.

crispr functionalgenomics alignment aligner bioconductor bioconductor-package bwa crispr-analysis crispr-cas9 crispr-design crispr-target grna grna-sequence grna-sequences sgrna sgrna-design

1 stars 4.30 score 6 scripts

bioc

iCNV:Integrated Copy Number Variation detection

Integrative copy number variation (CNV) detection from multiple platforms and experimental designs.

Maintained by Zilu Zhou. Last updated 5 months ago.

immunooncology exomeseq wholegenome snp copynumbervariation hiddenmarkovmodel

4.30 score 5 scripts

bioc

SOMNiBUS:Smooth modeling of bisulfite sequencing

This package aims to analyse count-based methylation data on predefined genomic regions, such as those obtained by targeted sequencing, and thus to identify differentially methylated regions (DMRs) that are associated with phenotypes or traits. The method is built on a rich, flexible model that allows the effects of multiple covariates on methylation levels to vary smoothly along genomic regions. At the same time, this method also allows for sequencing errors and can adjust for variability in cell type mixture.

Maintained by Kathleen Klein. Last updated 3 months ago.

dnamethylation regression epigenetics differentialmethylation sequencing functionalprediction

1 stars 4.30 score 3 scripts

bioc

regioneReloaded:RegioneReloaded: Multiple Association for Genomic Region Sets

RegioneReloaded is a package that allows simultaneous analysis of associations between genomic region sets, enabling clustering of data and the creation of ready-to-publish graphs. It takes over and expands on all the features of its predecessor regioneR. It also incorporates a strategy to improve p-value calculations and normalize z-scores coming from multiple analyses to allow for their direct comparison. RegioneReloaded builds upon regioneR by adding new plotting functions for obtaining publication-ready graphs.

Maintained by Roberto Malinverni. Last updated 5 months ago.

genetics chipseq dnaseq methylseq copynumbervariation clustering multiplecomparison

5 stars 4.30 score 2 scripts

bioc

spatzie:Identification of enriched motif pairs from chromatin interaction data

Identifies motifs that are significantly co-enriched from enhancer-promoter interaction data. While enhancer-promoter annotation is commonly used to define groups of interaction anchors, spatzie also supports co-enrichment analysis between preprocessed interaction anchors. Supports BEDPE interaction data derived from genome-wide assays such as HiC, ChIA-PET, and HiChIP. Can also be used to look for differentially enriched motif pairs between two interaction experiments.

Maintained by Jennifer Hammelman. Last updated 5 months ago.

dna3dstructure generegulation peakdetection epigenetics functionalgenomics classification hic transcription

4.30 score 5 scripts

bioc

InPAS:Identify Novel Alternative PolyAdenylation Sites (PAS) from RNA-seq data

Alternative polyadenylation (APA) is an important post-transcriptional regulation mechanism that occurs in most human genes. InPAS facilitates the discovery of novel APA sites and the differential usage of APA sites from RNA-Seq data. It leverages cleanUpdTSeq to fine-tune identified APA sites by removing false sites.

Maintained by Jianhong Ou. Last updated 3 months ago.

alternative polyadenylation differential polyadenylation site usage rna-seq gene regulation transcription

4.30 score 1 scripts

bioc

GOTHiC:Binomial test for Hi-C data analysis

This is a Hi-C analysis package using a cumulative binomial test to detect interactions between distal genomic loci that have significantly more reads than expected by chance in Hi-C experiments. It takes mapped paired NGS reads as input and gives back the list of significant interactions for a given bin size in the genome.

Maintained by Borbala Mifsud. Last updated 5 months ago.

immunooncology sequencing preprocessing epigenetics hic

4.30 score 6 scripts

bioc

RNAmodR.AlkAnilineSeq:Detection of m7G, m3C and D modification by AlkAnilineSeq

RNAmodR.AlkAnilineSeq implements the detection of m7G, m3C and D modifications on RNA from experimental data generated with the AlkAnilineSeq protocol. The package builds on the core functionality of the RNAmodR package to detect specific patterns of the modifications in high throughput sequencing data.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

software workflowstep visualization sequencing alkanilineseq bioconductor modifications rna rnamodr

2 stars 4.30 score 3 scripts