R-universe search: gtf

bioc

factR:Functional Annotation of Custom Transcriptomes

factR contains tools to process and interact with custom-assembled transcriptomes (GTF). At its core, factR constructs CDS information on custom transcripts and subsequently predicts its functional output. In addition, factR has tools capable of plotting transcripts, correcting chromosome and gene information and shortlisting new transcripts.

Maintained by Fursham Hamid. Last updated 5 months ago.

alternativesplicing functionalprediction geneprediction custom-transcriptomes functional-annotation gtf rna-seq-analysis

29.6 match 1 stars 4.00 score 5 scripts

bioc

BUSpaRse:kallisto | bustools R utilities

The kallisto | bustools pipeline is a fast and modular set of tools to convert single cell RNA-seq reads in fastq files into gene count or transcript compatibility counts (TCC) matrices for downstream analysis. Central to this pipeline is the barcode, UMI, and set (BUS) file format. This package serves the following purposes: First, this package allows users to manipulate BUS format files as data frames in R and then convert them into gene count or TCC matrices. Furthermore, since R and Rcpp code is easier to handle than pure C++ code, users are encouraged to tweak the source code of this package to experiment with new uses of BUS format and different ways to convert the BUS file into a gene count matrix. Second, this package can conveniently generate files required to generate gene count matrices for spliced and unspliced transcripts for RNA velocity. Here biotypes can be filtered and scaffolds and haplotypes can be removed, and the filtered transcriptome can be extracted and written to disk. Third, this package implements utility functions to get transcripts and associated genes required to convert BUS files to gene count matrices, to write the transcript to gene information in the format required by bustools, and to read output of bustools into R as sparse matrices.

Maintained by Lambda Moses. Last updated 5 months ago.

singlecell rnaseq workflowstep cpp

11.4 match 9 stars 7.35 score 165 scripts

bioc

ORFik:Open Reading Frames in Genomics

R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed; as the name hints, it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package allows reassigning transcript starts using CAGE-Seq data, automatic shifting of RiboSeq reads, finding Open Reading Frames for whole genomes and much more.

Maintained by Haakon Tjeldnes. Last updated 27 days ago.

immunooncology software sequencing riboseq rnaseq functionalgenomics coverage alignment dataimport cpp

7.3 match 33 stars 10.63 score 115 scripts 2 dependents

bioc

bambu:Context-Aware Transcript Quantification from Long Read RNA-Seq data

bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can directly be used for visualisation and downstream analysis such as differential gene expression or transcript usage.

Maintained by Ying Chen. Last updated 1 month ago.

alignment coverage differentialexpression featureextraction geneexpression genomeannotation genomeassembly immunooncology longread multiplecomparison normalization rnaseq regression sequencing software transcription transcriptomics bambu bioconductor long-reads nanopore nanopore-sequencing rna-seq rna-seq-analysis transcript-quantification transcript-reconstruction cpp

7.8 match 197 stars 9.03 score 91 scripts 1 dependents

ropensci

biomartr:Genomic Data Retrieval

Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.

Maintained by Hajk-Georg Drost. Last updated 1 month ago.

biomart genomic-data-retrieval annotation-retrieval database-retrieval ncbi ensembl biological-data-retrieval ensembl-servers genome genome-annotation genome-retrieval genomics meta-analysis metagenomics ncbi-genbank peer-reviewed proteome sequenced-genomes

5.0 match 218 stars 11.35 score 129 scripts 3 dependents

bioc

ballgown:Flexible, isoform-level differential expression analysis

Tools for statistical analysis of assembled transcriptomes, including flexible differential expression analysis, visualization of transcript structures, and matching of assembled transcripts to annotation.

Maintained by Jack Fu. Last updated 5 months ago.

immunooncology rnaseq statisticalmethod preprocessing differentialexpression

5.2 match 146 stars 10.51 score 338 scripts 1 dependents

fischuu

GenomicTools.fileHandler:File Handlers for Genomic Data Analysis

A collection of I/O tools for handling the most commonly used genomic datafiles, like fasta/-q, bed, gff, gtf, ped/map and vcf.

Maintained by Daniel Fischer. Last updated 1 month ago.

12.1 match 4.48 score 4 scripts 2 dependents

bioc

IsoformSwitchAnalyzeR:Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data

Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains etc.) from quantification of all types of RNASeq by tools such as Kallisto, Salmon, StringTie, Cufflinks/Cuffdiff etc.

Maintained by Kristoffer Vitting-Seerup. Last updated 5 months ago.

geneexpression transcription alternativesplicing differentialexpression differentialsplicing visualization statisticalmethod transcriptomevariant biomedicalinformatics functionalgenomics systemsbiology transcriptomics rnaseq annotation functionalprediction geneprediction dataimport multiplecomparison batcheffect immunooncology

4.7 match 108 stars 9.26 score 125 scripts

bioc

GeneStructureTools:Tools for spliced gene structure manipulation and analysis

GeneStructureTools can be used to create in silico alternative splicing events, and analyse potential effects this has on functional gene products.

Maintained by Beth Signal. Last updated 5 months ago.

immunooncology software differentialsplicing functionalprediction transcriptomics alternativesplicing rnaseq

9.5 match 4.32 score 21 scripts

bioc

circRNAprofiler:circRNAprofiler: An R-Based Computational Framework for the Downstream Analysis of Circular RNAs

R-based computational framework for a comprehensive in silico analysis of circRNAs. This computational framework allows combining and analyzing circRNAs previously detected by multiple publicly available annotation-based circRNA detection tools. It covers different aspects of circRNA analysis, from differential expression analysis, evolutionary conservation and biogenesis to functional analysis.

Maintained by Simona Aufiero. Last updated 5 months ago.

annotation structuralprediction functionalprediction geneprediction genomeassembly differentialexpression

6.6 match 10 stars 5.78 score 5 scripts

bioc

dupRadar:Assessment of duplication rates in RNA-Seq datasets

Duplication rate quality control for RNA-Seq datasets.

Maintained by Sergi Sayols. Last updated 5 months ago.

technology sequencing rnaseq qualitycontrol immunooncology

4.4 match 2 stars 6.78 score 60 scripts

bioc

GenomicPlot:Plot profiles of next generation sequencing data in genomic features

Visualization of next generation sequencing (NGS) data is essential for interpreting high-throughput genomics experiment results. 'GenomicPlot' facilitates plotting of NGS data in various formats (bam, bed, wig and bigwig); both coverage and enrichment over input can be computed and displayed with respect to genomic features (such as UTR, CDS, enhancer), and user defined genomic loci or regions. Statistical tests on signal intensity within user defined regions of interest can be performed and represented as boxplots or bar graphs. Parallel processing is used to speed up computation on multicore platforms. In addition to genomic plots, which are suitable for displaying coverage of genomic DNA (such as ChIPseq data), metagenomic (without introns) plots can also be made for RNAseq or CLIPseq data.

Maintained by Shuye Pu. Last updated 2 months ago.

alternativesplicing chipseq coverage geneexpression rnaseq sequencing software transcription visualization annotation

5.1 match 3 stars 5.62 score 4 scripts

bioc

GenomicFeatures:Query the gene models of a given organism/assembly

Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.

Maintained by H. Pagès. Last updated 4 months ago.

genetics infrastructure annotation sequencing genomeannotation bioconductor-package core-package

1.7 match 26 stars 15.34 score 5.3k scripts 339 dependents

bioc

GenomicDistributions:GenomicDistributions: fast analysis of genomic intervals with Bioconductor

If you have a set of genomic ranges, this package can help you with visualization and comparison. It produces several kinds of plots, for example: Chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, like Transcription Start Sites (TSSs); genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.

Maintained by Kristyna Kupkova. Last updated 5 months ago.

software genomeannotation genomeassembly datarepresentation sequencing coverage functionalgenomics visualization

3.4 match 26 stars 7.44 score 25 scripts

bioc

eisaR:Exon-Intron Split Analysis (EISA) in R

Exon-intron split analysis (EISA) uses ordinary RNA-seq data to measure changes in mature RNA and pre-mRNA reads across different experimental conditions to quantify transcriptional and post-transcriptional regulation of gene expression. For details see Gaidatzis et al., Nat Biotechnol 2015. doi: 10.1038/nbt.3269. eisaR implements the major steps of EISA in R.

Maintained by Michael Stadler. Last updated 2 months ago.

transcription geneexpression generegulation functionalgenomics transcriptomics regression rnaseq

3.3 match 16 stars 7.48 score 63 scripts

bioc

plyranges:A fluent interface for manipulating GenomicRanges

A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes their accessibility for new Bioconductor users is hopefully increased.

Maintained by Michael Love. Last updated 5 months ago.

infrastructure datarepresentation workflowstep coverage bioconductor data-analysis dplyr genomic-ranges genomics tidy-data

1.8 match 143 stars 12.60 score 1.9k scripts 20 dependents

bioc

SCANVIS:SCANVIS - a tool for SCoring, ANnotating and VISualizing splice junctions

SCANVIS is a set of annotation-dependent tools for analyzing splice junctions and their read support as predetermined by an alignment tool of choice (for example, STAR aligner). SCANVIS assesses each junction's relative read support (RRS) by relating to the context of local split reads aligning to annotated transcripts. SCANVIS also annotates each splice junction by indicating whether the junction is supported by annotation or not, and if not, what type of junction it is (e.g. exon skipping, alternative 5' or 3' events, Novel Exons). Unannotated junctions are also further annotated by indicating whether they induce a frame shift or not. SCANVIS includes a visualization function to generate static sashimi-style plots depicting relative read support and number of split reads using arc thickness and arc heights, making it easy for users to spot well-supported junctions. These plots also clearly delineate unannotated junctions from annotated ones using designated color schemes, and users can also highlight splice junctions of choice. Variants and/or a read profile are also incorporated into the plot if the user supplies variants in bed format and/or the BAM file. One further feature of the visualization function is that users can submit multiple samples of a certain disease or cohort to generate a single plot - this occurs via a "merge" function wherein junction details over multiple samples are merged to generate a single sashimi plot, which is useful when contrasting cohorts (e.g. disease vs control).

Maintained by Phaedra Agius. Last updated 5 months ago.

software researchfield transcriptomics workflowstep annotation visualization

4.9 match 4.00 score 2 scripts

bioc

syntenet:Inference And Analysis Of Synteny Networks

syntenet can be used to infer synteny networks from whole-genome protein sequences and analyze them. Anchor pairs are detected with the MCScanX algorithm, which was ported to this package with the Rcpp framework for R and C++ integration. Anchor pairs from synteny analyses are treated as an undirected unweighted graph (i.e., a synteny network), and users can perform: i. network clustering; ii. phylogenomic profiling (by identifying which species contain which clusters) and; iii. microsynteny-based phylogeny reconstruction with maximum likelihood.

Maintained by Fabrício Almeida-Silva. Last updated 3 months ago.

software networkinference functionalgenomics comparativegenomics phylogenetics systemsbiology graphandnetwork wholegenome network comparative-genomics evolutionary-genomics network-science phylogenomics synteny synteny-network cpp

2.9 match 26 stars 6.67 score 12 scripts 1 dependents

bioc

ensembldb:Utilities to create and use Ensembl-based annotation databases

The package provides functions to create and use transcript centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries such as genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.

Maintained by Johannes Rainer. Last updated 5 months ago.

genetics annotationdata sequencing coverage annotation bioconductor bioconductor-packages ensembl

1.3 match 35 stars 14.08 score 892 scripts 108 dependents

thackl

gggenomes:A Grammar of Graphics for Comparative Genomics

An extension of 'ggplot2' for creating complex genomic maps. It builds on the power of 'ggplot2' and 'tidyverse' adding new 'ggplot2'-style geoms & positions and 'dplyr'-style verbs to manipulate the underlying data. It implements a layout concept inspired by 'ggraph' and introduces tracks to bring tidiness to the mess that is genomics data.

Maintained by Thomas Hackl. Last updated 2 months ago.

biological-data comparative-genomics genomics-visualization ggplot-extension ggplot2

1.7 match 650 stars 9.56 score 123 scripts

bioc

txdbmaker:Tools for making TxDb objects from genomic annotations

A set of tools for making TxDb objects from genomic annotations from various sources (e.g. UCSC, Ensembl, and GFF files). These tools allow the user to download the genomic locations of transcripts, exons, and CDS, for a given assembly, and to import them in a TxDb object. TxDb objects are implemented in the GenomicFeatures package, together with flexible methods for extracting the desired features in convenient formats.

Maintained by H. Pagès. Last updated 4 months ago.

infrastructure dataimport annotation genomeannotation genomeassembly genetics sequencing bioconductor-package core-package

1.7 match 3 stars 9.70 score 92 scripts 86 dependents

bioc

Rsubread:Mapping, quantification and variant analysis of sequencing data

Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES etc). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.

Maintained by Wei Shi. Last updated 2 days ago.

sequencing alignment sequencematching rnaseq chipseq singlecell geneexpression generegulation genetics immunooncology snp geneticvariability preprocessing qualitycontrol genomeannotation genefusiondetection indeldetection variantannotation variantdetection multiplesequencealignment zlib

1.8 match 9.24 score 892 scripts 10 dependents

rnabioco

valr:Genome Interval Arithmetic

Read and manipulate genome intervals and signals. Provides functionality similar to command-line tool suites within R, enabling interactive analysis and visualization of genome-scale data. Riemondy et al. (2017) <doi:10.12688/f1000research.11997.1>.

Maintained by Kent Riemondy. Last updated 7 days ago.

bedtools genome interval-arithmetic cpp

1.7 match 90 stars 9.69 score 227 scripts

bioc

partCNV:Infer locally aneuploid cells using single cell RNA-seq data

This package uses a statistical framework for rapid and accurate detection of aneuploid cells with local copy number deletion or amplification. Our method uses an EM algorithm with mixtures of Poisson distributions while incorporating cytogenetics information (e.g., regional deletion or amplification) to guide the classification (partCNV). When applicable, we further improve the accuracy by integrating a Hidden Markov Model for feature selection (partCNVH).

Maintained by Ziyi Li. Last updated 5 months ago.

software copynumbervariation hiddenmarkovmodel singlecell classification

3.8 match 4.18 score 4 scripts

bioc

AnnotationHub:Client to access AnnotationHub resources

This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure dataimport gui thirdpartyclient core-package u24ca289073

1.1 match 17 stars 13.89 score 2.7k scripts 102 dependents

pablobio

GALLO:Genomic Annotation in Livestock for Positional Candidate LOci

The accurate annotation of genes and Quantitative Trait Loci (QTLs) located within candidate markers and/or regions (haplotypes, windows, CNVs, etc) is a crucial step in the most common genomic analyses performed in livestock, such as Genome-Wide Association Studies or transcriptomics. The Genomic Annotation in Livestock for positional candidate LOci (GALLO) is an R package designed to provide an intuitive and straightforward environment to annotate positional candidate genes and QTLs from high-throughput genetic studies in livestock. Moreover, GALLO allows the graphical visualization of gene and QTL annotation results, data comparison among different grouping factors (e.g., methods, breeds, tissues, statistical models, studies, etc.), and QTL enrichment in different livestock species including cattle, pigs, sheep, and chicken, among others.

Maintained by Pablo Fonseca. Last updated 4 years ago.

software

3.5 match 10 stars 4.33 score 3 scripts

bioc

FLAMES:FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data

Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. FLAMES provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.

Maintained by Changqing Wang. Last updated 6 days ago.

rnaseq singlecell transcriptomics dataimport differentialsplicing alternativesplicing geneexpression longread zlib curl bzip2 xz-utils cpp

1.9 match 31 stars 7.95 score 12 scripts

cran

Rgff:R Utilities for GFF Files

R utilities for gff files, either general feature format (GFF3) or gene transfer format (GTF) formatted files. This package includes functions for producing summary stats, checking for consistency and sorting errors, conversion from GTF to GFF3 format, file sorting, visualization and plotting of feature hierarchy, and exporting user defined feature subsets to SAF format. This tool was developed by the BioinfoGP core facility at CNB-CSIC.

Maintained by Juan Antonio Garcia-Martin. Last updated 2 years ago.

7.3 match 2.00 score

bioc

BREW3R.r:R package associated to BREW3R

This R package provides functions that are used in the BREW3R workflow. This mainly contains a function that extends a gtf as GRanges using information from another gtf (also as GRanges). The process allows extending gene annotation without increasing the overlap between gene ids.

Maintained by Lucille Lopez-Delisle. Last updated 5 months ago.

genomeannotation

3.3 match 4.30 score 6 scripts

bnprks

BPCells:Single Cell Counts Matrices to PCA

Efficient operations for single cell ATAC-seq fragments and RNA counts matrices. Interoperable with standard file formats, and introduces efficient bit-packed formats that allow large storage savings and increased read speeds.

Maintained by Benjamin Parks. Last updated 1 month ago.

zlib hdf5 cpp

1.9 match 184 stars 7.48 score 172 scripts

bioc

geneXtendeR:Optimized Functional Annotation Of ChIP-seq Data

geneXtendeR optimizes the functional annotation of ChIP-seq peaks by exploring relative differences in annotating ChIP-seq peak sets to variable-length gene bodies. In contrast to prior techniques, geneXtendeR considers peak annotations beyond just the closest gene, allowing users to see peak summary statistics for the first-closest gene, second-closest gene, ..., n-closest gene whilst ranking the output according to biologically relevant events and iteratively comparing the fidelity of peak-to-gene overlap across a user-defined range of upstream and downstream extensions on the original boundaries of each gene's coordinates. Since different ChIP-seq peak callers produce different differentially enriched peaks with a large variance in peak length distribution and total peak count, annotating peak lists with their nearest genes can often be a noisy process. As such, the goal of geneXtendeR is to robustly link differentially enriched peaks with their respective genes, thereby aiding experimental follow-up and validation in designing primers for a set of prospective gene candidates during qPCR.

Maintained by Bohdan Khomtchouk. Last updated 5 months ago.

chipseq genetics annotation genomeannotation differentialpeakcalling coverage peakdetection chiponchip histonemodification dataimport naturallanguageprocessing visualization go software bioconductor bioinformatics c chip-seq computational-biology epigenetics functional-annotation

3.3 match 9 stars 3.95 score 5 scripts

bioc

chimeraviz:Visualization tools for gene fusions

chimeraviz manages data from fusion gene finders and provides useful visualization tools.

Maintained by Stian Lågstad. Last updated 5 months ago.

infrastructure alignment

1.9 match 37 stars 6.71 score 14 scripts

bioc

proActiv:Estimate Promoter Activity from RNA-Seq data

Most human genes have multiple promoters that control the expression of different isoforms. The use of these alternative promoters enables the regulation of isoform expression pre-transcriptionally. Alternative promoters have been found to be important in a wide number of cell types and diseases. proActiv is an R package that enables the analysis of promoters from RNA-seq data. proActiv uses aligned reads as input, and generates counts and normalized promoter activity estimates for each annotated promoter. In particular, proActiv accepts junction files from TopHat2 or STAR or BAM files as inputs. These estimates can then be used to identify which promoter is active, which promoter is inactive, and which promoters change their activity across conditions. proActiv also allows visualization of promoter activity across conditions.

Maintained by Joseph Lee. Last updated 5 months ago.

rnaseq geneexpression transcription alternativesplicing generegulation differentialsplicing functionalgenomics epigenetics transcriptomics preprocessing alternative-promoters genomics promoter-activity promoter-annotation rna-seq-data

1.8 match 51 stars 6.66 score 15 scripts

bioc

autonomics:Unified Statistical Modeling of Omics Data

This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.

Maintained by Aditya Bhagwat. Last updated 2 months ago.

software dataimport preprocessing dimensionreduction principalcomponent regression differentialexpression genesetenrichment transcriptomics transcription geneexpression rnaseq microarray proteomics metabolomics massspectrometry

2.0 match 5.95 score 5 scripts

bioc

srnadiff:Finding differentially expressed unannotated genomic regions from RNA-seq data

srnadiff is a package that finds differentially expressed regions from RNA-seq data at base-resolution level without relying on existing annotation. To do so, the package implements the identify-then-annotate methodology, which builds on the idea of combining two pipeline approaches: differentially expressed region detection and differential expression quantification. It reads BAM files as input and outputs a list of differentially expressed regions, together with the adjusted p-values.

Maintained by Zytnicki Matthias. Last updated 2 months ago.

immunooncology geneexpression coverage smallrna epigenetics statisticalmethod preprocessing differentialexpression cpp

2.9 match 3.70 score 3 scripts

bioc

DEXSeq:Inference of differential exon usage in RNA-Seq

The package is focused on finding differential exon usage using RNA-seq exon counts between samples with different experimental designs. It provides functions that allow the user to make the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates and generalized linear models for testing. The package also provides functions for the visualization and exploration of the results.

Maintained by Alejandro Reyes. Last updated 16 days ago.

immunooncology sequencing rnaseq differentialexpression alternativesplicing differentialsplicing geneexpression visualization

1.3 match 7.75 score 330 scripts 6 dependents

bioc

metaseqR2:An R package for the analysis and result reporting of RNA-Seq data by combining multiple statistical algorithms

Provides an interface to several normalization and statistical testing packages for RNA-Seq gene expression data. Additionally, it creates several diagnostic plots, performs meta-analysis by combining the results of several statistical tests and reports the results in an interactive way.

Maintained by Panagiotis Moulos. Last updated 5 days ago.

software geneexpression differentialexpression workflowstep preprocessing qualitycontrol normalization reportwriting rnaseq transcription sequencing transcriptomics bayesian clustering cellbiology biomedicalinformatics functionalgenomics systemsbiology immunooncology alternativesplicing differentialsplicing multiplecomparison timecourse dataimport atacseq epigenetics regression proprietaryplatforms genesetenrichment batcheffect chipseq

1.7 match 7 stars 6.05 score 3 scripts

jmzeng1314

AnnoProbe:annotate the gene symbols for probes in expression array

We curated 147 expression arrays from 3 species (human, mouse, rat) and 3 companies (Affymetrix, Illumina, Agilent) by aligning the FASTA sequences of all probes of each platform to the corresponding reference genome and then annotating them to genes.

Maintained by The package maintainer. Last updated 5 years ago.

1.7 match 104 stars 5.82 score 126 scripts

bioc

txcutr:Transcriptome CUTteR

Various mRNA sequencing library preparation methods generate sequencing reads specifically from the transcript ends. Analyses that focus on quantification of isoform usage from such data can be aided by using truncated versions of transcriptome annotations, both at the alignment or pseudo-alignment stage and in downstream analysis. This package implements some convenience methods for readily generating such truncated annotations and their corresponding sequences.

Maintained by Mervin Fansler. Last updated 5 months ago.

alignment annotation rnaseq sequencing transcriptomics

2.3 match 4.30 score 9 scripts

bioc

easyRNASeq:Count summarization and normalization for RNA-Seq data

Calculates the coverage of high-throughput short-reads against a reference genome and summarizes it per feature of interest (e.g. exon, gene, transcript). The data can be normalized as 'RPKM' or by the 'DESeq' or 'edgeR' package.

Maintained by Nicolas Delhomme. Last updated 5 months ago.

geneexpression rnaseq genetics preprocessing immunooncology

1.7 match 5.43 score 15 scripts 1 dependents
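
The 'RPKM' normalization named in the easyRNASeq description is the standard reads per kilobase of feature per million mapped reads. A minimal Python sketch of that formula follows; it is not easyRNASeq code, the function name `rpkm` is hypothetical, and defaulting the library size to the sum of the supplied counts is an assumption, since the listing does not say which total the package uses.

```python
def rpkm(counts, lengths_bp, total_reads=None):
    """Reads per kilobase per million mapped reads for each feature.

    counts      -- read counts per feature (exon, gene, transcript, ...)
    lengths_bp  -- feature lengths in base pairs, same order as counts
    total_reads -- library size; ASSUMPTION: defaults to sum(counts)
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    if total_reads is None:
        total_reads = sum(counts)
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("feature lengths must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return [c * 1e9 / (l * total_reads) for c, l in zip(counts, lengths_bp)]
```

As a worked case, 100 reads on a 1,000 bp feature in a library of 1,000,000 mapped reads gives an RPKM of 100.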

bioc

CeTF:Coexpression for Transcription Factors using Regulatory Impact Factors and Partial Correlation and Information Theory analysis

This package provides the necessary functions for performing the Partial Correlation coefficient with Information Theory (PCIT) (Reverter and Chan 2008) and Regulatory Impact Factors (RIF) (Reverter et al. 2010) algorithms. The PCIT algorithm identifies meaningful correlations to define edges in a weighted network and can be applied to any correlation-based network including but not limited to gene co-expression networks, while the RIF algorithm identifies critical Transcription Factors (TF) from gene expression data. These two algorithms when combined provide a very relevant layer of information for gene expression studies (Microarray, RNA-seq and single-cell RNA-seq data).

Maintained by Carlos Alberto Oliveira de Biagi Junior. Last updated 5 months ago.

sequencing rnaseq microarray geneexpression transcription normalization differentialexpression singlecell network regression chipseq immunooncology coverage cpp

2.0 match 4.30 score 9 scripts

bioc

recoup:An R package for the creation of complex genomic profile plots

recoup calculates and plots signal profiles created from short sequence reads derived from Next Generation Sequencing technologies. The profiles provided are either summarized curve profiles or heatmap profiles. Currently, recoup supports genomic profile plots for reads derived from ChIP-Seq and RNA-Seq experiments. The package uses ggplot2 and ComplexHeatmap graphics facilities for curve and heatmap coverage profiles respectively.

Maintained by Panagiotis Moulos. Last updated 5 months ago.

immunooncology software geneexpression preprocessing qualitycontrol rnaseq chipseq sequencing coverage atacseq chiponchip alignment dataimport

1.7 match 1 stars 5.02 score 2 scripts

bioc

RCAS:RNA Centric Annotation System

RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.

Maintained by Bora Uyar. Last updated 5 months ago.

software genetarget motifannotation motifdiscovery go transcriptomics genomeannotation genesetenrichment coverage

1.3 match 6.32 score 29 scripts 1 dependents

bioc

compEpiTools:Tools for computational epigenomics

Tools for computational epigenomics developed for the analysis, integration and simultaneous visualization of various (epi)genomics data types across multiple genomic regions in multiple samples.

Maintained by Mattia Furlan. Last updated 5 months ago.

geneexpression sequencing visualization genomeannotation coverage

1.8 match 4.30 score 6 scripts

bioc

NoRCE:NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment

While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close by on the genome are often regulated together. This genomic proximity can hint at a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.

Maintained by Gulden Olgun. Last updated 5 months ago.

biologicalquestion differentialexpression genomeannotation genesetenrichment genetarget genomeassembly go

1.7 match 1 stars 4.60 score 6 scripts

bioc

ExCluster:ExCluster robustly detects differentially expressed exons between two conditions of RNA-seq data, requiring at least two independent biological replicates per condition

ExCluster flattens Ensembl and GENCODE GTF files into GFF files, which are used to count reads per non-overlapping exon bin from BAM files. This read counting is done using the function featureCounts from the package Rsubread. Library sizes are normalized across all biological replicates, and ExCluster then compares two different conditions to detect significantly differentially spliced genes. This process requires at least two independent biological replicates per condition, and ExCluster accepts exactly two conditions at a time. ExCluster ultimately produces false discovery rates (FDRs) per gene, which are used to detect significance. Exon log2 fold change (log2FC) means and variances may be plotted for each significantly differentially spliced gene, which helps scientists develop hypotheses and target differential splicing events for RT-qPCR validation in the wet lab.

Maintained by R. Matthew Tanner. Last updated 5 months ago.

immunooncology differentialsplicing rnaseq software

2.3 match 3.30 score 1 scripts
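
The "flattening" step named in the ExCluster entry, which turns overlapping exons from several transcripts of one gene into non-overlapping exon bins, can be illustrated with a short Python sketch. This shows the general technique only; it is not ExCluster's code, and the function name `flatten_exons`, the 1-based inclusive coordinates, and the example intervals are assumptions chosen for illustration.

```python
def flatten_exons(exons):
    """Split possibly overlapping exons into disjoint bins.

    exons: list of (start, end) tuples, 1-based inclusive, from all
    transcripts of one gene on one strand.
    Returns sorted, non-overlapping (start, end) bins that together cover
    exactly the bases covered by at least one exon.
    """
    # Every exon start, and every position just past an exon end, is a
    # point where the set of overlapping exons can change.
    points = sorted({s for s, _ in exons} | {e + 1 for _, e in exons})
    bins = []
    for left, right in zip(points, points[1:]):
        # Candidate bin is [left, right - 1]; keep it only if some exon
        # fully covers it (otherwise it is intronic or intergenic).
        if any(s <= left and right - 1 <= e for s, e in exons):
            bins.append((left, right - 1))
    return bins


# Two overlapping exons from different isoforms plus one separate exon.
print(flatten_exons([(100, 200), (150, 250), (400, 500)]))
```

Because each read-counting bin then belongs to a fixed set of isoforms, counts per bin can be compared between conditions without double counting overlapping exons.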

bioc

sitadela:An R package for the easy provision of simple but complete tab-delimited genomic annotation from a variety of sources and organisms

Provides an interface to build a unified database of genomic annotations and their coordinates (gene, transcript and exon levels). It is intended for use when simple tab-delimited annotations (or simple GRanges objects) are required instead of the more complex annotation Bioconductor packages. Also useful when combinatorial annotation elements are required, such as RefSeq coordinates with Ensembl biotypes. Finally, it can download, construct and handle annotations with versioned genes and transcripts (where available, e.g. RefSeq and latest Ensembl). This is particularly useful in precision medicine applications where the latter must be reported.

Maintained by Panagiotis Moulos. Last updated 5 months ago.

software workflowstep rnaseq transcription sequencing transcriptomics biomedicalinformatics functionalgenomics systemsbiology alternativesplicing dataimport chipseq

1.7 match 4.60 score 2 scripts

bioc

SpliceWiz:interactive analysis and visualization of alternative splicing in R

The analysis and visualization of alternative splicing (AS) events from RNA sequencing data remains challenging. SpliceWiz is a user-friendly and performance-optimized R package for AS analysis. It processes alignment BAM files to quantify read counts across splice junctions, performs IRFinder-based intron retention quantitation, and supports novel splicing event identification. We introduce a novel visualization for AS using normalized coverage, thereby allowing visualization of differential AS across conditions. SpliceWiz features a shiny-based GUI facilitating interactive data exploration of results including gene ontology enrichment. It is performance optimized with multi-threaded processing of BAM files and a new COV file format for fast recall of sequencing coverage. Overall, SpliceWiz streamlines AS analysis, enabling reliable identification of functionally relevant AS events for further characterization.

Maintained by Alex Chit Hei Wong. Last updated 4 days ago.

software transcriptomics rnaseq alternativesplicing coverage differentialsplicing differentialexpression gui sequencing cpp openmp

1.1 match 16 stars 6.41 score 8 scripts

core-bioinformatics

noisyr:Noise Quantification in High Throughput Sequencing Output

Quantifies and removes technical noise from high-throughput sequencing data. Two approaches are used, one based on the count matrix, and one using the alignment BAM files directly. Contains several options for every step of the process, as well as tools to quality check and assess the stability of output.

Maintained by Ilias Moutsopoulos. Last updated 3 years ago.

1.7 match 9 stars 4.13 score 5 scripts 1 dependents

li081766

shinyWGD:'Shiny' Application for Whole Genome Duplication Analysis

Provides a comprehensive 'Shiny' application for analyzing Whole Genome Duplication ('WGD') events. This package provides a user-friendly 'Shiny' web application for non-experienced researchers to prepare input data and execute command lines for several well-known 'WGD' analysis tools, including 'wgd', 'ksrates', 'i-ADHoRe', 'OrthoFinder', and 'Whale'. This package also provides the source code for experienced researchers to adjust and install the package to their own server. Key Features: 1) Input Data Preparation: This package allows users to conveniently upload and format their data, making it compatible with various 'WGD' analysis tools. 2) Command Line Generation: This package automatically generates the necessary command lines for selected 'WGD' analysis tools, reducing manual errors and saving time. 3) Visualization: This package offers interactive visualizations to explore and interpret 'WGD' results, facilitating in-depth 'WGD' analysis. 4) Comparative Genomics: Users can study and compare 'WGD' events across different species, aiding in evolutionary and comparative genomics studies. 5) User-Friendly Interface: This 'Shiny' web application provides an intuitive and accessible interface, making 'WGD' analysis accessible to researchers and 'bioinformaticians' of all levels.

Maintained by Jia Li. Last updated 4 months ago.

1.8 match 3 stars 3.95 score 3 scripts

fmicompbio

swissknife:Handy code shared in the FMI CompBio group

A collection of useful R functions performing various tasks that might be re-usable and worth sharing.

Maintained by Michael Stadler. Last updated 2 months ago.

cpp

1.8 match 8 stars 3.76 score 12 scripts

bioc

ProteoDisco:Generation of customized protein variant databases from genomic variants, splice-junctions and manual sequences

ProteoDisco is an R package to facilitate proteogenomics studies. It houses functions to create customized (variant) protein databases based on user-submitted genomic variants, splice-junctions, fusion genes and manual transcript sequences. The flexible workflow can be adopted to suit a myriad of research and experimental settings.

Maintained by Job van Riet. Last updated 5 months ago.

software proteomics rnaseq snp sequencing variantannotation dataimport

1.3 match 5 stars 5.30 score 4 scripts

bioc

rGenomeTracks:Integrated visualization of epigenomic data

The rGenomeTracks package combines the power of the pyGenomeTracks software with the interactivity of R. pyGenomeTracks is Python software that offers a robust method for visualizing epigenetic data files such as narrowPeak, Hi-C matrices, TADs and arcs; however, there is currently no way to use it within an interactive R session. rGenomeTracks wraps the whole functionality of pyGenomeTracks and adds utilities to make it more pleasant for R users.

Maintained by Omar Elashkar. Last updated 5 months ago.

software hic visualization

2.0 match 3.30 score 2 scripts

bioc

APAlyzer:A toolkit for APA analysis using RNA-seq data

Perform 3'UTR APA, Intronic APA and gene expression analysis using RNA-seq data.

Maintained by Ruijia Wang. Last updated 5 months ago.

sequencing rnaseq differentialexpression geneexpression generegulation annotation dataimport software ative-polyadenylation bioinformatics-tool rna-seq

1.1 match 8 stars 5.81 score 9 scripts

bioc

branchpointer:Prediction of intronic splicing branchpoints

Predicts branchpoint probability for sites in intronic branchpoint windows. Queries can be supplied as intronic regions or, to evaluate the effects of mutations, as SNPs.

Maintained by Beth Signal. Last updated 5 months ago.

software genomeannotation genomicvariation motifannotation

1.8 match 3.62 score 21 scripts

bioc

metagene2:A package to produce metagene plots

This package produces metagene plots to compare coverages of sequencing experiments at selected groups of genomic regions. It can be used for such analyses as assessing the binding of DNA-interacting proteins at promoter regions or surveying antisense transcription over the length of a gene. The metagene2 package can manage all aspects of the analysis, from normalization of coverages to plot faceting according to experimental metadata. Bootstrapping analysis is used to provide confidence intervals of per-sample mean coverages.

Maintained by Eric Fournier. Last updated 5 months ago.

chipseq genetics multiplecomparison coverage alignment sequencing

1.1 match 4 stars 5.45 score 8 scripts
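
The metagene2 entry mentions bootstrapping to obtain confidence intervals for mean coverage. A minimal Python sketch of a percentile bootstrap for a mean is given here as a generic illustration; it is not metagene2's implementation, and the function name `bootstrap_mean_ci`, its parameters, and the example coverage values are assumptions.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    values: per-region coverage values at one position (hypothetical input).
    Returns (lower, upper) bounds of the (1 - alpha) interval.
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(values)
    # Resample the regions with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return means[lo_idx], means[hi_idx]


coverage = [3.0, 5.0, 4.0, 8.0, 6.0, 7.0, 2.0, 5.0, 9.0, 4.0]
lower, upper = bootstrap_mean_ci(coverage)
print(f"mean={statistics.fmean(coverage):.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

In a metagene plot this calculation would be repeated at every position along the region, giving a ribbon around the mean coverage curve.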

bioc

DegNorm:DegNorm: degradation normalization for RNA-seq data

This package performs degradation normalization in bulk RNA-seq data to improve differential expression analysis accuracy.

Maintained by Ji-Ping Wang. Last updated 5 months ago.

rnaseq normalization geneexpression alignment coverage differentialexpression batcheffect software sequencing immunooncology qualitycontrol dataimport openblas cpp openmp

1.2 match 1 stars 5.20 score 3 scripts

bioc

SCAN.UPC:Single-channel array normalization (SCAN) and Universal exPression Codes (UPC)

SCAN is a microarray normalization method to facilitate personalized-medicine workflows. Rather than processing microarray samples as groups, which can introduce biases and present logistical challenges, SCAN normalizes each sample individually by modeling and removing probe- and array-specific background noise using only data from within each array. SCAN can be applied to one-channel (e.g., Affymetrix) or two-channel (e.g., Agilent) microarrays. The Universal exPression Codes (UPC) method is an extension of SCAN that estimates whether a given gene/transcript is active above background levels in a given sample. The UPC method can be applied to one-channel or two-channel microarrays as well as to RNA-Seq read counts. Because UPC values are represented on the same scale and have an identical interpretation for each platform, they can be used for cross-platform data integration.

Maintained by Stephen R. Piccolo. Last updated 5 months ago.

immunooncology software microarray preprocessing rnaseq twochannel onechannel

1.7 match 3.48 score 15 scripts

bioc

TransView:Read density map construction and accession. Visualization of ChIPSeq and RNASeq data sets

This package provides efficient tools to generate, access and display read densities of sequencing based data sets such as from RNA-Seq and ChIP-Seq.

Maintained by Julius Muller. Last updated 2 months ago.

immunooncology dnamethylation geneexpression transcription microarray sequencing chipseq rnaseq methylseq dataimport visualization clustering multiplecomparison curl bzip2 xz-utils zlib

2.0 match 2.60 score

cran

MARVEL:Revealing Splicing Dynamics at Single-Cell Resolution

Alternative splicing represents an additional and underappreciated layer of complexity underlying gene expression profiles. Nevertheless, there remains hitherto a paucity of software to investigate splicing dynamics at single-cell resolution. 'MARVEL' enables splicing analysis of single-cell RNA-sequencing data generated from plate- and droplet-based library preparation methods.

Maintained by Sean Wen. Last updated 2 years ago.

1.8 match 2.71 score 51 scripts

totajuliusd

enshuman:Human Gene Annotation Data from 'Ensembl'

Gene information from 'Ensembl' genome builds 'GRCh38.p14' and 'GRCh37.p13' to use with the 'topr' package. The datasets were originally downloaded from <https://ftp.ensembl.org/pub/current/gtf/homo_sapiens/Homo_sapiens.GRCh38.111.gtf.gz> and <https://ftp.ensembl.org/pub/grch37/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.87.gtf.gz> and converted into the format required by the 'topr' package. See <https://github.com/totajuliusd/topr?tab=readme-ov-file#how-to-use-topr-with-other-species-than-human> to see the required format.

Maintained by Thorhildur Juliusdottir. Last updated 1 year ago.

1.0 match 3.18 score 1 scripts 1 dependents

hanjunwei-lab

pathwayTMB:Pathway Based Tumor Mutational Burden

A systematic bioinformatics tool to develop a new pathway-based gene panel for tumor mutational burden (TMB) assessment (pathway-based tumor mutational burden, PTMB). It efficiently uses somatic mutation files from either The Cancer Genome Atlas or any in-house study, as long as the data is in mutation annotation file (MAF) format. In addition, we develop a multiple machine learning method that uses the samples' PTMB profiles to identify cancer-specific dysfunction pathways, which can serve as prognostic and predictive biomarkers for cancer immunotherapy.

Maintained by Junwei Han. Last updated 3 years ago.

1.1 match 2.48 score 2 scripts 1 dependents

bioc

epistack:Heatmaps of Stack Profiles from Epigenetic Signals

The main objective of the epistack package is the visualization of stacks of genomic tracks (such as, but not restricted to, ChIP-seq, ATAC-seq, DNA methylation or genomic conservation data) centered at genomic regions of interest. epistack needs three different inputs: 1) a genomic score object, such as ChIP-seq coverage or DNA methylation values, provided as a `GRanges` (easily obtained from `bigwig` or `bam` files). 2) a list of features of interest, such as peaks or transcription start sites, provided as a `GRanges` (easily obtained from `gtf` or `bed` files). 3) a score to sort the features, such as peak height or gene expression value.

Maintained by DEVAILLY Guillaume. Last updated 5 months ago.

rnaseq preprocessing chipseq geneexpression coverage bioinformatics

0.5 match 6 stars 5.26 score 5 scripts

bioc

proBAMr:Generating SAM file for PSMs in shotgun proteomics data

Maps PSMs back to the genome. The package builds a SAM file from shotgun proteomics data. It also provides a function to prepare annotation from a GTF file.

Maintained by Xiaojing Wang. Last updated 5 months ago.

immunooncology proteomics massspectrometry software visualization

0.5 match 3.48 score 1 scripts