R-universe search: topic:genomeannotation

Currently serving 26341 packages, 22657 articles, and 64224 datasets by 1265 organizations, 13662 maintainers and 22191 contributors.

Showing 52 of total 52 results (show query)

bioc

GenomicRanges:Representation and manipulation of genomic intervals

The ability to efficiently represent and manipulate genomic annotations and alignments plays a central role in analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
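
To make the idea of manipulating genomic intervals concrete, here is a minimal Python sketch of overlap detection between two sets of intervals. It is an illustration only and not the GenomicRanges API: the `Interval` type and the `find_overlaps` function are hypothetical names, and 1-based closed coordinates are assumed.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A genomic interval (hypothetical type; 1-based, both ends inclusive)."""
    chrom: str
    start: int
    end: int


def find_overlaps(query: List[Interval], subject: List[Interval]) -> List[Tuple[int, int]]:
    """Return (query_index, subject_index) pairs for intervals that overlap.

    Naive O(n * m) scan, chosen for clarity rather than speed.
    """
    hits = []
    for i, q in enumerate(query):
        for j, s in enumerate(subject):
            # Two closed intervals overlap when each starts before the other ends.
            if q.chrom == s.chrom and q.start <= s.end and s.start <= q.end:
                hits.append((i, j))
    return hits


query = [Interval("chr1", 100, 200), Interval("chr2", 50, 60)]
subject = [
    Interval("chr1", 150, 300),
    Interval("chr1", 201, 250),
    Interval("chr2", 1, 50),
]
hits = find_overlaps(query, subject)
print(hits)
```

A production implementation would typically use an indexed structure such as an interval tree instead of the nested loop shown here.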

Maintained by Hervé Pagès. Last updated 4 months ago.

genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package

44 stars 17.68 score 13k scripts 1.3k dependents

bioc

SummarizedExperiment:A container (S4 class) for matrix-like assays

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Maintained by Hervé Pagès. Last updated 5 months ago.

genetics infrastructure sequencing annotation coverage genomeannotation bioconductor-package core-package

34 stars 16.84 score 8.6k scripts 1.2k dependents

bioc

GenomeInfoDb:Utilities for manipulating chromosome names, including modifying them to follow a particular naming style

Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.
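
The difference between natural and lexicographic ordering of sequence names can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the package's own logic: `natural_key` and `to_chr_prefixed` are hypothetical helpers, and only numbered chromosomes plus X, Y and M/MT are given special treatment.

```python
def natural_key(seqname):
    """Sort key placing numbered chromosomes first (by number), then X, Y, M/MT."""
    name = seqname[3:] if seqname.lower().startswith("chr") else seqname
    if name.isdigit():
        return (0, int(name), "")
    special = {"X": 1, "Y": 2, "M": 3, "MT": 3}
    if name.upper() in special:
        return (1, special[name.upper()], "")
    # Anything else (scaffolds, alternate contigs) goes last, alphabetically.
    return (2, 0, name)


def to_chr_prefixed(seqname):
    """Translate a bare name such as "1" or "MT" to the "chr"-prefixed convention."""
    if seqname.lower().startswith("chr"):
        return seqname
    return "chrM" if seqname == "MT" else "chr" + seqname


names = ["chr10", "chr2", "chrX", "chr1", "chrM", "chrY"]
lexicographic = sorted(names)
natural = sorted(names, key=natural_key)
print(lexicographic)  # chr10 sorts before chr2
print(natural)        # chr2 sorts before chr10
```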

Maintained by Hervé Pagès. Last updated 2 months ago.

genetics datarepresentation annotation genomeannotation bioconductor-package core-package

32 stars 16.32 score 1.3k scripts 1.7k dependents

bioc

DelayedArray:A unified framework for working transparently with on-disk and in-memory array-like datasets

Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.
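
The combination of delayed operations and block processing described above can be sketched in Python as follows. This is a conceptual toy, not the DelayedArray implementation: the `DelayedVector` class, its `loader` callback, and the method names are all hypothetical, and a one-dimensional vector stands in for a general array.

```python
class DelayedVector:
    """Toy one-dimensional container that records operations instead of running them."""

    def __init__(self, loader, length, ops=()):
        self._loader = loader    # callable (start, stop) -> list of values
        self._length = length
        self._ops = tuple(ops)   # element-wise functions, applied lazily

    def map(self, fn):
        # Delayed: nothing is loaded or computed here, the operation is only recorded.
        return DelayedVector(self._loader, self._length, self._ops + (fn,))

    def blocks(self, block_size):
        # Block processing: load one chunk at a time and apply the recorded operations.
        for start in range(0, self._length, block_size):
            block = self._loader(start, min(start + block_size, self._length))
            for fn in self._ops:
                block = [fn(v) for v in block]
            yield block

    def block_sum(self, block_size):
        return sum(sum(block) for block in self.blocks(block_size))


data = list(range(10))  # stands in for a dataset stored on disk
delayed = DelayedVector(lambda a, b: data[a:b], len(data))
delayed = delayed.map(lambda v: v * 2).map(lambda v: v + 1)
total = delayed.block_sum(block_size=4)
print(total)
```

Only `block_sum` touches the data, and it never holds more than `block_size` elements at once, which is the memory-saving property the description refers to.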

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package u24ca289073

27 stars 15.59 score 538 scripts 1.2k dependents

bioc

GenomicFeatures:Query the gene models of a given organism/assembly

Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.

Maintained by H. Pagès. Last updated 5 months ago.

genetics infrastructure annotation sequencing genomeannotation bioconductor-package core-package

26 stars 15.34 score 5.3k scripts 339 dependents

bioc

AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor

Implements a user-friendly interface for querying SQLite-based annotation data packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation microarray sequencing genomeannotation bioconductor-package core-package

9 stars 15.05 score 3.6k scripts 769 dependents

bioc

HDF5Array:HDF5 datasets as array-like objects in R

The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.

Maintained by Hervé Pagès. Last updated 10 days ago.

infrastructure datarepresentation dataimport sequencing rnaseq coverage annotation genomeannotation singlecell immunooncology bioconductor-package core-package u24ca289073

12 stars 13.20 score 844 scripts 126 dependents

bioc

EnrichedHeatmap:Making Enriched Heatmaps

An enriched heatmap is a special type of heatmap that visualizes the enrichment of genomic signals on specific target regions. Here we implement the enriched heatmap with the ComplexHeatmap package. Since this type of heatmap is just a normal heatmap with some special settings, the functionality of ComplexHeatmap makes it much easier to customize the heatmap and to concatenate it to a list of heatmaps to show correspondence between different data sources.

Maintained by Zuguang Gu. Last updated 5 months ago.

software visualization sequencing genomeannotation coverage cpp

190 stars 10.87 score 330 scripts 1 dependent

bioc

tximeta:Transcript Quantification Import with Automatic Metadata

Transcript quantification import from Salmon and other quantifiers with automatic attachment of transcript ranges and release information, and other associated metadata. De novo transcriptomes can be linked to the appropriate sources with linkedTxomes and shared for computational reproducibility.

Maintained by Michael Love. Last updated 2 months ago.

annotation genomeannotation dataimport preprocessing rnaseq singlecell transcriptomics transcription geneexpression functionalgenomics reproducibleresearch reportwriting immunooncology

67 stars 10.58 score 466 scripts 1 dependent

bioc

plotgardener:Coordinate-Based Genomic Visualization Package for R

Coordinate-based genomic visualization package for R. It grants users the ability to programmatically produce complex, multi-paneled figures. Tailored for genomics, plotgardener allows users to visualize large complex genomic datasets and provides exquisite control over how plots are placed and arranged on a page.

Maintained by Nicole Kramer. Last updated 5 months ago.

visualization genomeannotation functionalgenomics genomeassembly hic cpp

309 stars 10.17 score 167 scripts 3 dependents

bioc

QDNAseq:Quantitative DNA Sequencing for Chromosomal Aberrations

Quantitative DNA sequencing for chromosomal aberrations. The genome is divided into non-overlapping fixed-sized bins; the number of sequence reads in each bin is counted, adjusted with a simultaneous two-dimensional loess correction for sequence mappability and GC content, and filtered to remove spurious regions in the genome. Downstream steps of segmentation and calling are also implemented via packages DNAcopy and CGHcall, respectively.
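
The first step in that description, counting reads in non-overlapping fixed-size bins, can be sketched in Python as below. This is only an illustration and not the QDNAseq code: the function name and the 0-based coordinate convention are assumptions, and the loess correction and filtering steps are left out.

```python
def count_reads_in_bins(read_starts, chrom_length, bin_size):
    """Count reads per fixed-size bin along one chromosome.

    read_starts: 0-based start positions of aligned reads (assumed convention).
    Returns one count per bin; the last bin may be shorter than bin_size.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_starts:
        if 0 <= pos < chrom_length:        # ignore positions outside the chromosome
            counts[pos // bin_size] += 1
    return counts


counts = count_reads_in_bins(
    read_starts=[5, 12, 999, 1000, 2500],
    chrom_length=3000,
    bin_size=1000,
)
print(counts)
```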

Maintained by Daoud Sie. Last updated 5 months ago.

copynumbervariation dnaseq genetics genomeannotation preprocessing qualitycontrol sequencing

49 stars 10.10 score 177 scripts 4 dependents

bioc

UCSC.utils:Low-level utilities to retrieve data from the UCSC Genome Browser

A set of low-level utilities to retrieve data from the UCSC Genome Browser. Most functions in the package access the data via the UCSC REST API but some of them query the UCSC MySQL server directly. Note that the primary purpose of the package is to support higher-level functionalities implemented in downstream packages like GenomeInfoDb or txdbmaker.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure genomeassembly annotation genomeannotation dataimport bioconductor-package core-package

1 star 10.09 score 4 scripts 1.7k dependents

bioc

rGREAT:GREAT Analysis - Functional Enrichment on Genomic Regions

GREAT (Genomic Regions Enrichment of Annotations Tool) is a type of functional enrichment analysis directly performed on genomic regions. This package implements the GREAT algorithm (the local GREAT analysis) and also supports directly interacting with the GREAT web service (the online GREAT analysis). Both analyses can be viewed in a Shiny application. rGREAT by default supports more than 600 organisms and a large number of gene set collections, as well as self-provided gene sets and organisms from users. Additionally, it implements a general method for dealing with background regions.

Maintained by Zuguang Gu. Last updated 16 days ago.

genesetenrichment go pathways software sequencing wholegenome genomeannotation coverage cpp

86 stars 9.96 score 320 scripts 1 dependent

bioc

annotatr:Annotation of Genomic Regions to Genomic Annotations

Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers. The annotatr package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations.

Maintained by Raymond G. Cavalcante. Last updated 5 months ago.

software annotation genomeannotation functionalgenomics visualization genome-annotation

26 stars 9.76 score 246 scripts 5 dependents

bioc

txdbmaker:Tools for making TxDb objects from genomic annotations

A set of tools for making TxDb objects from genomic annotations from various sources (e.g. UCSC, Ensembl, and GFF files). These tools allow the user to download the genomic locations of transcripts, exons, and CDS, for a given assembly, and to import them in a TxDb object. TxDb objects are implemented in the GenomicFeatures package, together with flexible methods for extracting the desired features in convenient formats.

Maintained by H. Pagès. Last updated 4 months ago.

infrastructure dataimport annotation genomeannotation genomeassembly genetics sequencing bioconductor-package core-package

3 stars 9.68 score 92 scripts 87 dependents

bioc

LOLA:Locus overlap analysis for enrichment of genomic ranges

Provides functions for testing overlap of sets of genomic regions with public and custom region set (genomic ranges) databases. This makes it possible to do automated enrichment analysis for genomic region sets, thus facilitating interpretation of functional genomics and epigenomics data.

Maintained by Nathan Sheffield. Last updated 5 months ago.

genesetenrichment generegulation genomeannotation systemsbiology functionalgenomics chipseq methylseq sequencing

76 stars 9.34 score 160 scripts

bioc

Rsubread:Mapping, quantification and variant analysis of sequencing data

Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES etc). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.

Maintained by Wei Shi. Last updated 9 days ago.

sequencing alignment sequencematching rnaseq chipseq singlecell geneexpression generegulation genetics immunooncology snp geneticvariability preprocessing qualitycontrol genomeannotation genefusiondetection indeldetection variantannotation variantdetection multiplesequencealignment zlib

9.24 score 892 scripts 10 dependents

bioc

bambu:Context-Aware Transcript Quantification from Long Read RNA-Seq data

bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can directly be used for visualisation and downstream analysis such as differential gene expression or transcript usage.

Maintained by Ying Chen. Last updated 2 months ago.

alignment coverage differentialexpression featureextraction geneexpression genomeannotation genomeassembly immunooncology longread multiplecomparison normalization rnaseq regression sequencing software transcription transcriptomics bambu bioconductor long-reads nanopore nanopore-sequencing rna-seq rna-seq-analysis transcript-quantification transcript-reconstruction cpp

203 stars 9.04 score 91 scripts 1 dependent

bioc

scPipe:Pipeline for single cell multi-omic data pre-processing

A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.

Maintained by Shian Su. Last updated 3 months ago.

immunooncology software sequencing rnaseq geneexpression singlecell visualization sequencematching preprocessing qualitycontrol genomeannotation dataimport curl bzip2 xz-utils zlib cpp

68 stars 9.02 score 84 scripts

bioc

nullranges:Generation of null ranges via bootstrapping or covariate matching

Modular package for generation of sets of ranges representing the null hypothesis. These can take the form of bootstrap samples of ranges (using the block bootstrap framework of Bickel et al 2010), or sets of control ranges that are matched across one or more covariates. nullranges is designed to be inter-operable with other packages for analysis of genomic overlap enrichment, including the plyranges Bioconductor package.

Maintained by Michael Love. Last updated 5 months ago.

visualization genesetenrichment functionalgenomics epigenetics generegulation genetarget genomeannotation annotation genomewideassociation histonemodification chipseq atacseq dnaseseq rnaseq hiddenmarkovmodel bioconductor bootstrap genomics matching statistics

27 stars 8.16 score 50 scripts 1 dependent

bioc

MIRA:Methylation-Based Inference of Regulatory Activity

DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for a given region set with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for the region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of region sets to be leveraged for novel insight into the regulatory state of DNA methylation datasets.

Maintained by John Lawson. Last updated 5 months ago.

immunooncology dnamethylation generegulation genomeannotation systemsbiology functionalgenomics chipseq methylseq sequencing epigenetics coverage

12 stars 7.56 score 7 scripts 1 dependent

bioc

HilbertCurve:Making 2D Hilbert Curve

A Hilbert curve is a type of space-filling curve that folds a one-dimensional axis into a two-dimensional space while still preserving locality. This package aims to provide an easy and flexible way to visualize data through a Hilbert curve.
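
How a one-dimensional position is folded into two dimensions can be shown with the widely used iterative index-to-coordinate conversion for the Hilbert curve. The Python sketch below is an illustration and not the HilbertCurve package's code: the function name `hilbert_d2xy` is hypothetical, and the grid side is assumed to be a power of two.

```python
def hilbert_d2xy(side, d):
    """Map position d along the curve to (x, y) on a side-by-side grid.

    side must be a power of two; d ranges from 0 to side * side - 1.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the current sub-square so consecutive pieces connect.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# Order-1 curve on a 2 x 2 grid.
first_order = [hilbert_d2xy(2, d) for d in range(4)]
print(first_order)

# Locality: neighbouring positions on the axis stay neighbours on the grid.
path = [hilbert_d2xy(8, d) for d in range(64)]
steps = [abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in zip(path, path[1:])]
print(set(steps))
```

Every step between consecutive positions moves exactly one grid cell, which is the locality property the description mentions.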

Maintained by Zuguang Gu. Last updated 5 months ago.

software visualization sequencing coverage genomeannotation cpp

42 stars 7.45 score 48 scripts

bioc

GenomicDistributions:GenomicDistributions: fast analysis of genomic intervals with Bioconductor

If you have a set of genomic ranges, this package can help you with visualization and comparison. It produces several kinds of plots, for example: chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, like Transcription Start Sites (TSSs); genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.

Maintained by Kristyna Kupkova. Last updated 5 months ago.

software genomeannotation genomeassembly datarepresentation sequencing coverage functionalgenomics visualization

26 stars 7.44 score 25 scripts

bioc

COCOA:Coordinate Covariation Analysis

COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.

Maintained by John Lawson. Last updated 5 months ago.

epigenetics dnamethylation atacseq dnaseseq methylseq methylationarray principalcomponent genomicvariation generegulation genomeannotation systemsbiology functionalgenomics chipseq sequencing immunooncology dna-methylation pca

10 stars 7.02 score 21 scripts

bioc

Organism.dplyr:dplyr-based Access to Bioconductor Annotation Resources

This package provides an alternative interface to Bioconductor 'annotation' resources, in particular the gene identifier mapping functionality of the 'org' packages (e.g., org.Hs.eg.db) and the genome coordinate functionality of the 'TxDb' packages (e.g., TxDb.Hsapiens.UCSC.hg38.knownGene).

Maintained by Martin Morgan. Last updated 8 days ago.

annotation sequencing genomeannotation bioconductor-package core-package

3 stars 6.90 score 63 scripts 1 dependent

bioc

RCAS:RNA Centric Annotation System

RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.

Maintained by Bora Uyar. Last updated 5 months ago.

software genetarget motifannotation motifdiscovery go transcriptomics genomeannotation genesetenrichment coverage

6.32 score 29 scripts 1 dependent

bioc

plyxp:Data masks for SummarizedExperiment enabling dplyr-like manipulation

The package provides `rlang` data masks for the SummarizedExperiment class. This enables the evaluation of unquoted expressions in different contexts of the SummarizedExperiment object with optional access to other contexts. The goal for `plyxp` is for evaluation to feel like a data.frame object without ever needing to unwind to a rectangular data.frame.

Maintained by Justin Landis. Last updated 12 days ago.

annotation genomeannotation transcriptomics

4 stars 5.88 score 6 scripts

bioc

atSNP:Affinity test for identifying regulatory SNPs

atSNP performs affinity tests of motif matches with the SNP or the reference genomes and SNP-led changes in motif matches.

Maintained by Sunyoung Shin. Last updated 5 months ago.

software chipseq genomeannotation motifannotation visualization cpp

1 star 5.73 score 36 scripts

bioc

PICB:piRNA Cluster Builder

piRNAs (short for PIWI-interacting RNAs) and their PIWI protein partners play a key role in fertility and maintaining genome integrity by restricting mobile genetic elements (transposons) in germ cells. piRNAs originate from genomic regions known as piRNA clusters. The piRNA Cluster Builder (PICB) is a versatile toolkit designed to identify genomic regions with a high density of piRNAs. It constructs piRNA clusters through a stepwise integration of unique and multimapping piRNAs and offers wide-ranging parameter settings, supported by an optimization function that allows users to test different parameter combinations to tailor the analysis to their specific piRNA system. The output includes extensive metadata columns, enabling researchers to rank clusters and extract cluster characteristics.

Maintained by Franziska Ahrend. Last updated 2 months ago.

genetics genomeannotation sequencing functionalprediction coverage transcriptomics

5 stars 5.57 score

bioc

ASpli:Analysis of Alternative Splicing Using RNA-Seq

Integrative pipeline for the analysis of alternative splicing using RNAseq.

Maintained by Ariel Chernomoretz. Last updated 5 months ago.

immunooncology geneexpression transcription alternativesplicing coverage differentialexpression differentialsplicing timecourse rnaseq genomeannotation sequencing alignment

5.21 score 45 scripts 1 dependent

bioc

MEDIPS:DNA IP-seq data analysis

MEDIPS was developed for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, MEDIPS provides functionalities for the analysis of any kind of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential coverage between groups of samples and saturation and correlation analysis.

Maintained by Lukas Chavez. Last updated 5 months ago.

dnamethylation cpgisland differentialexpression sequencing chipseq preprocessing qualitycontrol visualization microarray genetics coverage genomeannotation copynumbervariation sequencematching

5.17 score 74 scripts

bioc

VariantExperiment:A RangedSummarizedExperiment Container for VCF/GDS Data with GDS Backend

VariantExperiment is a Bioconductor package for saving data in VCF/GDS format into a RangedSummarizedExperiment object. The high-throughput genetic/genomic data are saved in GDSArray objects. The annotation data for features/samples are saved in DelayedDataFrame format with a mono-dimensional GDSArray in each column. The on-disk representation of both assay data and annotation data enables on-disk reading and processing and saves memory space significantly. The interface of the RangedSummarizedExperiment data format enables easy and common manipulations of high-throughput genetic/genomic data with the common SummarizedExperiment metaphor in R and Bioconductor.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation sequencing annotation genomeannotation genotypingarray

1 star 5.00 score 2 scripts

bioc

GreyListChIP:Grey Lists -- Mask Artefact Regions Based on ChIP Inputs

Identify regions of ChIP experiments with high signal in the input, that lead to spurious peaks during peak calling. Remove reads aligning to these regions prior to peak calling, for cleaner ChIP analysis.

Maintained by Matt Eldridge. Last updated 5 months ago.

chipseq alignment preprocessing differentialpeakcalling sequencing genomeannotation coverage

4.93 score 10 scripts 4 dependents

bioc

BSgenomeForge:Forge your own BSgenome data package

A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure datarepresentation genomeassembly annotation genomeannotation sequencing alignment dataimport sequencematching bioconductor-package core-package

4 stars 4.90 score 6 scripts

bioc

branchpointer:Prediction of intronic splicing branchpoints

Predicts branchpoint probability for sites in intronic branchpoint windows. Queries can be supplied as intronic regions or, to evaluate the effects of mutations, as SNPs.

Maintained by Beth Signal. Last updated 5 months ago.

software genomeannotation genomicvariation motifannotation

4.62 score 21 scripts

bioc

NoRCE:NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment

While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close by on the genome are often regulated together. This genomic proximity on the sequence can hint at a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, the users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.

Maintained by Gulden Olgun. Last updated 5 months ago.

biologicalquestion differentialexpression genomeannotation genesetenrichment genetarget genomeassembly go

1 star 4.60 score 6 scripts

bioc

GrafGen:Classification of Helicobacter Pylori Genomes

Classifies Helicobacter pylori genomes according to genetic distance from nine reference populations. The nine reference populations are hpgpAfrica, hpgpAfrica-distant, hpgpAfroamerica, hpgpEuroamerica, hpgpMediterranea, hpgpEurope, hpgpEurasia, hpgpAsia, and hpgpAklavik86-like. The vertex populations are Africa, Europe and Asia.

Maintained by William Wheeler. Last updated 2 months ago.

genetics software genomeannotation classification cpp

4.48 score 2 scripts

bioc

BREW3R.r:R package associated to BREW3R

This R package provides functions that are used in the BREW3R workflow. It mainly contains a function that extends a GTF, represented as GRanges, using information from another GTF (also as GRanges). The process makes it possible to extend gene annotation without increasing the overlap between gene IDs.

Maintained by Lucille Lopez-Delisle. Last updated 5 months ago.

genomeannotation

4.30 score 6 scripts

bioc

compEpiTools:Tools for computational epigenomics

Tools for computational epigenomics developed for the analysis, integration and simultaneous visualization of various (epi)genomics data types across multiple genomic regions in multiple samples.

Maintained by Mattia Furlan. Last updated 5 months ago.

geneexpression sequencing visualization genomeannotation coverage

4.30 score 6 scripts

rfael0cm

RTIGER:HMM-Based Model for Genotyping and Cross-Over Identification

Our method integrates information from all sequenced samples, thus avoiding loss of alleles due to low coverage. Moreover, it increases the statistical power to uncover sequencing or alignment errors <doi:10.1093/plphys/kiad191>.

Maintained by Rafael Campos-Martin. Last updated 1 year ago.

genomeannotation hiddenmarkovmodel sequencing

4 stars 4.30 score 5 scripts

bioc

SMITE:Significance-based Modules Integrating the Transcriptome and Epigenome

This package builds on the Epimods framework, which facilitates finding weighted subnetworks ("modules") on Illumina Infinium 27k arrays using the SpinGlass algorithm, as implemented in the igraph package. We have created a class of gene-centric annotations associated with p-values, effect sizes, and scores from any researcher's prior statistical results to find functional modules.

Maintained by Neil Ari Wijetunga. Last updated 5 months ago.

immunooncology differentialmethylation differentialexpression systemsbiology networkenrichment genomeannotation network sequencing rnaseq coverage

1 star 4.26 score 13 scripts

bioc

HelloRanges:Introduce *Ranges to bedtools users

Translates bedtools command-line invocations to R code calling functions from the Bioconductor *Ranges infrastructure. This is intended to educate novice Bioconductor users and to compare the syntax and semantics of the two frameworks.

Maintained by Michael Lawrence. Last updated 5 months ago.

sequencing annotation coverage genomeannotation dataimport sequencematching variantannotation

4.19 score 26 scripts 1 dependents

bioc

AssessORF:Assess Gene Predictions Using Proteomics and Evolutionary Conservation

In order to assess the quality of a set of predicted genes for a genome, evidence must first be mapped to that genome. Next, each gene must be categorized based on how strong the evidence is for or against that gene. The AssessORF package provides the functions and class structures necessary for accomplishing those tasks, using proteomic hits and evolutionarily conserved start codons as the forms of evidence.

Maintained by Deepank Korandla. Last updated 5 months ago.

comparativegenomics geneprediction genomeannotation genetics proteomics qualitycontrol visualization

4.18 score 3 scripts

bioc

pram:Pooling RNA-seq datasets for assembling transcript models

Publicly available RNA-seq data is routinely used for retrospective analysis to elucidate new biology. Novel transcript discovery enabled by large collections of RNA-seq datasets has emerged as one such analysis. To increase the power of transcript discovery from large collections of RNA-seq datasets, we developed a new R package named Pooling RNA-seq and Assembling Models (PRAM), which builds transcript models in intergenic regions from pooled RNA-seq datasets. This package includes functions for defining intergenic regions, extracting and pooling related RNA-seq alignments, and predicting, selecting, and evaluating transcript models.

Maintained by Peng Liu. Last updated 5 months ago.

software technology sequencing rnaseq biologicalquestion geneprediction genomeannotation researchfield transcriptomics bioconductor-package genome-annotation rna-seq transcript-model

1 star 4.18 score 3 scripts

bioc

planttfhunter:Identification and classification of plant transcription factors

planttfhunter is used to identify plant transcription factors (TFs) from protein sequence data and classify them into families and subfamilies using the classification scheme implemented in PlantTFDB. TFs are identified using pre-built hidden Markov model profiles for DNA-binding domains. Then, auxiliary and forbidden domains are used with DNA-binding domains to classify TFs into families and subfamilies (when applicable). Currently, TFs can be classified in 58 different TF families/subfamilies.

Maintained by Fabrício Almeida-Silva. Last updated 5 months ago.

software transcription functionalprediction genomeannotation functionalgenomics hiddenmarkovmodel sequencing classification functional-genomics gene-families hidden-markov-models plant-genomics plants protein-domains transcription-factors

4.00 score 5 scripts

bioc

geneXtendeR:Optimized Functional Annotation Of ChIP-seq Data

geneXtendeR optimizes the functional annotation of ChIP-seq peaks by exploring relative differences in annotating ChIP-seq peak sets to variable-length gene bodies. In contrast to prior techniques, geneXtendeR considers peak annotations beyond just the closest gene, allowing users to see peak summary statistics for the first-closest gene, second-closest gene, ..., n-closest gene whilst ranking the output according to biologically relevant events and iteratively comparing the fidelity of peak-to-gene overlap across a user-defined range of upstream and downstream extensions on the original boundaries of each gene's coordinates. Since different ChIP-seq peak callers produce different differentially enriched peaks with a large variance in peak length distribution and total peak count, annotating peak lists with their nearest genes can often be a noisy process. As such, the goal of geneXtendeR is to robustly link differentially enriched peaks with their respective genes, thereby aiding experimental follow-up and validation in designing primers for a set of prospective gene candidates during qPCR.

Maintained by Bohdan Khomtchouk. Last updated 5 months ago.

chipseq genetics annotation genomeannotation differentialpeakcalling coverage peakdetection chiponchip histonemodification dataimport naturallanguageprocessing visualization go software bioconductor bioinformatics c chip-seq computational-biology epigenetics functional-annotation

9 stars 3.95 score 5 scripts

bioc

microRNA:Data and functions for dealing with microRNAs

Different data resources for microRNAs and some functions for manipulating them.

Maintained by Michael Lawrence. Last updated 2 months ago.

infrastructure genomeannotation sequencematching cpp

3.48 score 7 scripts

bioc

mCSEA:Methylated CpGs Set Enrichment Analysis

Identification of differentially methylated regions (DMRs) in predefined regions (promoters, CpG islands...) from the human genome using Illumina's 450K or EPIC microarray data. Provides methods to rank CpG probes based on linear models and includes plotting functions.

Maintained by Jordi Martorell-Marugán. Last updated 4 months ago.

immunooncology differentialmethylation dnamethylation epigenetics genetics genomeannotation methylationarray microarray multiplecomparison twochannel

3.38 score 15 scripts

bioc

IWTomics:Interval-Wise Testing for Omics Data

Implementation of the Interval-Wise Testing (IWT) for omics data. This inferential procedure tests for differences in "Omics" data between two groups of genomic regions (or between a group of genomic regions and a reference center of symmetry), and does not require fixing location and scale at the outset.

Maintained by Marzia A Cremona. Last updated 5 months ago.

statisticalmethod multiplecomparison differentialexpression differentialmethylation differentialpeakcalling genomeannotation dataimport

3.30 score 5 scripts

bioc

garfield:GWAS Analysis of Regulatory or Functional Information Enrichment with LD correction

GARFIELD is a non-parametric functional enrichment analysis approach described in the paper GARFIELD: GWAS analysis of regulatory or functional information enrichment with LD correction. Briefly, it is a method that leverages GWAS findings with regulatory or functional annotations (primarily from ENCODE and Roadmap epigenomics data) to find features relevant to a phenotype of interest. It performs greedy pruning of GWAS SNPs (LD r2 > 0.1) and then annotates them based on functional information overlap. Next, it quantifies Fold Enrichment (FE) at various GWAS significance cutoffs and assesses them by permutation testing, while matching for minor allele frequency, distance to nearest transcription start site and number of LD proxies (r2 > 0.8).

Maintained by Valentina Iotchkova. Last updated 5 months ago.

software statisticalmethod annotation functionalprediction genomeannotation cpp

3.30 score 6 scripts
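
The fold-enrichment step named in the GARFIELD description can be illustrated with a small calculation. The Python sketch below is a simplification, not GARFIELD's code: it assumes FE at a cutoff is the annotated fraction among variants passing the cutoff divided by the annotated fraction among all pruned variants, and it omits the LD pruning and the matched permutation test entirely. The function name `fold_enrichment` and the toy p-values are hypothetical.

```python
def fold_enrichment(snps, cutoff):
    """Fold enrichment of annotated variants among those with p < cutoff.

    snps: list of (gwas_pvalue, is_annotated) for already-pruned variants.
    Returns None when the ratio is undefined. Hypothetical helper.
    """
    total = len(snps)
    annotated = sum(1 for _, is_annotated in snps if is_annotated)
    significant = [(p, a) for p, a in snps if p < cutoff]
    if not significant or annotated == 0:
        return None
    sig_annotated = sum(1 for _, is_annotated in significant if is_annotated)
    # Ratio of the annotated fraction among significant variants to the overall fraction.
    return (sig_annotated / len(significant)) / (annotated / total)

# Toy input (made up): eight pruned variants, three of which overlap the annotation.
snps = [
    (1e-9, True), (1e-8, True), (1e-6, False), (1e-3, True),
    (0.2, False), (0.5, False), (0.7, False), (0.9, False),
]

# Evaluate FE at several GWAS significance cutoffs, as the description outlines.
fe_by_cutoff = {cutoff: fold_enrichment(snps, cutoff) for cutoff in (1e-7, 1e-5)}
```

An FE above 1 at a given cutoff only suggests enrichment; judging significance needs a null distribution, which in the description comes from permutations matched on minor allele frequency, distance to the nearest transcription start site, and number of LD proxies.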

bioc

fcScan:fcScan for detecting clusters of coordinates with user defined options

This package is used to detect combinations of genomic coordinates falling within a user-defined window size, along with a user-defined overlap between identified neighboring clusters. It can be used for genomic data where the clusters are built on a specific chromosome or a specific strand. Clustering can be performed with a "greedy" option, thus allowing the presence of additional sites within the allowed window size.

Maintained by Pierre Khoueiry. Last updated 5 months ago.

genomeannotation clustering

3.30 score 1 script
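
Window-based clustering of the kind the fcScan description outlines can be sketched as a sweep over sorted site positions. The Python below is a minimal illustration under stated assumptions, not fcScan's algorithm or interface: it handles one chromosome and strand, reports only non-overlapping clusters, and treats the required combination as minimum counts per site label while tolerating extra sites inside the window. The name `find_clusters` and its parameters are hypothetical.

```python
from collections import Counter

def find_clusters(sites, window, required):
    """Find non-overlapping runs of sites that satisfy a label combination.

    sites: list of (position, label) on one chromosome and strand.
    window: maximum distance in bp between the first and last site of a cluster.
    required: {label: minimum count} that a cluster must contain.
    Returns a list of (start, end, label_counts). Hypothetical helper.
    """
    sites = sorted(sites)
    clusters = []
    i = 0
    while i < len(sites):
        j = i
        # Extend the run while the next site still fits inside the window.
        while j + 1 < len(sites) and sites[j + 1][0] - sites[i][0] <= window:
            j += 1
        counts = Counter(label for _, label in sites[i:j + 1])
        if all(counts[label] >= minimum for label, minimum in required.items()):
            clusters.append((sites[i][0], sites[j][0], dict(counts)))
            i = j + 1  # continue after the cluster so clusters do not overlap
        else:
            i += 1     # slide the window start to the next site
    return clusters

# Toy input (made up): two site types, "a" and "b", on one chromosome.
sites = [(100, "a"), (150, "b"), (180, "a"), (1000, "a"), (1500, "b"), (1520, "a")]
clusters = find_clusters(sites, window=200, required={"a": 1, "b": 1})
```

Allowing overlap between neighboring clusters, or rejecting windows that contain sites beyond the required combination, would change the two branches at the end of the loop.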

cran

geno2proteo:Finding the DNA and Protein Sequences of Any Genomic or Proteomic Loci

Using the DNA sequence and gene annotation files provided in 'ENSEMBL' <https://www.ensembl.org/index.html>, the functions implemented in the package find the DNA and protein sequences of any given genomic loci, and the genomic coordinates and protein sequences of any given protein locations, both of which are frequent tasks in the analysis of genomic and proteomic data.

Maintained by Yaoyong Li. Last updated 3 years ago.

genetics proteomics sequencing annotation genomeannotation genomeassembly

2.00 score