Showing 125 of 125 total results
ropensci
biomartr:Genomic Data Retrieval
Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.
Maintained by Hajk-Georg Drost. Last updated 1 month ago.
biomartgenomic-data-retrievalannotation-retrievaldatabase-retrievalncbiensemblbiological-data-retrievalensembl-serversgenomegenome-annotationgenome-retrievalgenomicsmeta-analysismetagenomicsncbi-genbankpeer-reviewedproteomesequenced-genomes
71.6 match 218 stars 11.35 score 129 scripts 3 dependents
bioc
ginmappeR:Gene Identifier Mapper
Provides functionalities to translate gene or protein identifiers between state-of-the-art biological databases: CARD (<https://card.mcmaster.ca/>), NCBI Protein, Nucleotide and Gene (<https://www.ncbi.nlm.nih.gov/>), UniProt (<https://www.uniprot.org/>) and KEGG (<https://www.kegg.jp>). Also offers complementary functionality such as retrieval of NCBI identical proteins or UniProt similar gene clusters.
Maintained by Fernando Sola. Last updated 3 months ago.
annotationkegggeneticsthirdpartyclientsoftware
54.5 match 4.88 score 7 scripts
ropensci
rentrez:'Entrez' in R
Provides an R interface to the NCBI's 'EUtils' API, allowing users to search databases like 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and 'PubMed' <https://pubmed.ncbi.nlm.nih.gov/>, process the results of those searches and pull data into their R sessions.
Maintained by David Winter. Last updated 4 years ago.
17.5 match 199 stars 13.60 score 784 scripts 95 dependents
bioc
GEOquery:Get data from NCBI Gene Expression Omnibus (GEO)
The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.
Maintained by Sean Davis. Last updated 5 months ago.
microarraydataimportonechanneltwochannelsagebioconductorbioinformaticsdata-sciencegenomicsncbi-geo
14.5 match 92 stars 14.46 score 4.1k scripts 44 dependents
sherrillmix
taxonomizr:Functions to Work with NCBI Accessions and Taxonomy
Functions for assigning taxonomy to NCBI accession numbers and taxon IDs based on NCBI's accession2taxid and taxdump files. This package allows the user to download NCBI data dumps and create a local database for fast and local taxonomic assignment.
Maintained by Scott Sherrill-Mix. Last updated 4 days ago.
21.1 match 72 stars 8.85 score 255 scripts 2 dependents
stitam
webseq:Access data from biological sequence databases like NCBI, ENA, MGnify
This package interacts with online biological sequence databases. It provides functions to search for sequences, convert identifiers and download sequences and associated metadata.
Maintained by Tamas Stirling. Last updated 1 month ago.
37.2 match 3 stars 4.13 score 1 script
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 6 days ago.
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
19.3 match 33 stars 7.77 score 10 scripts
ropensci
taxize:Taxonomic Information from Around the Web
Interacts with a suite of web application programming interfaces (API) for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more. Some of the services supported include 'NCBI E-utilities' (<https://www.ncbi.nlm.nih.gov/books/NBK25501/>), 'Encyclopedia of Life' (<https://eol.org/docs/what-is-eol/data-services>), 'Global Biodiversity Information Facility' (<https://techdocs.gbif.org/en/openapi/>), and many more. Links to the API documentation for other supported services are available in the documentation for their respective functions in this package.
Maintained by Zachary Foster. Last updated 12 days ago.
taxonomybiologynomenclaturejsonapiwebapi-clientidentifiersspeciesnamesapi-wrapperbiodiversitydarwincoredatataxize
9.8 match 274 stars 13.63 score 1.6k scripts 23 dependents
bioc
biodbNcbi:biodbNcbi, a library for connecting to NCBI Databases.
The biodbNcbi library provides access to the NCBI databases CCDS, Gene, Pubchem Comp and Pubchem Subst, using the biodb package framework. It allows entries to be retrieved by their accession number. Web services can be accessed for searching the database by name or mass.
Maintained by Pierrick Roger. Last updated 5 months ago.
softwareinfrastructuredataimport
25.5 match 1 star 4.00 score 1 script
ropensci
rsnps:Get 'SNP' ('Single-Nucleotide' 'Polymorphism') Data on the Web
A programmatic interface to various 'SNP' 'datasets' on the web: 'OpenSNP' (<https://opensnp.org>), and the 'NCBI' 'dbSNP' database (<https://www.ncbi.nlm.nih.gov/projects/SNP/>). Functions are included for searching 'NCBI'. For 'OpenSNP', functions are included for getting 'SNPs', and data for 'genotypes', 'phenotypes', annotations, and bulk downloads of data by user.
Maintained by Julia Gustavsen. Last updated 2 years ago.
genesnpsequenceapiwebapi-clientspeciesdbsnpopensnpncbigenotypedatasnpsweb-api
14.8 match 52 stars 6.59 score 63 scripts
skoval
RISmed:Download Content from NCBI Databases
A set of tools to extract bibliographic content from the National Center for Biotechnology Information (NCBI) databases, including PubMed. The name RISmed is a portmanteau of RIS (for Research Information Systems, a common tag format for bibliographic data) and PubMed.
Maintained by Stephanie Kovalchik. Last updated 3 years ago.
12.7 match 38 stars 6.94 score 252 scripts 3 dependents
lifemap-tol
LifemapR:Data Visualisation on 'Lifemap' Tree
Allows data to be visualised on the NCBI phylogenetic tree as presented in Lifemap <https://lifemap.cnrs.fr/>. It takes as input a dataframe with at least a "taxid" column containing NCBI format TaxIds and allows multiple layers to be drawn with different visualisation tools.
Maintained by Aurélie Siberchicot. Last updated 12 days ago.
13.0 match 7 stars 6.15 score
ncbi-hackathons
geneHummus:A Pipeline to Define Gene Families in Legumes and Beyond
A pipeline with high specificity and sensitivity in extracting proteins from the RefSeq database (National Center for Biotechnology Information). Manual identification of gene families is highly time-consuming and laborious, requiring an iterative process of manual and computational analysis to identify members of a given family. The pipeline implements an automatic approach for the identification of gene families based on the conserved domains that specifically define that family. See Die et al. (2018) <doi:10.1101/436659> for more information and examples.
Maintained by Jose V. Die. Last updated 5 years ago.
18.6 match 8 stars 4.20 score 3 scripts
bioc
annotate:Annotation for microarrays
Using R environments for annotation.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
6.7 match 11.41 score 812 scripts 243 dependents
oganm
homologene:Quick Access to Homologene and Gene Annotation Updates
A wrapper for the homologene database by the National Center for Biotechnology Information ('NCBI'). It allows searching for gene homologs across species. Data in this package can be found at <ftp://ftp.ncbi.nih.gov/pub/HomoloGene/build68/>. The package also includes an updated version of the homologene database where gene identifiers and symbols are replaced with their latest (at the time of submission) version and functions to fetch latest annotation data to keep updated.
Maintained by Ogan Mancarci. Last updated 1 year ago.
bioinformaticshomologenemancarci-2017ncbi-taxonomyogan-biospecieswrapper
8.3 match 42 stars 7.87 score 164 scripts 4 dependents
ropensci
traits:Species Trait Data from Around the Web
Species trait data from many different sources, including sequence data from 'NCBI' (<https://www.ncbi.nlm.nih.gov/>), plant trait data from 'BETYdb', data from 'EOL' 'Traitbank', 'Birdlife' International, and more.
Maintained by David LeBauer. Last updated 2 months ago.
traitsapiweb-servicesspeciestaxonomyapi-client
7.3 match 41 stars 8.65 score 82 scripts 11 dependents
bioc
GenomeInfoDb:Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.
Maintained by Hervé Pagès. Last updated 2 months ago.
geneticsdatarepresentationannotationgenomeannotationbioconductor-packagecore-package
3.6 match 32 stars 16.46 score 1.3k scripts 1.7k dependents
fischuu
hoardeR:Collect and Retrieve Annotation Data for Various Genomic Data Using Different Webservices
Cross-species identification of novel gene candidates using the NCBI web service is provided. Further, sets of miRNA target genes can be identified by using the targetscan.org API.
Maintained by Daniel Fischer. Last updated 11 months ago.
15.4 match 1 star 3.70 score 6 scripts
bioc
OmnipathR:OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Maintained by Denes Turei. Last updated 19 days ago.
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteinsquarto
5.8 match 126 stars 9.90 score 226 scripts 2 dependents
bioc
cogeqc:Systematic quality checks on comparative genomics analyses
cogeqc aims to facilitate systematic quality checks on standard comparative genomics analyses to help researchers detect issues and select the most suitable parameters for each data set. cogeqc can be used to assess: i. genome assembly and annotation quality with BUSCOs and comparisons of statistics with publicly available genomes on the NCBI; ii. orthogroup inference using a protein domain-based approach; and iii. synteny detection using synteny network properties. There are also data visualization functions to explore QC summary statistics.
Maintained by Fabrício Almeida-Silva. Last updated 5 months ago.
softwaregenomeassemblycomparativegenomicsfunctionalgenomicsphylogeneticsqualitycontrolnetworkcomparative-genomicsevolutionary-genomics
9.1 match 10 stars 6.08 score 20 scripts
ropensci
phylotaR:Automated Phylogenetic Sequence Cluster Identification from 'GenBank'
A pipeline for the identification, within taxonomic groups, of orthologous sequence clusters from 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> as the first step in a phylogenetic analysis. The pipeline depends on a local alignment search tool and is, therefore, not dependent on differences in gene naming conventions and naming errors.
Maintained by Shixiang Wang. Last updated 8 months ago.
blastngenbankpeer-reviewedphylogeneticssequence-alignment
9.2 match 23 stars 5.86 score 156 scripts
bioc
SRAdb:A compilation of metadata from NCBI SRA and tools
The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, and others. However, finding data of interest can be challenging using current tools. SRAdb is an attempt to make access to the metadata associated with submission, study, sample, experiment and run much more feasible. This is accomplished by parsing all the NCBI SRA metadata into a SQLite database that can be stored and queried locally. Full-text search in the package makes querying metadata very flexible and powerful. fastq and sra files can be downloaded for doing alignment locally. Besides the ftp protocol, SRAdb has functions supporting the fasp protocol (ascp from Aspera Connect) for faster downloading of large data files over long distances. The SQLite database is updated regularly as new data is added to SRA and can be downloaded at will for the most up-to-date metadata.
Maintained by Jack Zhu. Last updated 3 months ago.
infrastructuresequencingdataimport
6.9 match 2 stars 7.81 score 200 scripts
ropensci
RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management
Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.
Maintained by Mathew W. McLean. Last updated 4 months ago.
3.9 match 115 stars 12.06 score 2.3k scripts 16 dependents
samuel-marsh
scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing
Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to provide 1) Customized visualizations for aid in ease of use and to create more aesthetic and functional visuals. 2) Improve speed/reproducibility of common tasks/pieces of code in scRNA-seq analysis with a single or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.
Maintained by Samuel Marsh. Last updated 3 months ago.
customizationggplot2scrna-seqseuratsingle-cellsingle-cell-genomicssingle-cell-rna-seqvisualization
5.2 match 242 stars 8.75 score 1.1k scripts
gschofl
reutils:Talk to the NCBI EUtils
An interface to NCBI databases such as PubMed, GenBank, or GEO powered by the Entrez Programming Utilities (EUtils). The nine EUtils provide programmatic access to the NCBI Entrez query and database system for searching and retrieving biological data.
Maintained by Gerhard Schöfl. Last updated 4 years ago.
6.0 match 22 stars 6.95 score 135 scripts 1 dependent
mvesuviusc
primerTree:Visually Assessing the Specificity and Informativeness of Primer Pairs
Identifies potential target sequences for a given set of primers and generates phylogenetic trees annotated with the taxonomies of the predicted amplification products.
Maintained by Matt Cannon. Last updated 1 year ago.
6.9 match 51 stars 5.56 score 16 scripts
shaunpwilkinson
insect:Informatic Sequence Classification Trees
Provides tools for probabilistic taxon assignment with informatic sequence classification trees. See Wilkinson et al (2018) <doi:10.7287/peerj.preprints.26812v1>.
Maintained by Shaun Wilkinson. Last updated 4 years ago.
5.5 match 14 stars 5.80 score 91 scripts
ropensci
restez:Create and Query a Local Copy of 'GenBank' in R
Download large sections of 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and generate a local SQL-based database. A user can then query this database using 'restez' functions or through 'rentrez' <https://CRAN.R-project.org/package=rentrez> wrappers.
Maintained by Joel H. Nitta. Last updated 10 days ago.
4.1 match 26 stars 7.01 score 175 scripts 1 dependent
bioc
clustifyr:Classifier for Single-cell RNA-seq Using Cell Clusters
Package designed to aid in classifying cells from single-cell RNA sequencing data using external reference data (e.g., bulk RNA-seq, scRNA-seq, microarray, gene lists). A variety of correlation based methods and gene list enrichment methods are provided to assist cell type assignment.
Maintained by Rui Fu. Last updated 5 months ago.
singlecellannotationsequencingmicroarraygeneexpressionassign-identitiesclustersmarker-genesrna-seqsingle-cell-rna-seq
2.9 match 119 stars 9.63 score 296 scripts
bioc
SomaticSignatures:Somatic Signatures
The SomaticSignatures package identifies mutational signatures of single nucleotide variants (SNVs). It provides an infrastructure related to the methodology described in Nik-Zainal (2012, Cell), with flexibility in the matrix decomposition algorithms.
Maintained by Julian Gehring. Last updated 5 months ago.
sequencingsomaticmutationvisualizationclusteringgenomicvariationstatisticalmethod
3.3 match 22 stars 6.85 score 54 scripts 1 dependent
bioc
iClusterPlus:Integrative clustering of multi-type genomic data
Integrative clustering of multiple genomic data using a joint latent variable model.
Maintained by Qianxing Mo. Last updated 4 months ago.
multi-omicsclusteringfortranopenblas
3.4 match 5.76 score 190 scripts
fkeck
bioseq:A Toolbox for Manipulating Biological Sequences
Classes and functions to work with biological sequences (DNA, RNA and amino acid sequences). Implements S3 infrastructure to work with biological sequences as described in Keck (2020) <doi:10.1111/2041-210X.13490>. Provides a collection of functions to perform biological conversion among classes (transcription, translation) and basic operations on sequences (detection, selection and replacement based on positions or patterns). The package also provides functions to import and export sequences from and to other package formats.
Maintained by Francois Keck. Last updated 3 years ago.
2.9 match 22 stars 6.72 score 80 scripts 1 dependent
bioc
VERSO:Viral Evolution ReconStructiOn (VERSO)
Mutations that rapidly accumulate in viral genomes during a pandemic can be used to track the evolution of the virus and, accordingly, unravel the viral infection network. To this end, sequencing samples of the virus can be employed to estimate models from genomic epidemiology and may serve, for instance, to estimate the proportion of undetected infected people by uncovering cryptic transmissions, as well as to predict likely trends in the number of infected, hospitalized, dead and recovered people. VERSO is an algorithmic framework that processes variant profiles from viral samples to produce phylogenetic models of viral evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a log-likelihood function. VERSO includes two separate and subsequent steps; in this package we provide an R implementation of VERSO STEP 1.
Maintained by Davide Maspero. Last updated 1 day ago.
biomedicalinformaticssequencingsomaticmutation
3.2 match 7 stars 6.15 score
massimoaria
pubmedR:Gathering Metadata About Publications, Grants, Clinical Trials from 'PubMed' Database
A set of tools to extract bibliographic content from 'PubMed' database using 'NCBI' REST API <https://www.ncbi.nlm.nih.gov/home/develop/api/>.
Maintained by Massimo Aria. Last updated 12 months ago.
2.5 match 38 stars 7.70 score 39 scripts 3 dependents
bioc
rGREAT:GREAT Analysis - Functional Enrichment on Genomic Regions
GREAT (Genomic Regions Enrichment of Annotations Tool) is a type of functional enrichment analysis directly performed on genomic regions. This package implements the GREAT algorithm (the local GREAT analysis), and also supports directly interacting with the GREAT web service (the online GREAT analysis). Both analyses can be viewed in a Shiny application. rGREAT by default supports more than 600 organisms and a large number of gene set collections, as well as self-provided gene sets and organisms from users. Additionally, it implements a general method for dealing with background regions.
Maintained by Zuguang Gu. Last updated 3 days ago.
genesetenrichmentgopathwayssoftwaresequencingwholegenomegenomeannotationcoveragecpp
1.9 match 86 stars 9.96 score 320 scripts 1 dependent
bioc
BSgenomeForge:Forge your own BSgenome data package
A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.
Maintained by Hervé Pagès. Last updated 5 months ago.
infrastructuredatarepresentationgenomeassemblyannotationgenomeannotationsequencingalignmentdataimportsequencematchingbioconductor-packagecore-package
3.6 match 4 stars 4.90 score 6 scripts
lukejharmon
geiger:Analysis of Evolutionary Diversification
Methods for fitting macroevolutionary models to phylogenetic trees. See Pennell (2014) <doi:10.1093/bioinformatics/btu181>.
Maintained by Luke Harmon. Last updated 2 years ago.
2.3 match 1 star 7.84 score 2.3k scripts 28 dependents
ohdsi
PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model
A user friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.
Maintained by Egill Fridgeirsson. Last updated 9 days ago.
1.6 match 190 stars 10.85 score 297 scripts
bioc
AnnotationForge:Tools for building SQLite-based annotation data packages
Provides code for generating Annotation packages and their databases. Packages produced are intended to be used with AnnotationDbi.
Maintained by Bioconductor Package Maintainer. Last updated 3 days ago.
annotationinfrastructurebioconductor-packagecore-package
1.8 match 5 stars 9.62 score 143 scripts 19 dependents
bioc
Rsubread:Mapping, quantification and variant analysis of sequencing data
Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES etc). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.
Maintained by Wei Shi. Last updated 2 days ago.
sequencingalignmentsequencematchingrnaseqchipseqsinglecellgeneexpressiongeneregulationgeneticsimmunooncologysnpgeneticvariabilitypreprocessingqualitycontrolgenomeannotationgenefusiondetectionindeldetectionvariantannotationvariantdetectionmultiplesequencealignmentzlib
1.7 match 9.24 score 892 scripts 10 dependents
msq-123
CovidMutations:Mutation Analysis and Assay Validation Toolkit for COVID-19 (Coronavirus Disease 2019)
A feasible framework for mutation analysis and reverse transcription polymerase chain reaction (RT-PCR) assay evaluation of COVID-19, including mutation profile visualization, statistics and mutation ratio of each assay. The mutation ratio is conducive to evaluating the coverage of RT-PCR assays in large-sized samples <doi:10.20944/preprints202004.0529.v1>.
Maintained by Shaoqian Ma. Last updated 5 years ago.
3.4 match 4 stars 4.30 score 6 scripts
bioboot
bio3d:Biological Structure Analysis
Utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data. Please refer to the URLs below for more information.
Maintained by Barry Grant. Last updated 5 months ago.
1.7 match 5 stars 8.49 score 1.4k scripts 10 dependents
bioc
scTensor:Detection of cell-cell interaction from single-cell RNA-seq dataset by tensor decomposition
The algorithm is based on the non-negative Tucker decomposition (NTD2) of nnTensor.
Maintained by Koki Tsuyuzaki. Last updated 5 months ago.
dimensionreductionsinglecellsoftwaregeneexpression
3.4 match 4.18 score 2 scripts
jedick
chem16S:Chemical Metrics for Microbial Communities
Combines taxonomic classifications of high-throughput 16S rRNA gene sequences with reference proteomes of archaeal and bacterial taxa to generate amino acid compositions of community reference proteomes. Calculates chemical metrics including carbon oxidation state ('Zc'), stoichiometric oxidation and hydration state ('nO2' and 'nH2O'), H/C, N/C, O/C, and S/C ratios, grand average of hydropathicity ('GRAVY'), isoelectric point ('pI'), protein length, and average molecular weight of amino acid residues. Uses precomputed reference proteomes for archaea and bacteria derived from the Genome Taxonomy Database ('GTDB'). Also includes reference proteomes derived from the NCBI Reference Sequence ('RefSeq') database and manual mapping from the 'RDP Classifier' training set to 'RefSeq' taxonomy as described by Dick and Tan (2023) <doi:10.1007/s00248-022-01988-9>. Processes taxonomic classifications in 'RDP Classifier' format or OTU tables in 'phyloseq-class' objects from the Bioconductor package 'phyloseq'.
Maintained by Jeffrey Dick. Last updated 7 days ago.
16s-rrnacarbon-oxidation-statechemical-metricsgenomic-adaptationmicrobial-communities
2.3 match 4 stars 5.92 score 8 scripts
mt1022
cubar:Codon Usage Bias Analysis
A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene-specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables.
Maintained by Hong Zhang. Last updated 3 months ago.
bioinformaticscodon-usagemachine-learningsequence-analysis
2.2 match 6 stars 5.82 score 8 scripts
pendy05
vDiveR:Visualization of Viral Protein Sequence Diversity Dynamics
Eases the visualization of outputs from Diversity Motif Analyser ('DiMA'; <https://github.com/BVU-BILSAB/DiMA>). 'vDiveR' allows visualization of the diversity motifs (index and its variants: major, minor and unique) for elucidation of the underlying inherent dynamics. Please refer to <https://vdiver-manual.readthedocs.io/en/latest/> for more information.
Maintained by Pendy Tok. Last updated 1 month ago.
conservation-leveldiversityentropysequencesviralvisualization
3.4 match 3.78 score 4 scripts
pieterprovoost
ghettoblaster:NCBI BLAST web client
NCBI BLAST web client.
Maintained by Pieter Provoost. Last updated 1 year ago.
7.5 match 1.70 score
moosa-r
rbioapi:User-Friendly R Interface to Biologic Web Services' API
Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, in a way that insulates the user from the technicalities of using web service APIs and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; new databases and services will be added periodically. Feel free to suggest any databases or services you often use.
Maintained by Moosa Rezwani. Last updated 1 month ago.
api-clientbioinformaticsbiologyenrichmentenrichment-analysisenrichrjasparmieaaover-representation-analysispantherreactomestringuniprot
1.7 match 20 stars 7.60 score 55 scripts
bioc
mosdef:MOSt frequently used and useful Differential Expression Functions
This package provides functionality to run a number of tasks in the differential expression analysis workflow. This encompasses the most widely used steps, from running various enrichment analysis tools with a unified interface to creating plots and beautifying table components linking to external websites and databases. This streamlines the generation of comprehensive analysis reports.
Maintained by Federico Marini. Last updated 3 months ago.
geneexpressionsoftwaretranscriptiontranscriptomicsdifferentialexpressionvisualizationreportwritinggenesetenrichmentgo
2.0 match 6.12 score 4 dependents
asa12138
pcutils:Some Useful Functions for Statistics and Visualization
Offers a range of utilities and functions for everyday programming tasks. 1. Data Manipulation. Such as grouping and merging, column splitting, and character expansion. 2. File Handling. Read and convert files in popular formats. 3. Plotting Assistance. Helpful utilities for generating color palettes, validating color formats, and adding transparency. 4. Statistical Analysis. Includes functions for pairwise comparisons and multiple testing corrections, enabling users to perform statistical analyses with ease. 5. Graph Plotting. Provides efficient tools for creating doughnut plots and multi-layered doughnut plots; Venn diagrams, including traditional Venn diagrams, upset plots, and flower plots; and simplified functions for creating stacked bar plots, or box plots with letter groupings for multiple comparisons.
Maintained by Chen Peng. Last updated 5 months ago.
1.7 match 22 stars 6.57 score 28 scripts 4 dependents
bioc
GeDi:Defining and visualizing the distances between different genesets
The package provides different distance measurements to calculate the difference between genesets. Based on these scores the genesets are clustered and visualized as a graph. This is all presented in an interactive Shiny application for easy usage.
Maintained by Annekathrin Nedwed. Last updated 5 months ago.
guigenesetenrichmentsoftwaretranscriptionrnaseqvisualizationclusteringpathwaysreportwritinggokeggreactomeshinyapps
2.0 match 1 stars 5.52 score 22 scriptsbioc
MetMashR:Metabolite Mashing with R
A package to merge, filter, sort, organise and otherwise mash together metabolite annotation tables. Metabolite annotations can be imported from multiple sources (software) and combined using workflow steps based on S4 class templates derived from the `struct` package. Other modular workflow steps such as filtering, merging, splitting, normalisation and rest-api queries are included.
Maintained by Gavin Rhys Lloyd. Last updated 5 months ago.
1.9 match 2 stars 5.81 score 5 scriptsshixiangwang
rsra:Query and Download SRA Files from NCBI
Query and download SRA files from NCBI with 'wget'.
Maintained by Shixiang Wang. Last updated 3 years ago.
5.4 match 2 stars 2.00 scorebioc
SingleCellAlleleExperiment:S4 Class for Single Cell Data with Allele and Functional Levels for Immune Genes
Defines an S4 class that is based on SingleCellExperiment. In addition to the usual gene layer, the object can also store data for immune genes such as HLAs, Igs and KIRs at allele and functional level. The package is part of a workflow named single-cell ImmunoGenomic Diversity (scIGD), which first incorporates allele-aware quantification data for immune genes. This new data can then be used with the data structure and functionality implemented here for further data handling and data analysis.
Maintained by Jonas Schuck. Last updated 2 months ago.
datarepresentationinfrastructuresinglecelltranscriptomicsgeneexpressiongeneticsimmunooncologydataimport
1.7 match 7 stars 6.30 score 12 scriptsmamc-dci
pubtatordb:Create and Query a Local 'PubTator' Database
'PubTator' <https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator/> is a National Center for Biotechnology Information (NCBI) tool that enhances the annotation of articles on PubMed <https://www.ncbi.nlm.nih.gov/pubmed/>. It makes it possible to rapidly identify potential relationships between genes or proteins using text mining techniques. In contrast, manually searching for and reading the annotated articles would be very time consuming. 'PubTator' offers both an online interface and a RESTful API; however, neither of these approaches is well suited for frequent, high-throughput analyses. The package 'pubtatordb' provides a set of functions that make it easy for the average R user to download 'PubTator' annotations, create, and then query a local version of the database.
Maintained by Zachary Colburn. Last updated 5 years ago.
2.7 match 3.90 score 16 scriptsbioc
KnowSeq:KnowSeq R/Bioc package: The Smart Transcriptomic Pipeline
KnowSeq proposes a novel methodology that comprises the most relevant steps in transcriptomic gene expression analysis. KnowSeq is intended to serve as an integrative tool for processing and extracting relevant biomarkers, as well as assessing them through a machine learning approach. Finally, the last objective of KnowSeq is biological knowledge extraction from the biomarkers (Gene Ontology enrichment, pathway listing and visualization, and evidence related to the addressed disease). Although the package allows analyzing all the data manually, the main strength of KnowSeq is the possibility of generating an automatic and intelligent HTML report that collects all the involved steps in one document. It is important to highlight that the pipeline is totally modular and flexible, hence it can be started from any of the different steps. KnowSeq is intended to serve as a novel tool to help experts in the field acquire robust knowledge and conclusions for the data and diseases under study.
Maintained by Daniel Castillo-Secilla. Last updated 5 months ago.
geneexpressiondifferentialexpressiongenesetenrichmentdataimportclassificationfeatureextractionsequencingrnaseqbatcheffectnormalizationpreprocessingqualitycontrolgeneticstranscriptomicsmicroarrayalignmentpathwayssystemsbiologygoimmunooncology
2.9 match 3.30 score 5 scriptsbioc
MungeSumstats:Standardise summary statistics from GWAS
The *MungeSumstats* package is designed to facilitate the standardisation of GWAS summary statistics. It reformats inputted summary statistics to include SNP, CHR, BP and can look up these values if any are missing. It also performs dozens of QC and filtering steps to ensure high data quality and minimise inter-study differences.
Maintained by Alan Murphy. Last updated 3 months ago.
snpwholegenomegeneticscomparativegenomicsgenomewideassociationgenomicvariationpreprocessing
1.6 match 1 stars 5.87 score 91 scriptscran
pubmed.mineR:Text Mining of PubMed Abstracts
Text mining of PubMed Abstracts (text and XML) from <https://pubmed.ncbi.nlm.nih.gov/>.
Maintained by S. Ramachandran. Last updated 6 months ago.
4.3 match 6 stars 2.08 scorebioc
packFinder:de novo Annotation of Pack-TYPE Transposable Elements
Algorithm and tools for in silico pack-TYPE transposon discovery. Filters a given genome for properties unique to DNA transposons and provides tools for the investigation of returned matches. Sequences are input in DNAString format, and ranges are returned as a dataframe (in the format returned by as.dataframe(GRanges)).
Maintained by Jack Gisby. Last updated 5 months ago.
geneticssequencematchingannotationbioinformaticstext-mining
1.8 match 7 stars 4.85 score 6 scriptsviralemergence
insectDisease:Ecological Database of the World's Insect Pathogens
David Onstad provided us with this insect disease database, sometimes referred to as the 'Ecological Database of the Worlds Insect Pathogens' or EDWIP. Files have been converted from 'SQL' to csv, and ported into 'R' for easy exploration and analysis. Thanks to the Macroecology of Infectious Disease Research Coordination Network (RCN) for funding and support. Data are also served online in a static format at <https://edwip.ecology.uga.edu/>.
Maintained by Tad Dallas. Last updated 2 months ago.
1.9 match 13 stars 4.41 score 2 scriptsbioc
ChIPXpress:ChIPXpress: enhanced transcription factor target gene identification from ChIP-seq and ChIP-chip data using publicly available gene expression profiles
ChIPXpress takes as input predicted TF bound genes from ChIPx data and uses a corresponding database of gene expression profiles downloaded from NCBI GEO to rank the TF bound targets in order of which gene is most likely to be a functional TF target.
Maintained by George Wu. Last updated 5 months ago.
2.1 match 3.78 score 2 scriptsnanxstats
protr:Generating Various Numerical Representation Schemes for Protein Sequences
Comprehensive toolkit for generating various numerical features of protein sequences described in Xiao et al. (2015) <DOI:10.1093/bioinformatics/btv042>. For full functionality, the software 'ncbi-blast+' is needed, see <https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html> for more information.
Maintained by Nan Xiao. Last updated 6 months ago.
bioinformaticsfeature-engineeringfeature-extractionmachine-learningpeptidesprotein-sequencessequence-analysis
0.8 match 52 stars 10.02 score 173 scripts 3 dependentsbioc
GEOfastq:Downloads ENA Fastqs With GEO Accessions
GEOfastq is used to download fastq files from the European Nucleotide Archive (ENA) starting with an accession from the Gene Expression Omnibus (GEO). To do this, sample metadata is retrieved from GEO and the Sequence Read Archive (SRA). SRA run accessions are then used to construct FTP and aspera download links for fastq files generated by the ENA.
Maintained by Alex Pickering. Last updated 5 months ago.
rnaseqdataimportbioinformaticsfastqgene-expressiongeorna-seq
1.7 match 4 stars 4.60 score 6 scriptsmassimoaria
bibliometrix:Comprehensive Science Mapping Analysis
Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.
Maintained by Massimo Aria. Last updated 7 days ago.
bibliometric-analysisbibliometricscitationcitation-networkcitationsco-authorsco-occurenceco-word-analysiscorrespondence-analysiscouplingisi-webjournalmanuscriptquantitative-analysisscholarssciencescience-mappingscientificscientometricsscopus
0.5 match 545 stars 12.54 score 518 scripts 2 dependentsdami82
easyPubMed:Search and Retrieve Scientific Publication Records from PubMed
Query NCBI Entrez and retrieve PubMed records in XML or text format. Process PubMed records by extracting and aggregating data from selected fields. A large number of records can be easily downloaded via this simple-to-use interface to the NCBI PubMed API.
Maintained by Damiano Fantini. Last updated 1 years ago.
0.8 match 21 stars 7.83 score 178 scripts 4 dependentsbioc
ramr:Detection of Rare Aberrantly Methylated Regions in Array and NGS Data
ramr is an R package for detection of epimutations (i.e., infrequent aberrant DNA methylation events) in large data sets obtained by methylation profiling using array or high-throughput methylation sequencing. In addition, the package provides functions to visualize found aberrantly methylated regions (AMRs), to generate sets of all possible regions to be used as reference sets for enrichment analysis, and to generate biologically relevant test data sets for performance evaluation of AMR/DMR search algorithms.
Maintained by Oleksii Nikolaienko. Last updated 10 days ago.
dnamethylationdifferentialmethylationepigeneticsmethylationarraymethylseqaberrant-methylationbioconductordna-methylationepimutationmethylation-microarraysnext-generation-sequencingcppopenmp
1.3 match 4.65 score 5 scriptsurodelan
LocaTT:Geographically-Conscious Taxonomic Assignment for Metabarcoding
A bioinformatics pipeline for performing taxonomic assignment of DNA metabarcoding sequence data while considering geographic location. A detailed tutorial is available at <https://urodelan.github.io/Local_Taxa_Tool_Tutorial/>. A manuscript describing these methods is in preparation.
Maintained by Kenen Goodwin. Last updated 12 months ago.
1.8 match 3.00 scorebioc
ctsGE:Clustering of Time Series Gene Expression data
Methodology for supervised clustering of potentially many predictor variables, such as genes, in time series datasets. Provides functions that help the user assign genes to a predefined set of model profiles.
Maintained by Michal Sharabi-Schwager. Last updated 5 months ago.
immunooncologygeneexpressiontranscriptiondifferentialexpressiongenesetenrichmentgeneticsbayesianclusteringtimecoursesequencingrnaseq
1.3 match 1 stars 4.00 score 3 scriptscran
crosstalkr:Analysis of Graph-Structured Data with a Focus on Protein-Protein Interaction Networks
Provides a general toolkit for drug target identification. We include functionality to reduce large graphs to subgraphs and prioritize nodes. In addition to being optimized for use with generic graphs, we also provide support to analyze protein-protein interaction networks from online repositories. For more details on the core method, refer to Weaver et al. (2021) <https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008755>.
Maintained by Davis Weaver. Last updated 10 months ago.
1.7 match 2.70 scorejdieramon
refseqR:Common Computational Operations Working with RefSeq Entries (GenBank)
Fetches NCBI data (RefSeq <https://www.ncbi.nlm.nih.gov/refseq/> database) and provides an environment to extract information at the level of gene, mRNA or protein accessions.
Maintained by Jose V. Die. Last updated 3 months ago.
0.8 match 4 stars 5.34 score 5 scriptsselcukorkmaz
PubChemR:Interface to the 'PubChem' Database for Chemical Data Retrieval
Provides an interface to the 'PubChem' database via the PUG REST <https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest> and PUG View <https://pubchem.ncbi.nlm.nih.gov/docs/pug-view> services. This package allows users to automatically access chemical and biological data from 'PubChem', including compounds, substances, assays, and various other data types. Functions are available to retrieve data in different formats, perform searches, and access detailed annotations.
Maintained by Selcuk Korkmaz. Last updated 6 months ago.
0.8 match 2 stars 5.62 score 23 scriptslefeup
BoSSA:A Bunch of Structure and Sequence Analysis
Reads and plots phylogenetic placements.
Maintained by Pierre Lefeuvre. Last updated 4 years ago.
1.2 match 3.35 score 15 scriptsphylotastic
rphylotastic:An R Interface to 'Phylotastic' Web Services
This wraps the 'Phylotastic' services APIs described on Web Services at <www.phylotastic.org>. The main use case is to return a phylogenetic tree for a set of species, but the services also include ways to extract species names from web pages, perform taxonomic name resolution, retrieve a list of all descendant species of a taxon, find images of a species, and more.
Maintained by Brian OMeara. Last updated 2 years ago.
1.8 match 2.28 score 19 scriptslwheinsberg
dbGaPCheckup:dbGaP Checkup
Contains functions that check for formatting of the Subject Phenotype data set and data dictionary as specified by the National Center for Biotechnology Information (NCBI) Database of Genotypes and Phenotypes (dbGaP) <https://www.ncbi.nlm.nih.gov/gap/docs/submissionguide/>.
Maintained by Lacey W. Heinsberg. Last updated 1 years ago.
0.8 match 4 stars 4.86 score 18 scriptsbioc
Path2PPI:Prediction of pathway-related protein-protein interaction networks
Package to predict protein-protein interaction (PPI) networks in target organisms for which only limited information about PPIs is available. Path2PPI predicts PPI networks based on sets of proteins which can belong to a certain pathway from well-established model organisms. It helps to combine and transfer information of a certain pathway or biological process from several reference organisms to one target organism. Path2PPI only depends on the sequence similarity of the involved proteins.
Maintained by Oliver Philipp. Last updated 5 months ago.
networkinferencesystemsbiologynetworkproteomicspathways
1.2 match 3.30 score 1 scriptscran
disprose:Discriminating Probes Selection
Set of tools for molecular probes selection and design of a microarray, e.g. the assessment of physical and chemical properties, blast performance, selection according to sensitivity and selectivity. Methods used in package are described in: Lorenz R., Stephan H.B., Höner zu Siederdissen C. et al. (2011) <doi:10.1186/1748-7188-6-26>; Camacho C., Coulouris G., Avagyan V. et al. (2009) <doi:10.1186/1471-2105-10-421>.
Maintained by Elena Filatova. Last updated 3 years ago.
3.8 match 1.00 scorejaytimm
puremoe:Pubmed Unified REtrieval for Multi-Output Exploration
Access a variety of 'PubMed' data through a single, user-friendly interface, including abstracts <https://pubmed.ncbi.nlm.nih.gov/>, bibliometrics from 'iCite' <https://icite.od.nih.gov/>, pubtations from 'PubTator3' <https://www.ncbi.nlm.nih.gov/research/pubtator3/>, and full-text records from 'PMC' <https://www.ncbi.nlm.nih.gov/pmc/>.
Maintained by Jason Timm. Last updated 1 months ago.
0.9 match 2 stars 3.95 scorexjsun1221
tinyarray:Expression Data Analysis and Visualization
The Gene Expression Omnibus (<https://www.ncbi.nlm.nih.gov/geo/>) and The Cancer Genome Atlas (<https://portal.gdc.cancer.gov/>) are widely used medical public databases. Our platform integrates routine analysis and visualization tools for expression data to provide concise and intuitive data analysis and presentation.
Maintained by Xiaojie Sun. Last updated 9 months ago.
0.5 match 91 stars 6.67 score 138 scriptsperson-c
easybio:Comprehensive Single-Cell Annotation and Transcriptomic Analysis Toolkit
Provides a comprehensive toolkit for single-cell annotation with the 'CellMarker2.0' database (see Xia Li, Peng Wang, Yunpeng Zhang (2023) <doi:10.1093/nar/gkac947>). Streamlines biological label assignment in single-cell RNA-seq data and facilitates transcriptomic analysis, including preparation of TCGA <https://portal.gdc.cancer.gov/> and GEO <https://www.ncbi.nlm.nih.gov/geo/> datasets, differential expression analysis and visualization of enrichment analysis results. Additional utility functions support various bioinformatics workflows. See Wei Cui (2024) <doi:10.1101/2024.09.14.609619> for more details.
Maintained by Wei Cui. Last updated 13 days ago.
limmageoqueryedgerfgseabioinformaticscellmarker2gsearna-seqsingle-cell
0.5 match 10 stars 6.62 score 35 scriptsbioc
ViSEAGO:ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity
The main objective of the ViSEAGO package is to carry out data mining of biological functions and establish links between genes involved in the study. We developed ViSEAGO in R to facilitate functional Gene Ontology (GO) analysis of complex experimental designs with multiple comparisons of interest. It allows large-scale datasets to be studied together and GO profiles to be visualized to capture biological knowledge. The acronym stands for three major concepts of the analysis: Visualization, Semantic similarity and Enrichment Analysis of Gene Ontology. It provides access to the latest GO annotations, which are retrieved from one of NCBI EntrezGene, Ensembl or Uniprot databases for several species. Using available R packages and novel developments, ViSEAGO extends classical functional GO analysis to focus on functional coherence by aggregating closely related biological themes while studying multiple datasets at once. It provides both a synthetic and detailed view using interactive functionalities respecting the GO graph structure and ensuring functional coherence supplied by semantic similarity. ViSEAGO has been successfully applied on several datasets from different species with a variety of biological questions. Results can be easily shared between bioinformaticians and biologists, enhancing reporting capabilities while maintaining reproducibility.
Maintained by Aurelien Brionne. Last updated 2 months ago.
softwareannotationgogenesetenrichmentmultiplecomparisonclusteringvisualization
0.5 match 6.64 score 22 scriptslauren-hanna
Map2NCBI:Mapping Markers to the Nearest Genomic Feature
Allows the user to generate a list of features (gene, pseudo, RNA, CDS, and/or UTR) directly from NCBI database for any species with a current build available. Option to save downloaded and formatted files is available, and the user can prioritize the feature list based on type and assembly builds present in the current build used. The user can then use the list of features generated or provide a list to map a set of markers (designed for SNP markers with a single base pair position available) to the closest feature based on the map build. This function does require map positions of the markers to be provided and the positions should be based on the build being queried through NCBI.
Maintained by Lauren Hanna. Last updated 5 years ago.
2.5 match 1.30 score 6 scriptsbioc
recountmethylation:Access and analyze public DNA methylation array data compilations
Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.
Maintained by Sean K Maden. Last updated 5 months ago.
dnamethylationepigeneticsmicroarraymethylationarrayexperimenthub
0.5 match 9 stars 6.28 score 9 scriptsbioc
ReportingTools:Tools for making reports in various formats
The ReportingTools software package enables users to easily display reports of analysis results generated from sources such as microarray and sequencing data. The package allows users to create HTML pages that may be viewed on a web browser such as Safari, or in other formats readable by programs such as Excel. Users can generate tables with sortable and filterable columns, make and display plots, and link table entries to other data sources such as NCBI or larger plots within the HTML page. Using the package, users can also produce a table of contents page to link various reports together for a particular project that can be viewed in a web browser. For more examples, please visit our site: http://research-pub.gene.com/ReportingTools.
Maintained by Jason A. Hackney. Last updated 5 months ago.
immunooncologysoftwarevisualizationmicroarrayrnaseqgodatarepresentationgenesetenrichment
0.5 match 6.23 score 93 scripts 1 dependentsjosedv82
matuR:Athlete Maturation and Biobanding
Identifying maturation stages across young athletes is paramount for talent identification. Furthermore, the concept of biobanding, or grouping of athletes based on their biological development instead of their chronological age, has been widely researched. The goal of this package is to help professionals working in the field of strength & conditioning and talent ID obtain common maturation metrics, as well as quickly visualize this information via several plotting options. For the methods behind the computed maturation metrics implemented in this package, refer to Khamis, H. J., & Roche, A. F. (1994) <https://pubmed.ncbi.nlm.nih.gov/7936860/>, Mirwald, R.L et al., (2002) <https://pubmed.ncbi.nlm.nih.gov/11932580/> and Cumming, Sean P. et al., (2017) <doi:10.1519/SSC.0000000000000281>.
Maintained by Jose Fernandez. Last updated 3 months ago.
0.8 match 12 stars 3.78 score 1 scriptsbioc
GEOexplorer:GEOexplorer: a webserver for gene expression analysis and visualisation
GEOexplorer is a webserver, R/Bioconductor package and web application that enables users to perform gene expression analysis. The development of GEOexplorer was made possible because of the excellent code provided by GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/).
Maintained by Guy Hunt. Last updated 5 months ago.
softwaregeneexpressionmrnamicroarraydifferentialexpressionmicroarraymicrornaarraytranscriptomicsrnaseq
0.5 match 5 stars 5.32 score 14 scriptsbhelsel
agcounts:Calculate 'ActiGraph' Counts from Accelerometer Data
Calculate 'ActiGraph' counts from the X, Y, and Z axes of a triaxial accelerometer. This work was inspired by Neishabouri et al. who published the article "Quantification of Acceleration as Activity Counts in 'ActiGraph' Wearables" on February 24, 2022. The article is available at <https://pubmed.ncbi.nlm.nih.gov/35831446> and the 'python' implementation of this code at <https://github.com/actigraph/agcounts>.
Maintained by Brian C. Helsel. Last updated 9 months ago.
0.5 match 10 stars 5.10 score 10 scriptsmarcschwartz
BI:Blinding Assessment Indexes for Randomized, Controlled, Clinical Trials
Generate the James Blinding Index, as described in James et al (1996) <https://pubmed.ncbi.nlm.nih.gov/8841652/> and the Bang Blinding Index, as described in Bang et al (2004) <https://pubmed.ncbi.nlm.nih.gov/15020033/>. These are measures to assess whether or not satisfactory blinding has been maintained in a randomized, controlled, clinical trial. These can be generated for trial subjects, research coordinators and principal investigators, based upon standardized questionnaires that have been administered, to assess whether or not they can correctly guess to which treatment arm (e.g. placebo or treatment) subjects were assigned at randomization.
Maintained by Marc Schwartz. Last updated 2 years ago.
0.8 match 2 stars 3.34 score 11 scriptsdivadnojnarg
CaPO4Sim:A Virtual Patient Simulator in the Context of Calcium and Phosphate Homeostasis
Explore calcium (Ca) and phosphate (Pi) homeostasis with two novel 'Shiny' apps, building upon a previously published mathematical model written in C, to ensure efficient computations. The underlying model is accessible here <https://pubmed.ncbi.nlm.nih.gov/28747359/>. The first application explores the fundamentals of Ca-Pi homeostasis, while the second provides interactive case studies for in-depth exploration of the topic, thereby seeking to foster student engagement and an integrative understanding of Ca-Pi regulation.
Maintained by David Granjon. Last updated 2 months ago.
0.5 match 40 stars 4.92 score 14 scriptsprise6
aVirtualTwins:Adaptation of Virtual Twins Method from Jared Foster
Identification of subgroups in randomized clinical trials with a binary outcome and two treatment groups. This is an adaptation of the Jared Foster method (<https://www.ncbi.nlm.nih.gov/pubmed/21815180>).
Maintained by Francois Vieille. Last updated 7 years ago.
0.5 match 4 stars 4.51 score 16 scriptsc1au6i0
extractox:Extract Tox Info from Various Databases
Extract toxicological and chemical information from databases maintained by scientific agencies and resources, including the Comparative Toxicogenomics Database <https://ctdbase.org/>, the Integrated Chemical Environment <https://ice.ntp.niehs.nih.gov/>, the Integrated Risk Information System <https://cfpub.epa.gov/ncea/iris/>, Provisional Peer-Reviewed Toxicity Values <https://www.epa.gov/pprtv/provisional-peer-reviewed-toxicity-values-pprtvs-assessments>, the CompTox Chemicals Dashboard Resource Hub <https://www.epa.gov/comptox-tools/comptox-chemicals-dashboard-resource-hub>, PubChem <https://pubchem.ncbi.nlm.nih.gov/>, and others.
Maintained by Claudio Zanettini. Last updated 1 months ago.
0.5 match 3 stars 4.59 score 3 scriptsbioc
ORFhunteR:Predict open reading frames in nucleotide sequences
The ORFhunteR package is an R and C++ library for automatic determination and annotation of open reading frames (ORF) in a large set of RNA molecules. It efficiently implements the machine learning model based on vectorization of nucleotide sequences and the random forest classification algorithm. The ORFhunteR package consists of a set of functions written in the R language in conjunction with C++. The efficiency of the package was confirmed by the examples of the analysis of RNA molecules from the NCBI RefSeq and Ensembl databases. The package can be used in basic and applied biomedical research related to the study of the transcriptome of normal as well as altered (for example, cancer) human cells.
Maintained by Vasily V. Grinev. Last updated 5 months ago.
technologystatisticalmethodsequencingrnaseqclassificationfeatureextractioncpp
0.5 match 1 stars 4.48 scorenilforooshan
FnR:Inbreeding and Numerator Relationship Coefficients
Compute inbreeding coefficients using the method of Meuwissen and Luo (1992) <doi:10.1186/1297-9686-24-4-305>, and numerator relationship coefficients between individuals using the method of Van Vleck (2007) <https://pubmed.ncbi.nlm.nih.gov/18050089/>.
Maintained by Mohammad Ali Nilforooshan. Last updated 6 months ago.
0.5 match 1 stars 4.30 scorebioc
BloodGen3Module:This R package for performing module repertoire analyses and generating fingerprint representations
The BloodGen3Module package provides functions for R users performing module repertoire analyses and generating fingerprint representations. Functions can perform group comparison or individual sample analysis and visualization by fingerprint grid plot or fingerprint heatmap. Module repertoire analyses typically involve determining the percentage of the constitutive genes for each module that are significantly increased or decreased. As we describe in detail in https://www.biorxiv.org/content/10.1101/525709v2 and https://pubmed.ncbi.nlm.nih.gov/33624743/, the results of module repertoire analyses can be represented in a fingerprint format, where red and blue spots indicate increases or decreases in module activity. These spots are subsequently represented either on a grid, with each position being assigned to a given module, or in a heatmap where the samples are arranged in columns and the modules in rows.
Maintained by Darawan Rinchai. Last updated 5 months ago.
softwarevisualizationgeneexpression
0.5 match 4.30 score 5 scriptsbioc
genomes:Genome sequencing project metadata
Download genome and assembly reports from NCBI
Maintained by Chris Stubben. Last updated 5 months ago.
0.6 match 3.48 score 15 scriptssimonliles
protein8k:Perform Analysis and Create Visualizations of Proteins
Reads Protein Data Bank (PDB) files, performs analysis on them, and presents the results using different visualization types including 3D. The package also has additional capability for handling Virus Report data from the National Center for Biotechnology Information (NCBI) database.
Maintained by Simon Liles. Last updated 4 years ago.
0.5 match 1 stars 4.00 score 4 scriptswbelzak
regDIF:Regularized Differential Item Functioning
Performs regularization of differential item functioning (DIF) parameters in item response theory (IRT) models (Belzak & Bauer, 2020) <https://pubmed.ncbi.nlm.nih.gov/31916799/> using a penalized expectation-maximization algorithm.
Maintained by William Belzak. Last updated 2 years ago.
0.5 match 1 stars 3.81 score 13 scriptseastman
ncbit:Retrieve and Build NCBI Taxonomic Data
Makes NCBI taxonomic data locally available and searchable as an R object.
Maintained by Jon Eastman. Last updated 3 years ago.
0.6 match 3.36 score 2 scripts 29 dependentsbiocool-lab
PSSMCOOL:Features Extracted from Position Specific Scoring Matrix (PSSM)
Returns almost all features that have been extracted from the Position Specific Scoring Matrix (PSSM) so far. A PSSM is a matrix of L rows (L is protein length) and 20 columns produced by 'PSI-BLAST', a program that produces a PSSM from a multiple sequence alignment of proteins; see <https://www.ncbi.nlm.nih.gov/books/NBK2590/> for more details. Some of these features are described in Zahiri, J., et al.(2013) <DOI:10.1016/j.ygeno.2013.05.006>, Saini, H., et al.(2016) <DOI:10.17706/jsw.11.8.756-767>, Ding, S., et al.(2014) <DOI:10.1016/j.biochi.2013.09.013>, Cheng, C.W., et al.(2008) <DOI:10.1186/1471-2105-9-S12-S6>, Juan, E.Y., et al.(2009) <DOI:10.1109/CISIS.2009.194>.
Maintained by Alireza mohammadi. Last updated 3 years ago.
0.5 match 4 stars 3.53 score 17 scriptscboettig
taxalight:A Lightweight and Lightning-Fast Taxonomic Naming Interface
Creates a local Lightning Memory-Mapped Database ('LMDB') of many commonly used taxonomic authorities and provides functions that can quickly query this data. Supported taxonomic authorities include the Integrated Taxonomic Information System ('ITIS'), National Center for Biotechnology Information ('NCBI'), Global Biodiversity Information Facility ('GBIF'), Catalogue of Life ('COL'), and Open Tree Taxonomy ('OTT'). Name and identifier resolution using 'LMDB' can be hundreds of times faster than either relational databases or internet-based queries. Precise data provenance information for data derived from naming providers is also included.
Maintained by Carl Boettiger. Last updated 4 years ago.
0.5 match 5 stars 3.40 score 4 scriptsecologicaltools
IBRtools:Integrating Biomarker-Based Assessments and Radarchart Creation
Several functions to calculate two important indexes (IBR (Integrated Biomarker Response) and IBRv2 (Integrated Biological Response version 2)), it also calculates the standardized values for enzyme activity for each index, and it has a graphing function to perform radarplots that make great data visualization for this type of data. Beliaeff, B., & Burgeot, T. (2002). <https://pubmed.ncbi.nlm.nih.gov/12069320/>. Sanchez, W., Burgeot, T., & Porcher, J.-M. (2013). <doi:10.1007/s11356-012-1359-1>. Devin, S., Burgeot, T., Giambérini, L., Minguez, L., & Pain-Devin, S. (2014). <doi:10.1007/s11356-013-2169-9>. Minato N. (2022). <https://minato.sip21c.org/msb/>.
Maintained by Anna Carolina Resende. Last updated 2 years ago.
data-visualization, devtools, enzyme-activity, ibr, ibrv, indexes-ibr, integrated-biomarker-response, radarchart, rstudio-cloud, starplot
0.5 match 3 stars 3.18 score 2 scripts
rgyoung6
MACER:Molecular Acquisition, Cleaning, and Evaluation in R 'MACER'
Assists biological researchers in assembling taxonomically and marker-focused molecular sequence data sets. 'MACER' accepts a list of genera as user input and uses NCBI-GenBank and BOLD as resources to download and assemble molecular sequence datasets. These datasets are then assembled by marker, aligned, trimmed, and cleaned. The use of this package allows the publication of specific parameters to ensure reproducibility. The 'MACER' package has four core functions, and an example run-through using all of these functions can be found in the associated repository <https://github.com/rgyoung6/MACER_example>.
Maintained by Robert G Young. Last updated 1 year ago.
0.5 match 2 stars 3.00 score 3 scripts
noramvillanueva
seq2R:Simple Method to Detect Compositional Changes in Genomic Sequences
This software is useful for loading '.fasta' or '.gbk' files, and for retrieving sequences from the 'GenBank' dataset <https://www.ncbi.nlm.nih.gov/genbank/>. The package detects differences or asymmetries in nucleotide composition by using local linear kernel smoothers. It is also possible to draw inference about critical points (i.e., maximum or minimum points) of the derivative curves. Additionally, bootstrap methods are used for estimating confidence intervals, and computational speed-up techniques (binning) are implemented in 'seq2R'.
Maintained by Nora M. Villanueva. Last updated 4 months ago.
bootstrap, change-points, dna-sequences, genome-analysis, machine-learning, nonparametric-statistics, regression, fortran
0.5 match 3.00 score 10 scripts
ropensci
onekp:Retrieve Data from the 1000 Plants Initiative (1KP)
The 1000 Plants Initiative (www.onekp.com) has sequenced the transcriptomes of over 1000 plant species. This package allows these sequences and metadata to be retrieved and filtered by code, species or recursively by clade. Scientific names and NCBI taxonomy IDs are both supported.
Maintained by Dhakal Rijan. Last updated 2 years ago.
0.5 match 13 stars 2.81 score 4 scripts
bgcarlisle
pubmedtk:'Pubmed' Toolkit
Provides various functions for retrieving and interpreting information from 'Pubmed' via the API, <https://www.ncbi.nlm.nih.gov/home/develop/api/>.
Maintained by Benjamin Gregory Carlisle. Last updated 1 year ago.
0.5 match 2.70 score
ericsleifer
factorial2x2:Design and Analysis of a 2x2 Factorial Trial
Used for the design and analysis of a 2x2 factorial trial for a time-to-event endpoint. It performs power calculations and significance testing and provides estimates of the relevant hazard ratios and the corresponding 95% confidence intervals. Important reference papers include Slud EV. (1994) <https://www.ncbi.nlm.nih.gov/pubmed/8086609>; Lin DY, Gong J, Gallo P, Bunn PH, Couper D. (2016) <DOI:10.1111/biom.12507>; and Leifer ES, Troendle JF, Kolecki A, Follmann DA. (2020) <https://github.com/EricSLeifer/factorial2x2/blob/master/Leifer%20et%20al.%20paper.pdf>.
Maintained by Eric Leifer. Last updated 5 years ago.
0.5 match 2.70 score
cran
BimodalIndex:The Bimodality Index
Defines the functions used to compute the bimodal index as defined by Wang et al. (2009) <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2730180/>, <doi:10.4137/CIN.S2846>.
Maintained by Kevin R. Coombes. Last updated 6 years ago.
0.5 match 2.48 score 1 dependent
cran
localScore:Package for Sequence Analysis by Local Score
Functionalities for calculating the local score and the statistical relevance (p-value) of finding a local score in a sequence of given distribution: S. Mercier and J.-J. Daudin (2001) <https://hal.science/hal-00714174/>; S. Karlin and S. Altschul (1990) <https://pmc.ncbi.nlm.nih.gov/articles/PMC53667/>; S. Mercier, D. Cellier and F. Charlot (2003) <https://hal.science/hal-00937529v1/>; A. Lagnoux, S. Mercier and P. Valois (2017) <doi:10.1093/bioinformatics/btw699>.
Maintained by David Robelin. Last updated 20 days ago.
0.5 match 2.30 score 6 scripts
largon-denayah
read.gb:Open GenBank Files
Opens complete record(s) with the .gb extension from the NCBI/GenBank Nucleotide database and returns a list containing shaped record(s). These kinds of files contain detailed records of DNA samples (locus, organism, type of sequence, source of the sequence...). An example record can be found at <https://www.ncbi.nlm.nih.gov/nuccore/HE799070>.
Maintained by Robin Mercier. Last updated 4 years ago.
0.8 match 1.48 score 5 scripts 1 dependent
peyronlab
aliases2entrez:Converts Human gene symbols to entrez IDs
Queries multiple resources, HGNC (2019) <https://www.genenames.org> and limma (2015) <doi:10.1093/nar/gkv007>, to find the correspondence between the evolving nomenclature of human gene symbols, aliases, previous symbols or synonyms and the stable, curated gene Entrez IDs from the NCBI database. This allows fast, accurate and up-to-date correspondence between human gene expression datasets from various dates and platforms (e.g., gene symbol: BRCA1 - ID: 672).
Maintained by Raphael Bonnet. Last updated 4 years ago.
0.5 match 1 star 2.00 score 10 scripts
jodamatta
SLOS:ICU Length of Stay Prediction and Efficiency Evaluation
Provides tools for predicting ICU length of stay and assessing ICU efficiency. It is based on the methodologies proposed by Peres et al. (2022, 2023), which utilize data-driven approaches for modeling and validation, offering insights into ICU performance and patient outcomes. References: Peres et al. (2022) <https://pubmed.ncbi.nlm.nih.gov/35988701/>, Peres et al. (2023) <https://pubmed.ncbi.nlm.nih.gov/37922007/>. More information: <https://github.com/igor-peres/ICU-Length-of-Stay-Prediction>.
Maintained by Joana da Matta. Last updated 1 month ago.
0.8 match 1.30 score
cran
infinitefactor:Bayesian Infinite Factor Models
Sampler and post-processing functions for semi-parametric Bayesian infinite factor models, motivated by the Multiplicative Gamma Shrinkage Prior of Bhattacharya and Dunson (2011) <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3419391/>. Contains component C++ functions for building samplers for linear and 2-way interaction factor models using the multiplicative gamma and Dirichlet-Laplace shrinkage priors. The package also contains post processing functions to return matrices that display rotational ambiguity to identifiability through successive application of orthogonalization procedures and resolution of column label and sign switching. This package was developed with the support of the National Institute of Environmental Health Sciences grant 1R01ES028804-01.
Maintained by Evan Poworoznek. Last updated 5 years ago.
0.5 match 1.70 score 7 scripts
cran
GPL2025:Convert Chip ID of the GPL2025 into GenBank Accession and ENTREZID
Convert the chip ID of GPL2025 <https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL2025> to GenBank Accession and ENTREZID <http://www.ncbi.nlm.nih.gov/gene>.
Maintained by Xiang Li. Last updated 5 years ago.
0.8 match 1.00 score
cran
Fstability:Calculate Feature Stability
Has two functions to help with calculating feature selection stability. 'Lump' is a function that groups subset vectors into a dataframe, and adds NA to shorter vectors so they all have the same length. 'ASM' is a function that takes a dataframe of subset vectors and the original vector of features as inputs, and calculates the stability of the feature selection. The calculation for 'ASM' uses the Adjusted Stability Measure proposed in: 'Lustgarten', 'Gopalakrishnan', & 'Visweswaran' (2009) <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2815476/>.
Maintained by Nicolas Ewen. Last updated 6 years ago.
0.5 match 1.00 score 2 scripts
syedhaider5
iDOS:Integrated Discovery of Oncogenic Signatures
A method to integrate molecular profiles of cancer patients (gene copy number and mRNA abundance) to identify candidate gain-of-function alterations. These candidate alterations can subsequently be tested further to discover cancer driver alterations. Briefly, this method tests for genomic correlates of mRNA dysregulation and prioritises those where DNA gains/amplifications are associated with elevated mRNA expression of the same gene. For details, see Haider S et al. (2016) "Genomic alterations underlie a pan-cancer metabolic shift associated with tumour hypoxia", Genome Biology, <https://pubmed.ncbi.nlm.nih.gov/27358048/>.
Maintained by Syed Haider. Last updated 1 year ago.
0.5 match 1.00 score 10 scripts