R-universe search: synonyms

ropensci

taxize:Taxonomic Information from Around the Web

Interacts with a suite of web application programming interfaces (API) for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more. Some of the services supported include 'NCBI E-utilities' (<https://www.ncbi.nlm.nih.gov/books/NBK25501/>), 'Encyclopedia of Life' (<https://eol.org/docs/what-is-eol/data-services>), 'Global Biodiversity Information Facility' (<https://techdocs.gbif.org/en/openapi/>), and many more. Links to the API documentation for other supported services are available in the documentation for their respective functions in this package.

Maintained by Zachary Foster. Last updated 14 days ago.

taxonomy biology nomenclature json api web api-client identifiers species names api-wrapper biodiversity darwincore data taxize

10.4 match 274 stars 13.63 score 1.6k scripts 23 dependents

njtierney

syn:Creates Synonyms From Target Words

Generates synonyms from a given word drawing from a synonym list from the 'moby' project <http://moby-thesaurus.org/>.

Maintained by Nicholas Tierney. Last updated 1 year ago.

antonyms ozunconf18 synonyms text-processing thesaurus unconf

15.8 match 52 stars 6.37 score 30 scripts 2 dependents

ctn-0094

DOPE:Drug Ontology Parsing Engine

Provides information on drug names (brand, generic and street) for drugs tracked by the DEA. There are functions that will search synonyms and return the drug names and types. The vignettes have extensive information on the work done to create the data for the package.

Maintained by Raymond Balise. Last updated 4 years ago.

11.9 match 21 stars 7.83 score 31 scripts

rapporter

rapportools:Miscellaneous (Stats) Helper Functions with Sane Defaults for Reporting

Helper functions that act as wrappers to more advanced statistical methods with the advantage of having sane defaults for quick reporting.

Maintained by Gergely Daróczi. Last updated 18 days ago.

11.5 match 8 stars 7.50 score 186 scripts 11 dependents

bioc

SynMut:SynMut: Designing Synonymously Mutated Sequences with Different Genomic Signatures

There is increasing demand for designing virus mutants with specific dinucleotide or codon composition. This tool can take dinucleotide preference, codon usage bias, or both into account while designing mutants. It is a powerful tool for in silico design of DNA sequence mutants.

Maintained by Haogao Gu. Last updated 5 months ago.

sequencematching experimentaldesign preprocessing

16.9 match 2 stars 4.30 score 1 script

interstellar-consultation-services

covid19dbcand:Selected 'Drugbank' Drugs for COVID-19 Treatment Related Data in R Format

Provides datasets parsed from the 'Drugbank' <https://www.drugbank.ca/covid-19> database using the 'dbparser' package. It is a smaller version of the 'dbdataset' package and contains only information about possible COVID-19 treatments.

Maintained by Mohammed Ali. Last updated 11 months ago.

dataset dbparser drugbank drugbank-database

15.8 match 3 stars 4.48 score 6 scripts

ropensci

rfishbase:R Interface to 'FishBase'

A programmatic interface to 'FishBase', re-written based on an accompanying 'RESTful' API. Access tables describing over 30,000 species of fish, their biology, ecology, morphology, and more. This package also supports experimental access to 'SeaLifeBase' data, which contains nearly 200,000 species records for all types of aquatic life not covered by 'FishBase.'

Maintained by Carl Boettiger. Last updated 3 months ago.

fish fishbase taxonomy

6.0 match 116 stars 10.11 score 764 scripts 2 dependents

ropensci

rotl:Interface to the 'Open Tree of Life' API

An interface to the 'Open Tree of Life' API to retrieve phylogenetic trees, information about studies used to assemble the synthetic tree, and utilities to match taxonomic names to 'Open Tree identifiers'. The 'Open Tree of Life' aims at assembling a comprehensive phylogenetic tree for all named species.

Maintained by Francois Michonneau. Last updated 2 years ago.

metadata ropensci phylogenetics independant-contrasts biodiversity peer-reviewed phylogeny taxonomy

4.9 match 40 stars 12.05 score 356 scripts 29 dependents

trinker

qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis

Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse, including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.

Maintained by Tyler Rinker. Last updated 4 years ago.

qdap quantitative-discourse-analysis text-analysis text-mining text-plotting openjdk

5.3 match 176 stars 9.61 score 1.3k scripts 3 dependents

bioc

AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor

Implements a user-friendly interface for querying SQLite-based annotation data packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation microarray sequencing genomeannotation bioconductor-package core-package

3.3 match 9 stars 15.05 score 3.6k scripts 769 dependents

selcukorkmaz

PubChemR:Interface to the 'PubChem' Database for Chemical Data Retrieval

Provides an interface to the 'PubChem' database via the PUG REST <https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest> and PUG View <https://pubchem.ncbi.nlm.nih.gov/docs/pug-view> services. This package allows users to automatically access chemical and biological data from 'PubChem', including compounds, substances, assays, and various other data types. Functions are available to retrieve data in different formats, perform searches, and access detailed annotations.

Maintained by Selcuk Korkmaz. Last updated 6 months ago.

8.3 match 2 stars 5.62 score 23 scripts

ropensci

worrms:World Register of Marine Species (WoRMS) Client

Client for World Register of Marine Species (<https://www.marinespecies.org/>). Includes functions for each of the API methods, including searching for names by name, date and common names, searching using external identifiers, fetching synonyms, as well as fetching taxonomic children and taxonomic classification.

Maintained by Bart Vanhoorne. Last updated 1 year ago.

biology science marine api web api-client worms species api-wrapper biological-data fish jerico-relevant marine-biology marine-species taxize taxonomy

4.5 match 27 stars 9.79 score 372 scripts 23 dependents

usepa

ctxR:Utilities for Interacting with the 'CTX' APIs

Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://www.epa.gov/comptox-tools/computational-toxicology-and-exposure-apis>. 'ctxR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.

Maintained by Paul Kruse. Last updated 2 months ago.

ccte comptox ord

5.2 match 10 stars 8.02 score 13 scripts 1 dependent

alexchristensen

SemNetDictionaries:Dictionaries for the 'SemNetCleaner' Package

Implements dictionaries that can be used in the 'SemNetCleaner' package. Also includes several functions aimed at facilitating the text cleaning analysis in the 'SemNetCleaner' package. This package is designed to integrate and update word lists and dictionaries based on each user's individual needs by allowing users to store and save their own dictionaries. Dictionaries can be added to the 'SemNetDictionaries' package by submitting user-defined dictionaries to <https://github.com/AlexChristensen/SemNetDictionaries>.

Maintained by Alexander P. Christensen. Last updated 3 years ago.

dictionaries semantic-network-analysis

8.0 match 4 stars 5.08 score 3 scripts 2 dependents

andysouth

rworldmap:Mapping Global Data

Enables mapping of country level and gridded user datasets.

Maintained by Andy South. Last updated 2 years ago.

3.3 match 30 stars 11.83 score 3.2k scripts 14 dependents

darwin-eu

CodelistGenerator:Identify Relevant Clinical Codes and Evaluate Their Use

Generate a candidate code list for the Observational Medical Outcomes Partnership (OMOP) common data model based on string matching. For a given search strategy, a candidate code list will be returned.

Maintained by Edward Burn. Last updated 27 days ago.

3.9 match 13 stars 9.87 score 165 scripts 4 dependents

bioc

annotate:Annotation for microarrays

Using R environments for annotation.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation pathways go

3.3 match 11.41 score 812 scripts 243 dependents

ropensci

taxlist:Handling Taxonomic Lists

Handling taxonomic lists through objects of class 'taxlist'. This package provides functions to import species lists from 'Turboveg' (<https://www.synbiosys.alterra.nl/turboveg/>) and the possibility to create backups from resulting R-objects. Also quick displays are implemented as summary-methods.

Maintained by Miguel Alvarez. Last updated 6 months ago.

4.8 match 12 stars 7.07 score 81 scripts 2 dependents

usepa

ccdR:Utilities for Interacting with the 'CTX' APIs

Access chemical, hazard, bioactivity, and exposure data from the Computational Toxicology and Exposure ('CTX') APIs <https://api-ccte.epa.gov/docs/>. 'ccdR' was developed to streamline the process of accessing the information available through the 'CTX' APIs without requiring prior knowledge of how to use APIs. Most data is also available on the CompTox Chemical Dashboard ('CCD') <https://comptox.epa.gov/dashboard/> and other resources found at the EPA Computational Toxicology and Exposure Online Resources <https://www.epa.gov/comptox-tools>.

Maintained by Paul Kruse. Last updated 8 months ago.

5.2 match 2 stars 6.38 score 7 scripts

satijalab

Seurat:Tools for Single Cell Genomics

A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.

Maintained by Paul Hoffman. Last updated 1 year ago.

human-cell-atlas single-cell-genomics single-cell-rna-seq cpp

1.8 match 2.4k stars 16.86 score 50k scripts 73 dependents

bioc

gDRutils:A package with helper functions for processing drug response data

This package contains utility functions used throughout the gDR platform to fit data, manipulate data, and convert and validate data structures. This package also has the necessary default constants for the gDR platform. Many of the functions are utilized by the gDRcore package.

Maintained by Arkadiusz Gladki. Last updated 6 days ago.

software infrastructure

3.8 match 2 stars 7.40 score 3 scripts 3 dependents

rfhb

ctrdata:Retrieve and Analyze Clinical Trials in Public Registers

A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', <https://www.clinicaltrialsregister.eu/>), 'ClinicalTrials.gov' (<https://clinicaltrials.gov/>, also translating queries from the retired classic interface), the 'ISRCTN' (<http://www.isrctn.com/>) and the 'European Union Clinical Trials Information System' ('CTIS', <https://euclinicaltrials.eu/>). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for meta-analysis and trend-analysis of the design and conduct as well as of the results of clinical trials across registers.

Maintained by Ralf Herold. Last updated 3 days ago.

clinical-data clinical-research clinical-studies clinical-trials ctgov database duckdb mongodb nodbi postgresql register sqlite studies trial

3.1 match 45 stars 7.92 score 32 scripts

trinker

qdapDictionaries:Dictionaries and Word Lists for the 'qdap' Package

A collection of text analysis dictionaries and word lists for use with the 'qdap' package.

Maintained by Tyler Rinker. Last updated 7 years ago.

4.0 match 4 stars 5.99 score 113 scripts 6 dependents

kurthornik

wordnet:WordNet Interface

An interface to WordNet using the Jawbone Java API to WordNet. WordNet (<https://wordnet.princeton.edu/>) is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. Please note that WordNet(R) is a registered tradename. Princeton University makes WordNet available to research and commercial users free of charge provided the terms of their license (<https://wordnet.princeton.edu/license-and-commercial-use>) are followed, and proper reference is made to the project using an appropriate citation (<https://wordnet.princeton.edu/citing-wordnet>). The WordNet database files need to be made available separately, either via package 'wordnetDicts' from <https://datacube.wu.ac.at>, installing system packages where available, or direct download from <https://wordnetcode.princeton.edu/3.0/WNdb-3.0.tar.gz>.

Maintained by Kurt Hornik. Last updated 9 months ago.

openjdk

7.4 match 2 stars 3.13 score 67 scripts

biagolini

GenTag:Generate Color Tag Sequences

Implements a coherent and flexible protocol for animal color tagging. 'GenTag' provides a simple computational routine with low CPU usage to create color sequences for animal tags. First, a single-color tag sequence is created from an algorithm selected by the user, followed by verification of the combination uniqueness. Three methods to produce color tag sequences are provided. Users can modify the main function core to allow a wide range of applications.

Maintained by Carlos Biagolini-Jr. Last updated 6 years ago.

6.0 match 3.70 score

gustavobio

flora:Tools for Interacting with the Brazilian Flora 2020

Tools to quickly compile taxonomic and distribution data from the Brazilian Flora 2020.

Maintained by Gustavo Carvalho. Last updated 1 year ago.

4.1 match 29 stars 5.37 score 54 scripts 1 dependent

predictiveecology

SpaDES.core:Core Utilities for Developing and Running Spatially Explicit Discrete Event Models

Provides the core framework for a discrete event system to implement a complete data-to-decisions, reproducible workflow. The core components facilitate the development of modular pieces, and enable the user to include additional functionality by running user-built modules. Includes conditional scheduling, restart after interruption, packaging of reusable modules, tools for developing arbitrary automated workflows, automated interweaving of modules of different temporal resolution, and tools for visualizing and understanding the within-project dependencies. The suggested package 'NLMR' can be installed from the repository (<https://PredictiveEcology.r-universe.dev>).

Maintained by Eliot J B McIntire. Last updated 21 days ago.

discrete-events-simulations simulation-framework simulation-modeling

2.0 match 10 stars 10.61 score 142 scripts 6 dependents

bioc

ontoProc:processing of ontologies of anatomy, cell lines, and so on

Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.

Maintained by Vincent Carey. Last updated 5 days ago.

infrastructure go bioinformatics genomics ontology

3.3 match 3 stars 6.37 score 75 scripts 2 dependents

ropensci

webchem:Chemical Information from the Web

Chemical information from around the web. This package interacts with a suite of web services for chemical information. Sources include: Alan Wood's Compendium of Pesticide Common Names, Chemical Identifier Resolver, ChEBI, Chemical Translation Service, ChemSpider, ETOX, Flavornet, NIST Chemistry WebBook, OPSIN, PubChem, SRS, Wikidata.

Maintained by Tamás Stirling. Last updated 3 months ago.

cas-number chemical-information chemspider identifier ropensci webscraping

2.0 match 165 stars 10.31 score 173 scripts 10 dependents

wevertonbio

florabr:Explore Flora e Funga do Brasil Database

A collection of functions designed to retrieve, filter and spatialize data from the Flora e Funga do Brasil dataset. For more information about the dataset, please visit <https://floradobrasil.jbrj.gov.br/consulta/>.

Maintained by Weverton Trindade. Last updated 3 months ago.

3.5 match 2 stars 5.72 score 12 scripts

s87jackson

rfars:Download and Analyze Crash Data

Download crash data from the National Highway Traffic Safety Administration and prepare it for research.

Maintained by Steve Jackson. Last updated 12 months ago.

crash fatalities official-statistics transportation

3.6 match 10 stars 5.35 score 15 scripts

nataliepatten

gatoRs:Geographic and Taxonomic Occurrence R-Based Scrubbing

Streamlines downloading and cleaning biodiversity data from Integrated Digitized Biocollections (iDigBio) and the Global Biodiversity Information Facility (GBIF).

Maintained by Natalie N. Patten. Last updated 10 months ago.

3.1 match 11 stars 6.16 score 66 scripts

cran

Rdiagnosislist:Manipulate SNOMED CT Diagnosis Lists

Functions and methods for manipulating 'SNOMED CT' concepts. The package contains functions for loading the 'SNOMED CT' release into a convenient R environment, selecting 'SNOMED CT' concepts using regular expressions, and navigating the 'SNOMED CT' ontology. It provides the 'SNOMEDconcept' S3 class for a vector of 'SNOMED CT' concepts (stored as 64-bit integers) and the 'SNOMEDcodelist' S3 class for a table of concepts IDs with descriptions. The package can be used to construct sets of 'SNOMED CT' concepts for research (<doi:10.1093/jamia/ocac158>). For more information about 'SNOMED CT' visit <https://www.snomed.org/>.

Maintained by Anoop D. Shah. Last updated 2 months ago.

5.1 match 1 star 3.60 score

bioc

OmnipathR:OmniPath web service client and more

A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).

Maintained by Denes Turei. Last updated 21 days ago.

graphandnetwork network pathways software thirdpartyclient dataimport datarepresentation genesignaling generegulation systemsbiology transcriptomics singlecell annotation kegg complexes enzyme-ptm networks networks-biology omnipath proteins quarto

1.8 match 126 stars 9.90 score 226 scripts 2 dependents

winvector

rquery:Relational Query Generator for Data Manipulation at Scale

A piped query generator based on Edgar F. Codd's relational algebra, and on production experience using 'SQL' and 'dplyr' at big data scale. The design represents an attempt to make 'SQL' more teachable by denoting composition by a sequential pipeline notation instead of nested queries or functions. The implementation delivers reliable high performance data processing on large data systems such as 'Spark', databases, and 'data.table'. Package features include: data processing trees or pipelines as observable objects (able to report both columns produced and columns used), optimized 'SQL' generation as an explicit user visible table modeling step, plus explicit query reasoning and checking.

Maintained by John Mount. Last updated 2 years ago.

1.8 match 110 stars 9.53 score 126 scripts 3 dependents

bioc

BioQC:Detect tissue heterogeneity in expression profiles with gene sets

BioQC performs quality control of high-throughput expression data based on tissue gene signatures. It can detect tissue heterogeneity in gene expression data. The core algorithm is a Wilcoxon-Mann-Whitney test that is optimised for high performance.

Maintained by Jitao David Zhang. Last updated 5 months ago.

geneexpression qualitycontrol statisticalmethod genesetenrichment cpp

2.0 match 5 stars 8.16 score 86 scripts

skranz

stringtools:Tools for working with strings in R

Tools for working with strings in R

Maintained by Sebastian Kranz. Last updated 3 years ago.

4.3 match 2 stars 3.66 score 29 scripts 26 dependents

matutosi

moranajp:Morphological Analysis for Japanese

Supports morphological analysis for Japanese by using 'MeCab' <https://taku910.github.io/mecab/>, 'Sudachi' <https://github.com/WorksApplications/Sudachi>, 'Chamame' <https://chamame.ninjal.ac.jp/>, or 'Ginza' <https://github.com/megagonlabs/ginza>. Can input a data.frame and obtain all results of 'MeCab' and the row number of the original data.frame as a text id.

Maintained by Toshikazu Matsumura. Last updated 8 months ago.

3.8 match 4.13 score 17 scripts

ropensci

taxadb:A High-Performance Local Taxonomic Database Interface

Creates a local database of many commonly used taxonomic authorities and provides functions that can quickly query this data.

Maintained by Carl Boettiger. Last updated 11 months ago.

2.0 match 43 stars 7.68 score 53 scripts 1 dependent

bioc

brendaDb:The BRENDA Enzyme Database

R interface for importing and analyzing enzyme information from the BRENDA database.

Maintained by Yi Zhou. Last updated 5 months ago.

thirdpartyclient annotation dataimport brenda database enzyme hacktoberfest cpp

3.2 match 2 stars 4.60 score 4 scripts

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

1.8 match 3 stars 8.20 score 7.8k scripts 11 dependents

ryentes

careless:Procedures for Computing Indices of Careless Responding

When taking online surveys, participants sometimes respond to items without regard to their content. These types of responses, referred to as careless or insufficient effort responding, constitute significant problems for data quality, leading to distortions in data analysis and hypothesis testing, such as spurious correlations. The 'R' package 'careless' provides solutions designed to detect such careless / insufficient effort responses by allowing easy calculation of indices proposed in the literature. It currently supports the calculation of longstring, even-odd consistency, psychometric synonyms/antonyms, Mahalanobis distance, and intra-individual response variability (also termed inter-item standard deviation). For a review of these methods, see Curran (2016) <doi:10.1016/j.jesp.2015.07.006>.

Maintained by Richard Yentes. Last updated 1 year ago.

2.3 match 26 stars 6.34 score 85 scripts

ropensci

ritis:Integrated Taxonomic Information System Client

An interface to the Integrated Taxonomic Information System ('ITIS') (<https://www.itis.gov>). Includes functions to work with the 'ITIS' REST API methods (<https://www.itis.gov/ws_description.html>), as well as the 'Solr' web service (<https://www.itis.gov/solr_documentation.html>).

Maintained by Julia Blum. Last updated 1 month ago.

taxonomy biology nomenclature json api web api-client identifiers species names api-wrapper itis taxize

1.9 match 16 stars 7.72 score 64 scripts 24 dependents

aliyoussef96

vhcub:Virus-Host Codon Usage Co-Adaptation Analysis

Analyzes the co-adaptation of codon usage between a virus and its host, and calculates various codon usage bias measurements such as effective number of codons (ENc) Novembre (2002) <doi:10.1093/oxfordjournals.molbev.a004201>, codon adaptation index (CAI) Sharp and Li (1987) <doi:10.1093/nar/15.3.1281>, relative codon deoptimization index (RCDI) Puigbò et al (2010) <doi:10.1186/1756-0500-3-87>, similarity index (SiD) Zhou et al (2013) <doi:10.1371/journal.pone.0077239>, synonymous codon usage orderliness (SCUO) Wan et al (2004) <doi:10.1186/1471-2148-4-19>, and relative synonymous codon usage (RSCU) Sharp et al (1986) <doi:10.1093/nar/14.13.5125>. It also provides a statistical analysis of dinucleotide over- and underrepresentation with three different models, and implements several methods for visualizing codon usage, such as ENc.GC3plot() and PR2.plot().

Maintained by Ali Mostafa Anwar. Last updated 1 year ago.

4.4 match 4 stars 3.30 score

jokergoo

GlobalOptions:Generate Functions to Get or Set Global Options

It provides more configurations on the option values such as validation and filtering on the values, making options invisible or private.

Maintained by Zuguang Gu. Last updated 9 months ago.

1.5 match 9 stars 9.48 score 40 scripts 217 dependents

ftwkoopmans

goat:Gene Set Analysis Using the Gene Set Ordinal Association Test

Perform gene set enrichment analyses using the Gene set Ordinal Association Test (GOAT) algorithm and visualize your results. Koopmans, F. (2024) <doi:10.1038/s42003-024-06454-5>.

Maintained by Frank Koopmans. Last updated 24 days ago.

bioinformatics geneset-enrichment geneset-enrichment-analysis cpp openmp

3.2 match 10 stars 4.40 score 8 scripts

pasraia

RRphylo:Phylogenetic Ridge Regression Methods for Comparative Studies

Functions for phylogenetic analysis (Castiglione et al., 2018 <doi:10.1111/2041-210X.12954>). The functions perform the estimation of phenotypic evolutionary rates, identification of phenotypic evolutionary rate shifts, quantification of direction and size of evolutionary change in multivariate traits, the computation of ontogenetic shape vectors and test for morphological convergence.

Maintained by Silvia Castiglione. Last updated 1 day ago.

1.8 match 10 stars 7.78 score 83 scripts

bioc

VarCon:VarCon: an R package for retrieving neighboring nucleotides of an SNV

VarCon is an R package which converts the positional information from the annotation of a single nucleotide variation (SNV) (either referring to the coding sequence or the reference genomic sequence). It retrieves the genomic reference sequence around the position of the single nucleotide variation. To assess whether the SNV could potentially influence binding of splicing regulatory proteins, VarCon calculates the HEXplorer score as an estimate. VarCon additionally reports splice site strengths of splice sites within the retrieved genomic sequence and any changes due to the SNV.

Maintained by Johannes Ptok. Last updated 5 months ago.

functionalgenomics alternativesplicing

3.3 match 4.00 score 5 scripts

bernd-mueller

epos:Epilepsy Ontologies' Similarities

Analysis and visualization of similarities between epilepsy ontologies based on text mining results by comparing ranked lists of co-occurring drug terms in the BioASQ corpus. The ranked result lists of neurological drug terms co-occurring with terms from the epilepsy ontologies EpSO, ESSO, EPILONT, EPISEM and FENICS undergo further analysis. The source data to create the ranked lists of drug names is produced using the text mining workflows described in Mueller, Bernd and Hagelstein, Alexandra (2016) <doi:10.4126/FRL01-006408558>, Mueller, Bernd et al. (2017) <doi:10.1007/978-3-319-58694-6_22>, Mueller, Bernd and Rebholz-Schuhmann, Dietrich (2020) <doi:10.1007/978-3-030-43887-6_52>, and Mueller, Bernd et al. (2022) <doi:10.1186/s13326-021-00258-w>.

Maintained by Bernd Mueller. Last updated 1 year ago.

3.3 match 4.03 score 53 scripts

sborstein

AnnotationBustR:Extract Subsequences from GenBank Annotations

Extraction of subsequences into FASTA files from GenBank annotations where gene names may vary among accessions. Borstein & O'Meara (2018) <doi:10.7717/peerj.5179>.

Maintained by Samuel R. Borstein. Last updated 6 months ago.

2.8 match 5 stars 4.78 score 12 scripts

mt1022

cubar:Codon Usage Bias Analysis

A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene- specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables.

Maintained by Hong Zhang. Last updated 3 months ago.

bioinformatics codon-usage machine-learning sequence-analysis

2.3 match 6 stars 5.82 score 8 scripts

cran

ensembleTax:Ensemble Taxonomic Assignments of Amplicon Sequencing Data

Creates ensemble taxonomic assignments of amplicon sequencing data in R using outputs of multiple taxonomic assignment algorithms and/or reference databases. Includes flexible algorithms for mapping taxonomic nomenclatures onto one another and for computing ensemble taxonomic assignments.

Maintained by Dylan Catlett. Last updated 4 years ago.

4.8 match 2.48 score 7 scripts

fguenther

LSAfun:Applied Latent Semantic Analysis (LSA) Functions

Provides functions that allow for convenient working with vector space models of semantics/distributional semantic models/word embeddings. Originally built for LSA models (hence the name), but can be used for all such vector-based models. For actually building a vector semantic space, use the package 'lsa' or other specialized software. Downloadable semantic spaces can be found at <https://sites.google.com/site/fritzgntr/software-resources>.

Maintained by Fritz Guenther. Last updated 1 year ago.

3.6 match 1 star 3.18 score 85 scripts 1 dependent

bioc

doubletrouble:Identification and classification of duplicated genes

doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.

Maintained by Fabrício Almeida-Silva. Last updated 5 days ago.

software wholegenome comparativegenomics functionalgenomics phylogenetics network classification bioinformatics comparative-genomics gene-duplication molecular-evolution whole-genome-duplication

1.8 match 23 stars 6.44 score 17 scripts

snoweye

cubfits:Codon Usage Bias Fits

Estimating mutation and selection coefficients on synonymous codon usage bias based on models of ribosome overhead cost (ROC). Multinomial logistic regression and Markov Chain Monte Carlo are used to estimate and predict protein production rates with/without the presence of expressions and measurement errors. Workflows with examples for simulation, estimation and prediction processes are also provided with parallelization speedup. The whole framework is tested with yeast genome and gene expression data of Yassour, et al. (2009) <doi:10.1073/pnas.0812841106>.

Maintained by Wei-Chen Chen. Last updated 3 years ago.

2.2 match 7 stars 4.83 score 32 scripts

bioc

regutools:regutools: an R package for data extraction from RegulonDB

RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.

Maintained by Joselyn Chavez. Last updated 3 months ago.

generegulation geneexpression systemsbiology network networkinference visualization transcription bioconductor cdsb regulondb

2.0 match 4 stars 5.20 score 6 scripts

ropensci

naijR:Operations to Ease Data Analyses Specific to Nigeria

A set of convenience functions as well as geographical/political data about Nigeria, aimed at simplifying work with data and information that are specific to the country.

Maintained by Victor Ordu. Last updated 5 months ago.

1.9 match 12 stars 5.38 score 9 scripts

phuse-org

sendigR:Enable Cross-Study Analysis of 'CDISC' 'SEND' Datasets

A system that enables cross-study analysis by extracting and filtering study data for control animals from a 'CDISC' 'SEND' study repository. These data types are supported: body weights, laboratory test results and microscopic findings. These database types are supported: 'SQLite' and 'Oracle'.

Maintained by Wenxian Wang. Last updated 12 days ago.

1.6 match 12 stars 6.28 score 6 scripts

ha-pu

globaltrends:Download and Measure Global Trends Through Google Search Volumes

Google offers public access to global search volumes from its search engine through the Google Trends portal. The package downloads these search volumes provided by Google Trends and uses them to measure and analyze the distribution of search scores across countries or within countries. The package allows researchers and analysts to use these search scores to investigate global trends based on patterns within these scores. This offers insights such as degree of internationalization of firms and organizations or dissemination of political, social, or technological trends across the globe or within single countries. An outline of the package's methodological foundations and potential applications is available as a working paper: <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3969013>.

Maintained by Harald Puhr. Last updated 2 years ago.

google-trends internationalization

1.9 match 18 stars 5.00 score 11 scripts

jpearson0525

micromapST:Linked Micromap Plots for U. S. and Other Geographic Areas

Provides users with the ability to quickly create linked micromap plots for a collection of geographic areas. Linked micromap plots are visualizations of geo-referenced data that link statistical graphics to an organized series of small maps or graphic images. The Help description contains examples of how to use the 'micromapST' function. Contained in this package are border group datasets to support creating linked micromap plots for the 50 U.S. states and District of Columbia (51 areas), the 20 U.S. SEER Registries, the 105 counties in the state of Kansas, the 62 counties of New York, the 24 counties of Maryland, the 29 counties of Utah, the 32 administrative areas in China, the 218 administrative areas in the UK and Ireland (for testing only), the 25 districts in the city of Seoul, South Korea, and the 52 countries on the African continent. A border group dataset contains the boundaries related to the data level areas, a second layer of boundaries, a top or third layer boundary, a parameter list of run options, and a cross indexing table between area names, abbreviations, numeric identification and alias matching strings for the specific geographic area. By specifying a border group, the package creates linked micromap plots for any geographic region. The user can create and provide their own border group dataset for any area beyond the areas contained within the package. In version 3.0.0, the 'BuildBorderGroup' function was upgraded to not use the retiring 'maptools', 'rgdal', and 'rgeos' packages. References: Carr and Pickle, Visualizing Data Patterns with Micromaps, Chapman and Hall/CRC Press, 2010. Pickle, Pearson, and Carr (2015), micromapST: Exploring and Communicating Geospatial Patterns in US State Data, Journal of Statistical Software, 63(3), 1-25, <https://www.jstatsoft.org/v63/i03/>. Copyrighted 2013, 2014, 2015, 2016, 2022, 2023, 2024, and 2025 by Carr, Pearson and Pickle.

Maintained by Jim Pearson. Last updated 1 month ago.

3.3 match 2.80 score 21 scripts

greifflab

immuneSIM:Tunable Simulation of B- And T-Cell Receptor Repertoires

Simulate full B-cell and T-cell receptor repertoires using an in silico recombination process that includes a wide variety of tunable parameters to introduce noise and biases. Additional post-simulation modification functions allow the user to implant motifs or codon biases as well as to remodel the sequence similarity architecture. The output repertoires contain records of all relevant repertoire dimensions and can be analyzed using the provided repertoire analysis functions. A preprint is available at bioRxiv (Weber et al., 2019 <doi:10.1101/759795>).

Maintained by Cédric R. Weber. Last updated 1 year ago.

1.9 match 37 stars 4.44 score 15 scripts

bioc

OmicsMLRepoR:Search harmonized metadata created under the OmicsMLRepo project

This package provides functions to browse the harmonized metadata for large omics databases. This package also supports data navigation if the metadata incorporates ontology.

Maintained by Sehyun Oh. Last updated 1 month ago.

software infrastructure datarepresentation

1.5 match 5.40 score 14 scripts

ropensci

DataSpaceR:Interface to 'the CAVD DataSpace'

Provides a convenient interface to access immunological data within 'the CAVD DataSpace' (<https://dataspace.cavd.org>), a data sharing and discovery tool that facilitates exploration of HIV immunological data from pre-clinical and clinical HIV vaccine studies.

Maintained by Jason Taylor. Last updated 26 days ago.

cavd-dataspace

1.2 match 5 stars 6.72 score 42 scripts

theropod1

paleoDiv:Extracting and Visualizing Paleobiodiversity

Contains various tools for conveniently downloading and editing taxon-specific datasets from the Paleobiology Database <https://paleobiodb.org>, extracting information on abundance, temporal distribution of subtaxa and taxonomic diversity through deep time, and visualizing these data in relation to phylogeny and stratigraphy.

Maintained by Darius Nau. Last updated 5 months ago.

3.0 match 2 stars 2.60 score

florianjansen

vegdata:Access Vegetation Databases and Treat Taxonomy

Handling of vegetation data from different sources (Turboveg 2.0 <https://www.synbiosys.alterra.nl/turboveg/>, the German national repository <https://www.vegetweb.de>, and others). Taxonomic harmonization (given appropriate taxonomic lists, e.g. the German taxonomic standard list "GermanSL", <https://germansl.infinitenature.org>).

Maintained by Florian Jansen. Last updated 1 year ago.

2.0 match 2 stars 3.84 score 38 scripts 3 dependents

ropensci

dwctaxon:Edit and Validate Darwin Core Taxon Data

Edit and validate taxonomic data in compliance with Darwin Core standards (Darwin Core 'Taxon' class <https://dwc.tdwg.org/terms/#taxon>).

Maintained by Joel H. Nitta. Last updated 8 months ago.

database

1.3 match 6 stars 6.13 score 28 scripts

joelnitta

taxastand:Taxonomic Name Standardization

Matches species names to a taxonomic standard. Resolves synonyms consistently and reproducibly.

Maintained by Joel Nitta. Last updated 2 years ago.

database taxonomy

2.4 match 20 stars 3.04 score 11 scripts

bmaitner

TNRS:Taxonomic Name Resolution Service

Provides access to the Taxonomic Name Resolution Service <https://github.com/ojalaquellueva/tnrsapi> through R. The user supplies plant taxonomic names and the package returns resolved taxonomic names along with information on decisions. Optionally, the package can also be used to parse taxonomic names.

Maintained by Brian Maitner. Last updated 10 months ago.

1.9 match 3.91 score 41 scripts

kamapu

vegtable:Handling Vegetation Data Sets

Import and handling of data from vegetation-plot databases, especially data stored in 'Turboveg 2' (<https://www.synbiosys.alterra.nl/turboveg/>). Import/export routines for exchanging data with 'Juice' (<https://www.sci.muni.cz/botany/juice/>) are also implemented.

Maintained by Miguel Alvarez. Last updated 8 months ago.

1.7 match 7 stars 4.23 score 49 scripts

clandere

AnaCoDa:Analysis of Codon Data under Stationarity using a Bayesian Framework

A collection of models to analyze genome-scale codon data using a Bayesian framework. Provides visualization routines and checkpointing for model fittings. Currently published models to analyze gene data for selection on codon usage based on Ribosome Overhead Cost (ROC) are: ROC (Gilchrist et al. (2015) <doi:10.1093/gbe/evv087>), and ROC with phi (Wallace & Drummond (2013) <doi:10.1093/molbev/mst051>). In addition, 'AnaCoDa' contains three currently unpublished models. The FONSE (First order approximation On NonSense Error) model analyzes gene data for selection on codon usage against nonsense error rates. The PA (PAusing time) and PANSE (PAusing time + NonSense Error) models use ribosome footprinting data to estimate ribosome pausing times with and without nonsense error rates.

Maintained by Cedric Landerer. Last updated 4 years ago.

cpp openmp

1.8 match 1 star 4.00 score 100 scripts

roelandkindt

WorldFlora:Standardize Plant Names According to World Flora Online Taxonomic Backbone

World Flora Online is an online flora of all known plants, available from <https://www.worldfloraonline.org/>. Methods are provided for matching a list of plant names (scientific names, taxonomic names, botanical names) against a static copy of the World Flora Online Taxonomic Backbone data that can be downloaded from the World Flora Online website. The World Flora Online Taxonomic Backbone is an updated version of The Plant List (<http://www.theplantlist.org/>), a working list of plant names that has become static since 2013.

Maintained by Roeland Kindt. Last updated 6 months ago.

2.3 match 3 stars 3.09 score 33 scripts 1 dependent

otoliths

SP2000:Catalogue of Life Toolkit

A programmatic interface to <http://sp2000.org.cn>, re-written based on an accompanying 'Species 2000' API. Access tables describing the catalogue of known Chinese species of animals, plants, fungi, micro-organisms, and more. This package also supports access to the global Catalogue of Life <http://catalogueoflife.org>, the China animal scientific database <http://zoology.especies.cn> and the Catalogue of Life Taiwan <https://taibnet.sinica.edu.tw/home_eng.php>. The development of the 'SP2000' package was supported by the Biodiversity Survey and Assessment Project of the Ministry of Ecology and Environment, China <2019HJ2096001006>, Yunnan University's "Double First Class" Project <C176240405> and Yunnan University's Research Innovation Fund for Graduate Students <2019227>.

Maintained by Liuyong Ding. Last updated 1 year ago.

animals biodiversity catalogue-of-life-china catalogue-of-life-global-checklist catalogue-of-life-taiwan-checklist china china-animal-scientific-database fungi microorganisms plants redlist-of-chinese-biodiversity species2000

1.8 match 13 stars 3.81 score 3 scripts

rekyt

rtaxref:An R client for TaxRef the French Taxonomical Database

Provides an R client to the TaxRef API <https://taxref.mnhn.fr/taxref-web/api/doc>, the French Taxonomical Reference Database which indexes names of species with unique identifiers as well as conservation statuses, biological interactions and taxonomic relationships.

Maintained by Matthias Grenié. Last updated 3 years ago.

api api-client api-wrapper biodiversity taxonomy

1.9 match 5 stars 3.40 score 5 scripts

ropensci

rusda:Interface to USDA Databases

An interface to the web service methods provided by the United States Department of Agriculture (USDA). The Agricultural Research Service (ARS) provides a large set of databases. The current version of the package holds interfaces to the Systematic Mycology and Microbiology Laboratory (SMML), which consists of four databases: Fungus-Host Distributions, Specimens, Literature and the Nomenclature database. It provides functions for querying these databases. The main function is associations(), which allows searching for fungus-host combinations.

Maintained by Franz-Sebastian Krah. Last updated 4 years ago.

1.8 match 14 stars 3.54 score 5 scripts

mczyzj

pestr:Interface to Download Data on Pests and Hosts from 'EPPO'

A set of tools to automate extraction of data on pests from 'EPPO Data Services' and 'EPPO Global Database' and to put them into tables in a human-readable format. These functions use the 'EPPO database API', so you first need to register on <https://data.eppo.int> (free of charge). Additional helpers allow downloading, checking and connecting to the 'SQLite EPPO database'.

Maintained by Michal Jan Czyz. Last updated 6 months ago.

1.2 match 6 stars 5.26 score

bioc

MSA2dist:MSA2dist calculates pairwise distances between all sequences of a DNAStringSet or an AAStringSet using a custom score matrix and conducts codon based analysis

MSA2dist calculates pairwise distances between all sequences of a DNAStringSet or an AAStringSet using a custom score matrix and conducts codon-based analysis. The scoring matrices used in these pairwise distance calculations can be adapted to any scoring for DNA or AA characters. E.g., by using literal distances, MSA2dist calculates pairwise IUPAC distances.

Maintained by Kristian K Ullrich. Last updated 4 months ago.

alignment sequencing genetics go cpp

1.2 match 5.02 score 7 scripts 1 dependent

sergeitarasov

ontoFAST:Interactive Annotation of Characters with Biological Ontologies

Tools for annotating characters (character matrices) with anatomical and phenotype ontologies. Includes functions for visualising character annotations and creating simple queries using ontological relationships.

Maintained by Sergei Tarasov. Last updated 3 years ago.

annotations character-matrices characters ontology phylogenetics

1.9 match 2 stars 3.00 score 5 scripts

sciviews

svMisc:Miscellaneous Functions for 'SciViews::R'

Functions required for the 'SciViews::R' dialect or for general use: manage a temporary environment attached to the search path, define synonyms for R functions using aka(), test if 'Aqua', 'Mac', 'Win' ... Show progress bar, etc.

Maintained by Philippe Grosjean. Last updated 4 months ago.

gui sciviews

0.5 match 3 stars 8.35 score 380 scripts 16 dependents

cran

fqacalc:Calculate Floristic Quality Assessment Metrics

A collection of functions for calculating Floristic Quality Assessment (FQA) metrics using regional FQA databases that have been approved or approved with reservations as ecological planning models by the U.S. Army Corps of Engineers (USACE). For information on FQA see Spyreas (2019) <doi:10.1002/ecs2.2825>. These databases are stored in a sister R package, 'fqadata'. Both packages were developed for the USACE by the U.S. Army Engineer Research and Development Center’s Environmental Laboratory.

Maintained by Iris Foxfoot. Last updated 1 year ago.

1.3 match 2.70 score

fwild

lsa:Latent Semantic Analysis

The basic idea of latent semantic analysis (LSA) is that texts have a higher-order (= latent semantic) structure which, however, is obscured by word usage (e.g. through the use of synonyms or polysemy). By using conceptual indices that are derived statistically via a truncated singular value decomposition (a two-mode factor analysis) over a given document-term matrix, this variability problem can be overcome.

Maintained by Fridolin Wild. Last updated 3 years ago.

0.5 match 6.25 score 1.2k scripts 23 dependents

lerouzic

vhica:Vertical and Horizontal Inheritance Consistence Analysis

The "Vertical and Horizontal Inheritance Consistence Analysis" method is described in the following publication: "VHICA: a new method to discriminate between vertical and horizontal transposon transfer: application to the mariner family within Drosophila" by G. Wallau et al. (2016) <DOI:10.1093/molbev/msv341>. The purpose of the method is to detect horizontal transfers of transposable elements by contrasting the divergence of transposable element sequences with that of regular genes.

Maintained by Arnaud Le Rouzic. Last updated 2 years ago.

1.9 match 1 star 1.70 score 6 scripts

stevecondylios

dictionaRy:Retrieve the Dictionary Definitions of English Words

An R interface to the 'Free Dictionary API' <https://dictionaryapi.dev/>, <https://github.com/meetDeveloper/freeDictionaryAPI>. Retrieve dictionary definitions for English words, as well as additional information including phonetics, part of speech, origins, audio pronunciation, example usage, synonyms and antonyms, returned in 'tidy' format for ease of use.

Maintained by Steve Condylios. Last updated 3 years ago.

literature natural-language-processing r-language

0.5 match 6 stars 4.86 score 240 scripts

cran

GenomicSig:Computation of Genomic Signatures

Genomic signatures represent unique features within a species' DNA, enabling the differentiation of species and offering broad applications across various fields. This package provides essential tools for calculating these specific signatures, streamlining the process for researchers and offering a comprehensive and time-saving solution for genomic analysis. The amino acid contents are identified based on the work published by Sandberg et al. (2003) <doi:10.1016/s0378-1119(03)00581-x> and Xiao et al. (2015) <doi:10.1093/bioinformatics/btv042>. The Average Mutual Information Profiles (AMIP) values are calculated based on the work of Bauer et al. (2008) <doi:10.1186/1471-2105-9-48>. The Chaos Game Representation (CGR) plot visualization was done based on the work of Deschavanne et al. (1999) <doi:10.1093/oxfordjournals.molbev.a026048> and Jeffrey et al. (1990) <doi:10.1093/nar/18.8.2163>. The GC content is calculated based on the work published by Nakabachi et al. (2006) <doi:10.1126/science.1134196> and Barbu et al. (1956) <https://pubmed.ncbi.nlm.nih.gov/13363015>. The Oligonucleotide Frequency Derived Error Gradient (OFDEG) values are computed based on the work published by Saeed et al. (2009) <doi:10.1186/1471-2164-10-S3-S10>. The Relative Synonymous Codon Usage (RSCU) values are calculated based on the work published by Elek (2018) <https://urn.nsk.hr/urn:nbn:hr:217:686131>.

Maintained by Anu Sharma. Last updated 6 months ago.

2.4 match 1.00 score

cran

discoverableresearch:Checks Title, Abstract and Keywords to Optimise Discoverability

A suite of tools is provided here to support authors in making their research more discoverable. check_keywords() - this function checks the keywords to assess whether they are already represented in the title and abstract. check_fields() - this function compares terminology used across the title, abstract and keywords to assess where terminological diversity (i.e. the use of synonyms) could increase the likelihood of the record being identified in a search. The function looks for terms in the title and abstract that also exist in other fields and highlights these as needing attention. suggest_keywords() - this function takes a full text document and produces a list of unigrams, bigrams and trigrams (1-, 2- or 3-word phrases) present in the full text after removing stop words (words with a low utility in natural language processing) that do not occur in the title or abstract that may be suitable candidates for keywords. suggest_title() - this function takes a full text document and produces a list of the most frequently used unigrams, bigrams and trigrams after removing stop words that do not occur in the abstract or keywords that may be suitable candidates for title words. check_title() - this function carries out a number of sub tasks: 1) it compares the length (number of words) of the title with the mean length of titles in major bibliographic databases to assess whether the title is likely to be too short; 2) it assesses the proportion of stop words in the title to highlight titles with low utility in search engines that strip out stop words; 3) it compares the title with a given sample of record titles from an .ris import and calculates a similarity score based on phrase overlap. This highlights the level of uniqueness of the title. This version of the package also contains functions currently in a non-CRAN package called 'litsearchr' <https://github.com/elizagrames/litsearchr>.

Maintained by Neal Haddaway. Last updated 4 years ago.

0.5 match 2.70 score

fvidoli

SpatialRegimes:Spatial Constrained Clusterwise Regression

A collection of functions for estimating spatial regimes, aggregations of neighboring spatial units that are homogeneous in functional terms. The term spatial regime, therefore, should not be understood as a synonym for cluster. More precisely, the term cluster does not presuppose any functional relationship between the variables considered, while the term regime is linked to a regressive relationship underlying the spatial process.

Maintained by Francesco Vidoli. Last updated 2 years ago.

0.5 match 2.00 score

peyronlab

aliases2entrez:Converts Human gene symbols to entrez IDs

Queries multiple resources (HGNC (2019) <https://www.genenames.org> and limma (2015) <doi:10.1093/nar/gkv007>) to find the correspondence between the evolving nomenclature of human gene symbols, aliases, previous symbols or synonyms and stable, curated gene Entrez IDs from the NCBI database. This allows fast, accurate and up-to-date correspondence between human gene expression datasets from various dates and platforms (e.g. gene symbol: BRCA1 - ID: 672).

Maintained by Raphael Bonnet. Last updated 4 years ago.

0.5 match 1 star 2.00 score 10 scripts

cran

SADEG:Stability Analysis in Differentially Expressed Genes

We analyzed the nucleotide composition of genes with a special emphasis on the stability of DNA sequences. In addition, unequal use of synonymous codons, or codon usage bias, occurs in a variety of organisms and also varies among genes in the same genome. Codon usage bias appears to be affected by both selective constraints and mutation bias, which enables us to examine and detect changes in these two evolutionary forces between genomes or along one genome. Therefore, we determined the codon adaptation index (CAI), the effective number of codons (ENC) and codon usage through calculation of the relative synonymous codon usage (RSCU), and subsequently predicted the translation efficiency and accuracy through GC-rich codon usage. Furthermore, we estimated the relative stability of the DNA sequence by calculating the average free energy (Delta G) and the dimer base-stacking energy level.

Maintained by Babak Khorsand. Last updated 8 years ago.

0.8 match 1.00 score

leilamarvian

PreProcessRecordLinkage:Preprocessing Record Linkage

In this record linkage package, data preprocessing covers a wide range of datasets and standardizes variable names using synonyms. This approach facilitates data integration and analysis across various datasets. Users can modify variable names, but changes are only permitted when they do not compromise data consistency or the essential meaning of a variable.

Maintained by Leila Marvian Mashhad. Last updated 2 years ago.

0.5 match 1.00 score