Showing 200 of 1017 total results
tiledb-inc
tiledb: Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays
The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-value stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.
Maintained by Isaiah Norton. Last updated 4 days ago.
array hdfs s3 storage-manager tiledb cpp
96.8 match 107 stars 11.96 score 306 scripts 4 dependents
r-dbi
DBI: R Database Interface
A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.
Maintained by Kirill Müller. Last updated 3 months ago.
23.0 match 302 stars 20.88 score 19k scripts 2.9k dependents
pecanproject
PEcAn.DB: PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 2 days ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants
35.6 match 216 stars 11.88 score 127 scripts 27 dependents
ropensci
osmdata: Import 'OpenStreetMap' Data as Simple Features or Spatial Objects
Download and import of 'OpenStreetMap' ('OSM') data as 'sf' or 'sp' objects. 'OSM' data are extracted from the 'Overpass' web server (<https://overpass-api.de/>) and processed with very fast 'C++' routines for return to 'R'.
Maintained by Mark Padgham. Last updated 1 month ago.
open0street0map openstreetmap overpass0api osm cpp osm-data overpass-api peer-reviewed cpp
27.6 match 322 stars 14.53 score 2.8k scripts 14 dependents
pharmaverse
admiral: ADaM in R Asset Library
A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).
Maintained by Ben Straub. Last updated 4 days ago.
cdisc clinical-trials open-source
28.4 match 236 stars 13.89 score 486 scripts 4 dependents
stevenmmortimer
salesforcer: An Implementation of 'Salesforce' APIs Using Tidy Principles
Functions connecting to the 'Salesforce' Platform APIs (REST, SOAP, Bulk 1.0, Bulk 2.0, Metadata, Reports and Dashboards) <https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_overview>. "API" is an acronym for "application programming interface". Almost all calls from these APIs are supported, as they use CSV, XML or JSON data that can be parsed into R data structures. For more information, documentation, and examples, see the 'Salesforce' API documentation and this package's website <https://stevenmmortimer.github.io/salesforcer/>.
Maintained by Steven M. Mortimer. Last updated 4 months ago.
api-wrappers r-language r-programming salesforce salesforce-apis
38.1 match 82 stars 9.27 score 191 scripts
ropensci
redland: RDF Library Bindings in R
Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at <https://www.w3.org/TR/rdf-primer/>. This package supports RDF by implementing an R interface to the Redland RDF C library, described at <https://librdf.org/docs/api/index.html>. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.
Maintained by Matthew B. Jones. Last updated 1 year ago.
43.9 match 17 stars 7.85 score 98 scripts 13 dependents
ohdsi
DatabaseConnector: Connecting to Various Database Platforms
An R 'DataBase Interface' ('DBI') compatible interface to various database platforms ('PostgreSQL', 'Oracle', 'Microsoft SQL Server', 'Amazon Redshift', 'Microsoft Parallel Database Warehouse', 'IBM Netezza', 'Apache Impala', 'Google BigQuery', 'Snowflake', 'Spark', 'SQLite', and 'InterSystems IRIS'). Also includes support for fetching data as 'Andromeda' objects. Uses either 'Java Database Connectivity' ('JDBC') or other 'DBI' drivers to connect to databases.
Maintained by Martijn Schuemie. Last updated 2 months ago.
26.3 match 56 stars 12.63 score 772 scripts 11 dependents
miraisolutions
XLConnect: Excel Connector for R
Provides comprehensive functionality to read, write and format Excel data.
Maintained by Martin Studer. Last updated 18 days ago.
cross-platform excel r-language xlconnect openjdk
25.2 match 130 stars 12.28 score 1.2k scripts 1 dependent
rpolars
polars: Lightning-Fast 'DataFrame' Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Soren Welling. Last updated 3 days ago.
25.8 match 499 stars 12.01 score 1.0k scripts 2 dependents
hms-dbmi
UpSetR: A More Scalable Alternative to Venn and Euler Diagrams for Visualizing Intersecting Sets
Creates visualizations of intersecting sets using a novel matrix design, along with visualizations of several common set, element and attribute related tasks (Conway 2017) <doi:10.1093/bioinformatics/btx364>.
Maintained by Jake Conway. Last updated 4 years ago.
gehlenborglab ggplot2 upset upsetr visualization
18.8 match 781 stars 15.33 score 4.8k scripts 42 dependents
microsoft
wpa: Tools for Analysing and Visualising Viva Insights Data
Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.
Maintained by Martin Chan. Last updated 4 months ago.
41.9 match 30 stars 6.69 score 39 scripts 1 dependent
igraph
igraph: Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 12 hours ago.
complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp
12.0 match 582 stars 21.11 score 31k scripts 1.9k dependents
aphalo
photobiology: Photobiological Calculations
Definitions of classes, methods, operators and functions for use in photobiology and radiation meteorology and climatology. Calculation of effective (weighted) and not-weighted irradiances/doses, fluence rates, transmittance, reflectance, absorptance, absorbance and diverse ratios and other derived quantities from spectral data. Local maxima and minima: peaks, valleys and spikes. Conversion between energy- and photon-based units. Wavelength interpolation. Astronomical calculations related to solar angles and day length. Colours and vision. This package is part of the 'r4photobiology' suite, Aphalo, P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.
Maintained by Pedro J. Aphalo. Last updated 3 days ago.
light photobiology quantification r4photobiology-suite radiation spectra sun-position
25.8 match 4 stars 9.35 score 604 scripts 12 dependents
bioc
GenomicDataCommons: NIH / NCI Genomic Data Commons Access
Programmatically access the NIH / NCI Genomic Data Commons RESTful service.
Maintained by Sean Davis. Last updated 1 month ago.
dataimport sequencing api-client bioconductor bioinformatics cancer core-services data-science genomics nci tcga vignette
19.7 match 87 stars 11.94 score 238 scripts 12 dependents
appelmar
gdalcubes: Earth Observation Data Cubes from Satellite Image Collections
Processing collections of Earth observation images as on-demand multispectral, multitemporal raster data cubes. Users define cubes by spatiotemporal extent, resolution, and spatial reference system and let 'gdalcubes' automatically apply cropping, reprojection, and resampling using the 'Geospatial Data Abstraction Library' ('GDAL'). Implemented functions on data cubes include reduction over space and time, applying arithmetic expressions on pixel band values, moving window aggregates over time, filtering by space, time, bands, and predicates on pixel values, exporting data cubes as 'netCDF' or 'GeoTIFF' files, plotting, and extraction from spatial and/or spatiotemporal features. All computational parts are implemented in C++, linking to the 'GDAL', 'netCDF', 'CURL', and 'SQLite' libraries. See Appel and Pebesma (2019) <doi:10.3390/data4030092> for further details.
Maintained by Marius Appel. Last updated 1 year ago.
remote-sensing satellite-imagery spatial-analysis gdal netcdf cpp
27.9 match 124 stars 8.39 score 356 scripts
inbo
inbodb: Connect to and Retrieve Data from Databases on the INBO Server
A bundle of functions to connect to and retrieve data from databases on the INBO server, with dedicated functions to query some of these databases.
Maintained by Els Lommelen. Last updated 25 days ago.
37.4 match 6.16 score 114 scripts 1 dependent
dataoneorg
dataone: R Interface to the DataONE REST API
Provides read and write access to data and metadata from the DataONE network <https://www.dataone.org> of data repositories. Each DataONE repository implements a consistent repository application programming interface. Users call methods in R to access these remote repository functions, such as methods to query the metadata catalog, get access to metadata for particular data packages, and read the data objects from the data repository. Users can also insert and update data objects on repositories that support these methods.
Maintained by Matthew B. Jones. Last updated 3 years ago.
20.2 match 36 stars 9.93 score 472 scripts 3 dependents
flippiecoetser
Query: Write SQL Statements with ease
This package provides a set of utility functions to efficiently write SQL Statements: In essence converting R to SQL.
Maintained by Flippie Coetser. Last updated 1 year ago.
52.0 match 2 stars 3.73 score 179 scripts 1 dependent
r-lib
desc: Manipulate DESCRIPTION Files
Tools to read, write, create, and manipulate DESCRIPTION files. It is intended for packages that create or manipulate other packages.
Maintained by Gábor Csárdi. Last updated 1 month ago.
12.9 match 123 stars 14.68 score 409 scripts 1.1k dependents
cbiit
LDlinkR: Calculating Linkage Disequilibrium (LD) in Human Population Groups of Interest
Provides access to the 'LDlink' API (<https://ldlink.nih.gov/?tab=apiaccess>) using the R console. This programmatic access facilitates researchers who are interested in performing batch queries in 1000 Genomes Project (2015) <doi:10.1038/nature15393> data using 'LDlink'. 'LDlink' is an interactive and powerful suite of web-based tools for querying germline variants in human population groups of interest. For more details, please see Machiela et al. (2015) <doi:10.1093/bioinformatics/btv402>.
Maintained by Timothy A. Myers. Last updated 11 months ago.
ld-calculator ldlink ldlink-api ldlink-webtool linkage-disequilibrium population-genetics
18.8 match 58 stars 9.21 score 206 scripts 1 dependent
r-lib
ps: List, Query, Manipulate System Processes
List, query and manipulate all system processes, on 'Windows', 'Linux' and 'macOS'.
Maintained by Gábor Csárdi. Last updated 16 days ago.
11.3 match 79 stars 15.09 score 108 scripts 1.5k dependents
great-northern-diver
loon: Interactive Statistical Data Visualization
An extendable toolkit for interactive data visualization and exploration.
Maintained by R. Wayne Oldford. Last updated 2 years ago.
data-analysis data-science data-visualization exploratory-analysis exploratory-data-analysis high-dimensional-data interactive-graphics interactive-visualizations loon python statistical-analysis statistical-graphics statistics tcl-extension tk
18.9 match 48 stars 9.00 score 93 scripts 5 dependents
kasperwelbers
corpustools: Managing, Querying and Analyzing Tokenized Text
Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
Maintained by Kasper Welbers. Last updated 6 months ago.
22.5 match 31 stars 7.50 score 174 scripts 1 dependent
integrated-inferences
CausalQueries: Make, Update, and Query Binary Causal Models
Users can declare causal models over binary nodes, update beliefs about causal types given data, and calculate arbitrary queries. Updating is implemented in 'stan'. See Humphreys and Jacobs, 2023, Integrated Inferences (<DOI: 10.1017/9781316718636>) and Pearl, 2009 Causality (<DOI:10.1017/CBO9780511803161>).
Maintained by Till Tietz. Last updated 22 days ago.
bayes causal dags mixedmethods stan cpp
18.2 match 27 stars 9.03 score 54 scripts
civisanalytics
civis: R Client for the 'Civis Platform API'
A convenient interface for making requests directly to the 'Civis Platform API' <https://www.civisanalytics.com/platform/>. Full documentation available 'here' <https://civisanalytics.github.io/civis-r/>.
Maintained by Peter Cooman. Last updated 2 months ago.
20.4 match 16 stars 7.84 score 144 scripts
jessecambon
tidygeocoder: Geocoding Made Easy
An intuitive interface for getting data from geocoding services.
Maintained by Jesse Cambon. Last updated 4 months ago.
13.9 match 287 stars 11.35 score 1.0k scripts 9 dependents
mrcieu
gwasvcf: Tools for Dealing with GWAS Summary Data in VCF Format
Tools for dealing with GWAS summary data in VCF format. Includes reading, querying, writing, as well as helper functions such as LD proxy searches.
Maintained by Gibran Hemani. Last updated 2 years ago.
27.9 match 77 stars 5.65 score 129 scripts 1 dependent
bioc
annotate: Annotation for microarrays
Using R environments for annotation.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
13.6 match 11.41 score 812 scripts 243 dependents
bioc
GenomicDistributions: GenomicDistributions: fast analysis of genomic intervals with Bioconductor
If you have a set of genomic ranges, this package can help you with visualization and comparison. It produces several kinds of plots, for example: Chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, like Transcription Start Sites (TSSs); genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.
Maintained by Kristyna Kupkova. Last updated 5 months ago.
software genomeannotation genomeassembly datarepresentation sequencing coverage functionalgenomics visualization
20.9 match 26 stars 7.44 score 25 scripts
darwin-eu
PatientProfiles: Identify Characteristics of Patients in the OMOP Common Data Model
Identify the characteristics of patients in data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model.
Maintained by Marti Catala. Last updated 9 days ago.
15.3 match 1 star 9.97 score 225 scripts 9 dependents
mrc-ide
orderly2: Orderly Next Generation
Distributed reproducible computing framework, adopting ideas from git, docker and other software. By defining a lightweight interface around the inputs and outputs of an analysis, a lot of the repetitive work for reproducible research can be automated. We define a simple format for organising and describing work that facilitates collaborative reproducible research and acknowledges that all analyses are run multiple times over their lifespans.
Maintained by Rich FitzJohn. Last updated 2 months ago.
18.1 match 8 stars 8.30 score 49 scripts 2 dependents
doi-usgs
sbtools: USGS ScienceBase Tools
Tools for interacting with U.S. Geological Survey ScienceBase <https://www.sciencebase.gov> interfaces. ScienceBase is a data cataloging and collaborative data management platform. Functions are included for querying ScienceBase and for creating and fetching datasets.
Maintained by David Blodgett. Last updated 10 months ago.
18.7 match 21 stars 7.94 score 127 scripts 2 dependents
bioc
OncoScore: A tool to identify potentially oncogenic genes
OncoScore is a tool to measure the association of genes to cancer based on citation frequencies in biomedical literature. The score is evaluated from PubMed literature by dynamically updatable web queries.
Maintained by Luca De Sano. Last updated 5 months ago.
24.1 match 5 stars 6.15 score 2 scripts
bioc
loci2path: Loci2path: regulatory annotation of genomic intervals based on tissue-specific expression QTLs
loci2path performs statistically rigorous enrichment analysis of eQTLs in genomic regions of interest, using eQTL collections provided by the Genotype-Tissue Expression (GTEx) project and pathway collections from MSigDB.
Maintained by Tianlei Xu. Last updated 5 months ago.
functionalgenomics genetics genesetenrichment software geneexpression sequencing coverage biocarta
34.3 match 1 star 4.30 score 2 scripts
polmine
polmineR: Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Maintained by Andreas Blaette. Last updated 1 year ago.
18.3 match 49 stars 7.96 score 311 scripts
rblp
Rblpapi: R Interface to 'Bloomberg'
An R Interface to 'Bloomberg' is provided via the 'Blp API'.
Maintained by Dirk Eddelbuettel. Last updated 4 days ago.
15.4 match 169 stars 9.43 score 115 scripts
mgondan
rolog: Query 'SWI'-'Prolog' from R
This R package connects to SWI-Prolog, <https://www.swi-prolog.org/>, so that R can send deterministic and non-deterministic queries to Prolog (consult, query/submit, once, findall).
Maintained by Matthias Gondan. Last updated 6 days ago.
22.8 match 4 stars 6.37 score 10 scripts 1 dependent
ropensci
biomartr: Genomic Data Retrieval
Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.
Maintained by Hajk-Georg Drost. Last updated 1 month ago.
biomart genomic-data-retrieval annotation-retrieval database-retrieval ncbi ensembl biological-data-retrieval ensembl-servers genome genome-annotation genome-retrieval genomics meta-analysis metagenomics ncbi-genbank peer-reviewed proteome sequenced-genomes
12.7 match 218 stars 11.35 score 129 scripts 3 dependents
luukvdmeer
sfnetworks: Tidy Geospatial Networks
Provides a tidy approach to spatial network analysis, in the form of classes and functions that enable a seamless interaction between the network analysis package 'tidygraph' and the spatial analysis package 'sf'.
Maintained by Lucas van der Meer. Last updated 3 months ago.
geospatial-networks network-analysis rspatial simple-features spatial-analysis spatial-data-science spatial-networks tidygraph tidyverse
14.8 match 372 stars 9.63 score 332 scripts 6 dependents
gagolews
stringi: Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 1 month ago.
icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode cpp
7.8 match 309 stars 18.31 score 10k scripts 8.6k dependents
microsoft
vivainsights: Analyze and Visualize Data from 'Microsoft Viva Insights'
Provides a versatile range of functions, including exploratory data analysis, time-series analysis, organizational network analysis, and data validation, while implementing a set of best practices for analyzing and visualizing data specific to 'Microsoft Viva Insights'.
Maintained by Martin Chan. Last updated 24 days ago.
23.0 match 11 stars 6.12 score 68 scripts
ncss-tech
soilDB: Soil Database Interface
A collection of functions for reading soil data from U.S. Department of Agriculture Natural Resources Conservation Service (USDA-NRCS) and National Cooperative Soil Survey (NCSS) databases.
Maintained by Andrew Brown. Last updated 7 days ago.
kssl nasis nrcs soil soil-data-access soil-survey soilweb sql usda
12.3 match 87 stars 11.34 score 1.0k scripts 1 dependent
apache
apache.sedona: R Interface for Apache Sedona
R interface for 'Apache Sedona' based on 'sparklyr' (<https://sedona.apache.org>).
Maintained by Apache Sedona. Last updated 15 hours ago.
cluster-computing geospatial java python scala spatial-analysis spatial-query spatial-sql
13.0 match 2.0k stars 10.72 score 105 scripts
winvector
rquery: Relational Query Generator for Data Manipulation at Scale
A piped query generator based on Edgar F. Codd's relational algebra, and on production experience using 'SQL' and 'dplyr' at big data scale. The design represents an attempt to make 'SQL' more teachable by denoting composition by a sequential pipeline notation instead of nested queries or functions. The implementation delivers reliable high performance data processing on large data systems such as 'Spark', databases, and 'data.table'. Package features include: data processing trees or pipelines as observable objects (able to report both columns produced and columns used), optimized 'SQL' generation as an explicit user visible table modeling step, plus explicit query reasoning and checking.
Maintained by John Mount. Last updated 2 years ago.
14.4 match 110 stars 9.53 score 126 scripts 3 dependents
dwulff
text2sdg: Detecting UN Sustainable Development Goals in Text
The United Nations’ Sustainable Development Goals (SDGs) have become an important guideline for organisations to monitor and plan their contributions to social, economic, and environmental transformations. The 'text2sdg' package is an open-source analysis package that identifies SDGs in text using scientifically developed query systems, opening up the opportunity to monitor any type of text-based data, such as scientific output or corporate publications. For more information regarding the methodology see Meier, Mata & Wulff (2022) <arXiv:2110.05856>.
Maintained by Dominik S. Meier. Last updated 6 months ago.
natural-language-processing sustainability sustainable-development sustainable-development-goals
22.0 match 18 stars 6.13 score 9 scripts
phippsy
brandwatchR: 'Brandwatch' API to R
Interact with the 'Brandwatch' API <https://developers.brandwatch.com/docs>. Allows you to authenticate to the API and obtain data for projects, queries, query groups, tags and categories. Also allows you to directly obtain mentions and aggregate data for a specified query or query group.
Maintained by Donal Phipps. Last updated 7 years ago.
31.9 match 11 stars 4.16 score 26 scripts
usepa
tcpl: ToxCast Data Analysis Pipeline
The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.
Maintained by Jason Brown. Last updated 3 days ago.
13.7 match 36 stars 9.41 score 90 scripts
rspatial
terra: Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 1 day ago.
geospatial raster spatial vector onetbb proj gdal geos cpp
7.0 match 559 stars 17.64 score 17k scripts 851 dependents
mountainmath
cancensus: Access, Retrieve, and Work with Canadian Census Data and Geography
Integrated, convenient, and uniform access to Canadian Census data and geography retrieved using the 'CensusMapper' API. This package produces analysis-ready tidy data frames and spatial data in multiple formats, as well as convenience functions for working with Census variables, variable hierarchies, and region selection. API keys are freely available with free registration at <https://censusmapper.ca/api>. Census data and boundary geometries are reproduced and distributed on an "as is" basis with the permission of Statistics Canada (Statistics Canada 2001; 2006; 2011; 2016; 2021).
Maintained by Dmitry Shkolnik. Last updated 1 year ago.
14.0 match 82 stars 8.80 score 414 scripts
atlasoflivingaustralia
galah: Biodiversity Data from the GBIF Node Network
The Global Biodiversity Information Facility ('GBIF', <https://www.gbif.org>) sources data from an international network of data providers, known as 'nodes'. Several of these nodes - the "living atlases" (<https://living-atlases.gbif.org>) - maintain their own web services using software originally developed by the Atlas of Living Australia ('ALA', <https://www.ala.org.au>). 'galah' enables the R community to directly access data and resources hosted by 'GBIF' and its partner nodes.
Maintained by Martin Westgate. Last updated 1 month ago.
13.4 match 43 stars 9.17 score 275 scripts 1 dependent
idigbio
ridigbio: Interface to the iDigBio Data API
An interface to iDigBio's search API that allows downloading specimen records. Searches are returned as a data.frame. Other functions such as the metadata end points return lists of information. iDigBio is a US project focused on digitizing and serving museum specimen collections on the web. See <https://www.idigbio.org> for information on iDigBio.
Maintained by Jesse Bennett. Last updated 5 days ago.
12.0 match 16 stars 10.23 score 63 scripts 7 dependents
ianmcook
tidyquery: Query 'R' Data Frames with 'SQL'
Use 'SQL' 'SELECT' statements to query 'R' data frames.
Maintained by Ian Cook. Last updated 2 years ago.
20.1 match 168 stars 5.95 score 35 scripts
thomasp85
tidygraph: A Tidy API for Graph Manipulation
A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, as well as provides tidy interfaces to a lot of common graph algorithms.
Maintained by Thomas Lin Pedersen. Last updated 1 month ago.
graph-algorithms graph-manipulation igraph network-analysis tidyverse cpp
8.0 match 553 stars 14.74 score 4.6k scripts 136 dependents
bioc
TCGAbiolinks: TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aims of TCGAbiolinks are to: i) facilitate GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses, and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 27 days ago.
dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network sequencing survival software bioc bioconductor gdc integrative-analysis tcga tcga-data tcgabiolinks
8.0 match 305 stars 14.45 score 1.6k scripts 6 dependents
azure
AzureKusto: Interface to 'Kusto'/'Azure Data Explorer'
An interface to 'Azure Data Explorer', also known as 'Kusto', a fast, distributed data exploration service from Microsoft: <https://azure.microsoft.com/en-us/products/data-explorer/>. Includes 'DBI' and 'dplyr' interfaces, with the latter modelled after the 'dbplyr' package, whereby queries are translated from R into the native 'KQL' query language and executed lazily. On the admin side, the package extends the object framework provided by 'AzureRMR' to support creation and deletion of databases, and management of database principals. Part of the 'AzureR' family of packages.
Maintained by Alex Kyllo. Last updated 1 year ago.
azure azure-data-explorer azure-sdk-r big-data-analytics kusto
22.0 match 18 stars 5.19 score 9 scripts
rdatatable
data.table: Extension of `data.frame`
Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
Maintained by Tyson Barrett. Last updated 2 days ago.
4.8 match 3.7k stars 23.53 score 230k scripts 4.6k dependents
jlmelville
rnndescent:Nearest Neighbor Descent Method for Approximate Nearest Neighbors
The Nearest Neighbor Descent method for finding approximate nearest neighbors by Dong and co-workers (2010) <doi:10.1145/1963405.1963487>. Based on the 'Python' package 'PyNNDescent' <https://github.com/lmcinnes/pynndescent>.
Maintained by James Melville. Last updated 8 months ago.
approximate-nearest-neighbor-searchcpp
15.4 match 11 stars 7.31 score 75 scripts
r-lib
pkgcache:Cache 'CRAN'-Like Metadata and R Packages
Metadata and package cache for CRAN-like repositories. This is a utility package to be used by package management tools that want to take advantage of caching.
Maintained by Gábor Csárdi. Last updated 17 days ago.
12.6 match 28 stars 8.90 score 31 scripts 7 dependents
bioc
scDiagnostics:Cell type annotation diagnostics
The scDiagnostics package provides diagnostic plots to assess the quality of cell type assignments from single cell gene expression profiles. The implemented functionality allows users to assess the reliability of cell type annotations, investigate gene expression patterns, and explore relationships between different cell types in query and reference datasets, so that potential misalignments between reference and query datasets can be detected. The package also provides visualization capabilities for diagnostic purposes.
Maintained by Anthony Christidis. Last updated 5 months ago.
annotationclassificationclusteringgeneexpressionrnaseqsinglecellsoftwaretranscriptomics
14.0 match 8 stars 7.77 score 46 scripts
bioc
snapcount:R/Bioconductor Package for interfacing with Snaptron for rapid querying of expression counts
snapcount is a client interface to the Snaptron webservices which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).
Maintained by Rone Charles. Last updated 5 months ago.
coveragegeneexpressionrnaseqsequencingsoftwaredataimport
20.4 match 3 stars 5.19 score 13 scripts
groditi
blsR:Make Requests from the Bureau of Labor Statistics API
Implements v2 of the B.L.S. API for requests of survey information and time series data through a 3-tiered API that allows users to interact with the raw API directly, create queries through a functional interface, and re-shape the data structures returned to fit common uses. The API definition is located at: <https://www.bls.gov/developers/api_signature_v2.htm>.
Maintained by Guillermo Roditi Dominguez. Last updated 1 year ago.
23.4 match 14 stars 4.45 score 40 scripts
tidyverse
dbplyr:A 'dplyr' Back End for Databases
A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.
Maintained by Hadley Wickham. Last updated 3 months ago.
5.3 match 481 stars 19.72 score 5.2k scripts 736 dependents
cjbarrie
academictwitteR:Access the Twitter Academic Research Product Track V2 API Endpoint
Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side.
Maintained by Christopher Barrie. Last updated 2 years ago.
11.4 match 275 stars 8.94 score 177 scripts
shichenxie
pedquant:Public Economic Data and Quantitative Analysis
Provides an interface to access public economic and financial data for economic research and quantitative analysis. The data sources include NBS, FRED, Sina, Eastmoney, etc. It also provides quantitative functions for trading strategies based on 'data.table', 'TTR', 'PerformanceAnalytics' and other packages.
Maintained by Shichen Xie. Last updated 2 days ago.
17.3 match 59 stars 5.70 score 34 scripts
skoval
RISmed:Download Content from NCBI Databases
A set of tools to extract bibliographic content from the National Center for Biotechnology Information (NCBI) databases, including PubMed. The name RISmed is a portmanteau of RIS (for Research Information Systems, a common tag format for bibliographic data) and PubMed.
Maintained by Stephanie Kovalchik. Last updated 3 years ago.
13.9 match 38 stars 6.94 score 252 scripts 3 dependents
r-lib
pak:Another Approach to Package Installation
The goal of 'pak' is to make package installation faster and more reliable. In particular, it performs all HTTP operations in parallel, so metadata resolution and package downloads are fast. Metadata and package files are cached on the local disk as well. 'pak' has a dependency solver, so it finds version conflicts before performing the installation. This version of 'pak' supports CRAN, 'Bioconductor' and 'GitHub' packages as well.
Maintained by Gábor Csárdi. Last updated 14 hours ago.
7.3 match 717 stars 13.05 score 277 scripts 17 dependents
r-dbi
RSQLite:SQLite Interface for R
Embeds the SQLite database engine in R and provides an interface compliant with the DBI package. The source for the SQLite engine and for various extensions in a recent version is included. System libraries will never be consulted because this package relies on static linking for the plugins it includes; this also ensures a consistent experience across all installations.
Maintained by Kirill Müller. Last updated 25 days ago.
4.8 match 327 stars 18.73 score 8.1k scripts 1.1k dependents
datawookie
emayili:Send Email Messages
A light, simple tool for sending emails with minimal dependencies.
Maintained by Andrew B. Collier. Last updated 1 month ago.
9.3 match 180 stars 9.59 score 95 scripts 3 dependents
mrcieu
ieugwasr:Interface to the 'OpenGWAS' Database API
Interface to the 'OpenGWAS' database API <https://api.opengwas.io/api/>. Includes a wrapper to make generic calls to the API, plus convenience functions for specific queries.
Maintained by Gibran Hemani. Last updated 3 days ago.
8.2 match 89 stars 10.71 score 404 scripts 6 dependents
bruigtp
REDCapDM:'REDCap' Data Management
REDCap Data Management - REDCapDM is an R package that allows users to manage data exported directly from REDCap or obtained through an API connection. This package includes several functions designed for pre-processing data, generating reports of queries such as outliers or missing values, and following up on the identified queries. 'REDCap' (Research Electronic Data CAPture; <https://projectredcap.org>) is a web application developed at Vanderbilt University, designed for creating and managing online surveys and databases. The REDCap API is an interface that allows external applications to connect to REDCap remotely, and is used to programmatically retrieve or modify project data or settings within REDCap, such as importing or exporting data.
Maintained by João Carmezim. Last updated 2 days ago.
14.6 match 4 stars 5.89 score 9 scripts
bioc
recountmethylation:Access and analyze public DNA methylation array data compilations
Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.
Maintained by Sean K Maden. Last updated 5 months ago.
dnamethylationepigeneticsmicroarraymethylationarrayexperimenthub
13.6 match 9 stars 6.28 score 9 scripts
ropensci
rnassqs:Access Data from the NASS 'Quick Stats' API
Interface to access data via the United States Department of Agriculture's National Agricultural Statistical Service (NASS) 'Quick Stats' web API <https://quickstats.nass.usda.gov/api/>. Convenience functions facilitate building queries based on available parameters and valid parameter values. This product uses the NASS API but is not endorsed or certified by NASS.
Maintained by Nicholas Potter. Last updated 7 months ago.
11.3 match 47 stars 7.49 score 63 scripts 1 dependent
dami82
easyPubMed:Search and Retrieve Scientific Publication Records from PubMed
Query NCBI Entrez and retrieve PubMed records in XML or text format. Process PubMed records by extracting and aggregating data from selected fields. A large number of records can be easily downloaded via this simple-to-use interface to the NCBI PubMed API.
Maintained by Damiano Fantini. Last updated 1 year ago.
10.8 match 21 stars 7.83 score 178 scripts 4 dependents
simonpcouch
anyflights:Query 'nycflights13'-Like Air Travel Data for Given Years and Airports
Supplies a set of functions to query air travel data for user-specified years and airports. Datasets include on-time flights, airlines, airports, planes, and weather.
Maintained by Simon P. Couch. Last updated 2 months ago.
14.1 match 49 stars 5.90 score 23 scripts
davisvaughan
treesitter:Bindings to 'Tree-Sitter'
Provides bindings to 'Tree-sitter', an incremental parsing system for programming tools. 'Tree-sitter' builds concrete syntax trees for source files of any language, and can efficiently update those syntax trees as the source file is edited. It also includes a robust error recovery system that provides useful parse results even in the presence of syntax errors.
Maintained by Davis Vaughan. Last updated 6 months ago.
12.5 match 37 stars 6.62 score 18 scripts 2 dependents
bcgov
bcdata:Search and Retrieve Data from the BC Data Catalogue
Search, query, and download tabular and 'geospatial' data from the British Columbia Data Catalogue (<https://catalogue.data.gov.bc.ca/>). Search catalogue data records based on keywords, data licence, sector, data format, and B.C. government organization. View metadata directly in R, download many data formats, and query 'geospatial' data available via the B.C. government Web Feature Service ('WFS') using 'dplyr' syntax.
Maintained by Andy Teucher. Last updated 1 month ago.
8.0 match 83 stars 10.29 score 186 scripts 4 dependents
ropensci
rdflib:Tools to Manipulate and Query Semantic Data
The Resource Description Framework, or 'RDF' is a widely used data representation model that forms the cornerstone of the Semantic Web. 'RDF' represents data as a graph rather than the familiar data table or rectangle of relational databases. The 'rdflib' package provides a friendly and concise user interface for performing common tasks on 'RDF' data, such as reading, writing and converting between the various serializations of 'RDF' data, including 'rdfxml', 'turtle', 'nquads', 'ntriples', and 'json-ld'; creating new 'RDF' graphs, and performing graph queries using 'SPARQL'. This package wraps the low level 'redland' R package which provides direct bindings to the 'redland' C library. Additionally, the package supports the newer and more developer friendly 'JSON-LD' format through the 'jsonld' package. The package interface takes inspiration from the Python 'rdflib' library.
Maintained by Carl Boettiger. Last updated 7 months ago.
8.5 match 57 stars 9.59 score 123 scripts 7 dependents
owp-spatial
hfsubsetR:Hydrofabric Subsetter
Subset Hydrofabric Data in R.
Maintained by Mike Johnson. Last updated 24 days ago.
geospatialhydrofabricnextgennoaa-owpsubsetting
20.1 match 7 stars 4.02 score 8 scripts
bioc
ensembldb:Utilities to create and use Ensembl-based annotation databases
The package provides functions to create and use transcript-centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries such as genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.
Maintained by Johannes Rainer. Last updated 5 months ago.
geneticsannotationdatasequencingcoverageannotationbioconductorbioconductor-packagesensembl
5.6 match 35 stars 14.08 score 892 scripts 108 dependents
knausb
vcfR:Manipulate and Visualize VCF Data
Facilitates easy manipulation of variant call format (VCF) data. Functions are provided to rapidly read from and write to VCF files. Once VCF data is read into R a parser function extracts matrices of data. This information can then be used for quality control or other purposes. Additional functions provide visualization of genomic data. Once processing is complete data may be written to a VCF file (*.vcf.gz). It also may be converted into other popular R objects (e.g., genlight, DNAbin). VcfR provides a link between VCF data and familiar R software.
Maintained by Brian J. Knaus. Last updated 23 days ago.
genomicspopulation-geneticspopulation-genomicsrcppvcf-datavisualizationzlibcpp
5.8 match 254 stars 13.59 score 3.1k scripts 19 dependents
r-lib
fs:Cross-Platform File System Operations Based on 'libuv'
A cross-platform interface to file system operations, built on top of the 'libuv' C library.
Maintained by Gábor Csárdi. Last updated 4 months ago.
3.9 match 370 stars 20.26 score 8.1k scripts 5.2k dependents
dyfanjones
noctua:Connect to 'AWS Athena' using R 'AWS SDK' 'paws' ('DBI' Interface)
Designed to be compatible with the 'R' package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this the 'R' 'AWS' Software Development Kit ('SDK') 'paws' <https://github.com/paws-r/paws> is used as a driver.
Maintained by Dyfan Jones. Last updated 11 months ago.
10.4 match 46 stars 7.48 score 58 scripts
voisinneg
queryup:Query the 'UniProtKB' REST API
Retrieve protein information from the 'UniProtKB' REST API (see <https://www.uniprot.org/help/api_queries>).
Maintained by Guillaume Voisinne. Last updated 2 years ago.
18.0 match 4 stars 4.30 score 7 scripts
rstudio
shiny:Web Application Framework for R
Makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive prebuilt widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.
Maintained by Winston Chang. Last updated 13 days ago.
reactiverstudioshinyweb-appweb-development
3.5 match 5.4k stars 21.28 score 108k scripts 1.8k dependents
ropensci
webchem:Chemical Information from the Web
Chemical information from around the web. This package interacts with a suite of web services for chemical information. Sources include: Alan Wood's Compendium of Pesticide Common Names, Chemical Identifier Resolver, ChEBI, Chemical Translation Service, ChemSpider, ETOX, Flavornet, NIST Chemistry WebBook, OPSIN, PubChem, SRS, Wikidata.
Maintained by Tamás Stirling. Last updated 3 months ago.
cas-numberchemical-informationchemspideridentifierropensciwebscraping
7.3 match 165 stars 10.31 score 173 scripts 10 dependents
ropensci
comtradr:Interface with the United Nations Comtrade API
Interface with and extract data from the United Nations 'Comtrade' API <https://comtradeplus.un.org/>. 'Comtrade' provides country level shipping data for a variety of commodities; these functions allow for easy API queries and return data as a tidy data frame.
Maintained by Paul Bochtler. Last updated 4 months ago.
apicomtradepeer-reviewedsupply-chain
8.6 match 66 stars 8.67 score 70 scripts
dyfanjones
RAthena:Connect to 'AWS Athena' using 'Boto3' ('DBI' Interface)
Designed to be compatible with the R package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this 'Python' 'Boto3' Software Development Kit ('SDK') <https://boto3.amazonaws.com/v1/documentation/api/latest/index.html> is used as a driver.
Maintained by Dyfan Jones. Last updated 1 year ago.
10.4 match 37 stars 7.10 score 38 scripts
ropensci
opencage:Geocode with the OpenCage API
Geocode with the OpenCage API, either from place name to longitude and latitude (forward geocoding) or from longitude and latitude to the name and address of a location (reverse geocoding), see <https://opencagedata.com>.
Maintained by Daniel Possenriede. Last updated 2 months ago.
geocodegeocoderopencageopencage-apiopencage-geocoderpeer-reviewedplacenamesrspatial
8.8 match 87 stars 8.39 score 79 scripts
r-lib
gitcreds:Query 'git' Credentials from 'R'
Query, set, delete credentials from the 'git' credential store. Manage 'GitHub' tokens and other 'git' credentials. This package is to be used by other packages that need to authenticate to 'GitHub' and/or other 'git' repositories.
Maintained by Gábor Csárdi. Last updated 7 months ago.
credentialscredentials-helpergitgithub
5.5 match 28 stars 13.28 score 372 scripts 405 dependents
darwin-eu
CDMConnector:Connect to an OMOP Common Data Model
Provides tools for working with observational health data in the Observational Medical Outcomes Partnership (OMOP) Common Data Model format with a pipe friendly syntax. Common data model database table references are stored in a single compound object along with metadata.
Maintained by Adam Black. Last updated 18 days ago.
6.4 match 12 stars 11.39 score 502 scripts 12 dependents
rfhb
ctrdata:Retrieve and Analyze Clinical Trials Data from Public Registers
A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', <https://www.clinicaltrialsregister.eu/>), 'ClinicalTrials.gov' (<https://clinicaltrials.gov/> and also translating queries from the retired classic interface), the 'ISRCTN' (<http://www.isrctn.com/>) and the 'European Union Clinical Trials Information System' ('CTIS', <https://euclinicaltrials.eu/>). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Protocols, statistical analysis plans, informed consent sheets and other documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for monitoring, meta- and trend-analysis of the design and conduct as well as of the results of clinical trials across registers.
Maintained by Ralf Herold. Last updated 10 hours ago.
clinical-dataclinical-researchclinical-studiesclinical-trialsctgovdatabaseduckdbmongodbnodbipostgresqlregistersqlitestudiestrial
9.1 match 45 stars 7.92 score 32 scripts
bioc
regutools:regutools: an R package for data extraction from RegulonDB
RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.
Maintained by Joselyn Chavez. Last updated 3 months ago.
generegulationgeneexpressionsystemsbiologynetworknetworkinferencevisualizationtranscriptionbioconductorcdsbregulondb
13.8 match 4 stars 5.20 score 6 scripts
lentinj
mfdb:MareFrame DB Querying Library
Creates and manages a PostgreSQL database suitable for storing fisheries data and aggregating ready for use within a Gadget <https://gadget-framework.github.io/gadget2/> model. See <https://mareframe.github.io/mfdb/> for more information.
Maintained by Jamie Lentin. Last updated 3 years ago.
15.0 match 4.76 score 231 scripts
muschellij2
rscopus:Scopus Database 'API' Interface
Uses Elsevier 'Scopus' API <https://dev.elsevier.com/sc_apis.html> to download information about authors and their citations.
Maintained by John Muschelli. Last updated 1 year ago.
7.6 match 77 stars 9.33 score 124 scripts 3 dependents
ropensci
sofa:Connector to 'CouchDB'
Provides an interface to the 'NoSQL' database 'CouchDB' (<http://couchdb.apache.org>). Methods are provided for managing databases within 'CouchDB', including creating/deleting/updating/transferring, and managing documents within databases. One can connect with a local 'CouchDB' instance, or a remote 'CouchDB' database such as 'Cloudant'. Documents can be inserted directly from vectors, lists, data.frames, and 'JSON'. Targeted at 'CouchDB' v2 or greater.
Maintained by Yaoxiang Li. Last updated 1 month ago.
couchdbdatabasenosqldocumentscloudantcouchdb-client
9.3 match 33 stars 7.51 score 54 scripts
ropensci
rredlist:'IUCN' Red List Client
'IUCN' Red List (<https://api.iucnredlist.org/>) client. The 'IUCN' Red List is a global list of threatened and endangered species. Functions cover all of the Red List 'API' routes. An 'API' key is required.
Maintained by William Gearty. Last updated 1 month ago.
iucnbiodiversityapiweb-servicestraitshabitatspeciesconservationapi-wrapperiucn-red-listtaxize
6.1 match 53 stars 11.49 score 195 scripts 24 dependents
bioc
CuratedAtlasQueryR:Queries the Human Cell Atlas
Provides access to a copy of the Human Cell Atlas, but with harmonised metadata. This allows for uniform querying across numerous datasets within the Atlas using common fields such as cell type, tissue type, and patient ethnicity. Usage involves first querying the metadata table for cells of interest, and then downloading the corresponding cells into a SingleCellExperiment object.
Maintained by Stefano Mangiola. Last updated 5 months ago.
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsdatabaseduckdbhdf5human-cell-atlassingle-cellsinglecellexperimenttidyverse
9.7 match 90 stars 7.04 score 41 scripts
ropensci
rmangal:'Mangal' Client
An interface to the 'Mangal' database - a collection of ecological networks. This package includes functions to work with the 'Mangal RESTful API' methods (<https://mangal-interactions.github.io/mangal-api/>).
Maintained by Kevin Cazelles. Last updated 1 year ago.
ecologynetworksfood websinteractionsdata publicationsopen access
13.5 match 14 stars 5.07 score 28 scripts
tiledb-inc
tiledbcloud:TileDB Cloud Platform R Client Package
The TileDB Cloud Platform API Client Package offers access to the TileDB Cloud service.
Maintained by John Kerl. Last updated 8 months ago.
13.1 match 1 star 5.22 score 92 scripts
plotly
plotly:Create Interactive Web Graphics via 'plotly.js'
Create interactive web graphics from 'ggplot2' graphs and/or a custom interface to the (MIT-licensed) JavaScript library 'plotly.js' inspired by the grammar of graphics.
Maintained by Carson Sievert. Last updated 3 months ago.
d3jsdata-visualizationggplot2javascriptplotlyshinywebgl
3.5 match 2.6k stars 19.43 score 93k scripts 797 dependents
bioc
multiMiR:Integration of multiple microRNA-target databases with their disease and drug associations
A collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).
Maintained by Spencer Mahaffey. Last updated 5 months ago.
mirnadatahomo_sapiens_datamus_musculus_datarattus_norvegicus_dataorganismdatamicrorna-sequencesql
8.0 match 20 stars 8.45 score 141 scripts
bioc
ChemmineR:Cheminformatics Toolkit for R
ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.
Maintained by Thomas Girke. Last updated 5 months ago.
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsmicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportclusteringproteomicsmetabolomicscpp
7.2 match 14 stars 9.42 score 253 scripts 12 dependents
ironholds
urltools:Vectorised Tools for URL Handling and Parsing
A toolkit for all URL-handling needs, including encoding and decoding, parsing, parameter extraction and modification. All functions are designed to be both fast and entirely vectorised. It is intended to be useful for people dealing with web-related datasets, such as server-side logs, although may be useful for other situations involving large sets of URLs.
Maintained by Os Keyes. Last updated 4 years ago.
5.0 match 131 stars 13.43 score 968 scripts 264 dependents
spectra-to-knowledge
SpectraToQueries:Spectra to queries
SpectraToQueries provides the infrastructure to translate spectra to queries.
Maintained by Adriano Rutz. Last updated 20 days ago.
knowledge extractionspectral informationquerying system
22.1 match 1 star 3.02 score
ropensci
nasapower:NASA POWER API Client
An API client for NASA POWER global meteorology, surface solar energy and climatology data API. POWER (Prediction Of Worldwide Energy Resources) data are freely available for download with varying spatial resolutions dependent on the original data and with several temporal resolutions depending on the POWER parameter and community. This work is funded through the NASA Earth Science Directorate Applied Science Program. For more on the data themselves, the methodologies used in creating them, a web-based data viewer and web access, please see <https://power.larc.nasa.gov/>.
Maintained by Adam H. Sparks. Last updated 11 days ago.
nasameteorological-dataweatherglobalweather-datameteorologynasa-poweragroclimatologyearth-sciencedata-accessclimate-dataagroclimatology-dataweather-variables
6.6 match 101 stars 9.98 score 137 scripts 3 dependents
ebeneditos
telegram.bot:Develop a 'Telegram Bot' with R
Provides a pure interface for the 'Telegram Bot API' <http://core.telegram.org/bots/api>. In addition to the pure API implementation, it features a number of tools to make the development of 'Telegram' bots with R easy and straightforward, providing an easy-to-use interface that takes some work off the programmer.
Maintained by Ernest Benedito. Last updated 3 years ago.
7.6 match 109 stars 8.57 score 126 scripts 1 dependent
cnathe
Rlabkey:Data Exchange Between R and 'LabKey' Server
The 'LabKey' client library for R makes it easy for R users to load live data from a 'LabKey' Server, <https://www.labkey.com/>, into the R environment for analysis, provided users have permissions to read the data. It also enables R users to insert, update, and delete records stored on a 'LabKey' Server, provided they have appropriate permissions to do so.
Maintained by Cory Nathe. Last updated 3 days ago.
15.3 match 4.25 score 388 scripts 1 dependent
billpetti
baseballr:Acquiring and Analyzing Baseball Data
Provides numerous utilities for acquiring and analyzing baseball data from online sources such as 'Baseball Reference' <https://www.baseball-reference.com/>, 'FanGraphs' <https://www.fangraphs.com/>, and the 'MLB Stats' API <https://www.mlb.com/>.
Maintained by Saiem Gilani. Last updated 4 months ago.
baseballpitchfxsabermetricsstatcast
7.2 match 380 stars 8.98 score 582 scripts
alexpate30
rcprd:Extraction and Management of Clinical Practice Research Datalink Data
Simplify the process of extracting and processing Clinical Practice Research Datalink (CPRD) data in order to build datasets ready for statistical analysis. This process is difficult in 'R', as the raw data is very large and cannot be read into the R workspace. 'rcprd' utilises 'RSQLite' to create 'SQLite' databases which are stored on the hard disk. These are then queried to extract the required information for a cohort of interest, and create datasets ready for statistical analysis. The processes follow closely that from the 'rEHR' package, see Springate et al., (2017) <doi:10.1371/journal.pone.0171784>.
Maintained by Alexander Pate. Last updated 19 days ago.
11.6 match 2 stars 5.48 score 5 scripts
cran
haploR:Query 'HaploReg', 'RegulomeDB'
A set of utilities for querying the 'HaploReg' <https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php> and 'RegulomeDB' <https://www.regulomedb.org/regulome-search/> web-based tools. The package connects to 'HaploReg' and 'RegulomeDB', runs searches and downloads results directly from the R environment, without opening web pages. Results are stored in a data frame that can be directly used in various kinds of downstream analyses.
Maintained by Ilya Y. Zhbannikov. Last updated 1 year ago.
19.5 match 1 star 3.24 score
roux-ohdsi
allofus:Interface for 'All of Us' Researcher Workbench
Streamline use of the 'All of Us' Researcher Workbench (<https://www.researchallofus.org/data-tools/workbench/>) with tools to extract and manipulate data from the 'All of Us' database. Increase interoperability with the Observational Health Data Science and Informatics ('OHDSI') tool stack by decreasing reliance on 'All of Us' tools and allowing for cohort creation via 'Atlas'. Improve reproducible and transparent research using 'All of Us'.
Maintained by Rob Cavanaugh. Last updated 4 months ago.
8.8 match 16 stars 7.19 score 30 scripts
rapporter
pander:An R 'Pandoc' Writer
Contains functions that catch all messages, 'stdout' and other useful information while evaluating R code, and other helpers that return user-specified text elements (such as headers, paragraphs, tables, images and lists) in 'pandoc' markdown or automatically transform several types of R objects to markdown format. Also capable of exporting/converting (the resulting) complex 'pandoc' documents to e.g. HTML, 'PDF', 'docx' or 'odt'. This latter reporting feature is supported in brew syntax or with a custom reference class with a smarty caching 'backend'.
Maintained by Gergely Daróczi. Last updated 16 days ago.
literate-programmingmarkdownpandocpandoc-markdownreproducible-researchrmarkdowncpp
3.8 match 297 stars 16.60 score 7.6k scripts 108 dependents
satijalab
Seurat:Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 year ago.
human-cell-atlassingle-cell-genomicssingle-cell-rna-seqcpp
3.7 match 2.4k stars 16.86 score 50k scripts 73 dependents
melff
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 11 days ago.
5.0 match 46 stars 12.34 score 1.2k scripts 13 dependents
ianmcook
queryparser:Translate 'SQL' Queries into 'R' Expressions
Translate 'SQL' 'SELECT' statements into lists of 'R' expressions.
Maintained by Ian Cook. Last updated 2 years ago.
12.3 match 54 stars 5.02 score 13 scripts 1 dependents
josesamos
rolap:Obtaining Star Databases from Flat Tables
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a ROLAP (Relational On-Line Analytical Processing) star database. The main objective of the package is to allow the definition of these transformations easily. The implementation of the multidimensional database obtained can be exported to work with multidimensional analysis tools on spreadsheets or relational databases.
Maintained by Jose Samos. Last updated 1 year ago.
10.0 match 5 stars 6.12 score 25 scripts 1 dependents
mtmorgan
rjsoncons:Query, Pivot, Patch, and Validate 'JSON' and 'NDJSON'
Functions to query (filter or transform), pivot (convert from array-of-objects to object-of-arrays, for easy import as 'R' data frame), search, patch (edit), and validate (against 'JSON Schema') 'JSON' and 'NDJSON' strings, files, or URLs. Query and pivot support 'JSONpointer', 'JSONpath' or 'JMESpath' expressions. The implementation uses the 'jsoncons' <https://danielaparker.github.io/jsoncons/> header-only library; the library is easily linked to other packages for direct access to 'C++' functionality not implemented here.
Maintained by Martin Morgan. Last updated 6 months ago.
8.6 match 9 stars 7.08 score 8 scripts 9 dependents
josesamos
geomultistar:Multidimensional Queries Enriched with Geographic Data
Multidimensional systems allow complex queries to be carried out in an easy way. The geographical dimension, together with the temporal dimension, plays a fundamental role in multidimensional systems. Through this package, vector geographic data layers can be associated to the attributes of geographic dimensions, so that the results of multidimensional queries can be obtained directly as vector layers. The multidimensional structures on which we can define the queries can be created from a flat table or imported directly using functions from this package.
Maintained by Jose Samos. Last updated 8 months ago.
13.5 match 2 stars 4.48 score 8 scripts 1 dependents
hojsgaard
gRain:Bayesian Networks
Probability propagation in Bayesian networks, also known as graphical independence networks. Documentation of the package is provided in vignettes included in the package and in the paper by Højsgaard (2012, <doi:10.18637/jss.v046.i10>). See 'citation("gRain")' for details.
Maintained by Søren Højsgaard. Last updated 5 months ago.
6.5 match 2 stars 9.13 score 408 scripts 8 dependents
ohdsi
ResultModelManager:Result Model Manager
Database data model management utilities for R packages in the Observational Health Data Sciences and Informatics program <https://ohdsi.org>. 'ResultModelManager' provides utility functions to allow package maintainers to migrate existing SQL database models, export and import results in consistent patterns.
Maintained by Jamie Gilbert. Last updated 6 months ago.
8.1 match 4 stars 7.38 score 9 scripts 3 dependents
usaid-oha-si
grabr:OHA/SI APIs Package
Provides a series of base functions useful to the GH OHA SI team. These functions extend the utility functions in glamr, focusing primarily on API utility functions.
Maintained by Aaron Chafetz. Last updated 6 months ago.
11.4 match 1 stars 5.14 score 69 scripts
ropensci
elastic:General Purpose Interface to 'Elasticsearch'
Connect to 'Elasticsearch', a 'NoSQL' database built on the 'Java' Virtual Machine. Interacts with the 'Elasticsearch' 'HTTP' API (<https://www.elastic.co/elasticsearch/>), including functions for setting connection details to 'Elasticsearch' instances, loading bulk data, searching for documents with both 'HTTP' query variables and 'JSON' based body requests. In addition, 'elastic' provides functions for interacting with API's for 'indices', documents, nodes, clusters, an interface to the cat API, and more.
Maintained by Scott Chamberlain. Last updated 2 years ago.
databaseelasticsearchhttpapisearchnosqljavajsondocumentsdata-sciencedatabase-wrapperetl
6.5 match 247 stars 8.98 score 151 scripts 1 dependents
ropensci
ghql:General Purpose 'GraphQL' Client
A 'GraphQL' client, with an R6 interface for initializing a connection to a 'GraphQL' instance, and methods for constructing queries, including fragments and parameterized queries. Queries are checked with the 'libgraphqlparser' C++ parser via the 'graphql' package.
Maintained by Mark Padgham. Last updated 2 years ago.
httpapiweb-servicescurldatagraphqlgraphql-apigraphql-client
7.0 match 147 stars 8.24 score 111 scripts 5 dependents
r-lib
systemfonts:System Native Font Finding
Provides system native access to the font catalogue. As font handling varies between systems it is difficult to correctly locate installed fonts across different operating systems. The 'systemfonts' package provides bindings to the native libraries on Windows, macOS and Linux for finding font files that can then be used further by e.g. graphic devices. The main use is intended to be from compiled code but 'systemfonts' also provides access from R.
Maintained by Thomas Lin Pedersen. Last updated 2 months ago.
3.7 match 95 stars 15.62 score 384 scripts 990 dependents
daranzolin
sqltargets:'Targets' Extension for 'SQL' Queries
Provides an extension for 'SQL' queries as a separate file within 'targets' pipelines. The shorthand creates two targets, the query file and the query result.
Maintained by David Ranzolin. Last updated 6 months ago.
10.0 match 39 stars 5.72 score 18 scripts
bioc
AnnotationHub:Client to access AnnotationHub resources
This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructuredataimportguithirdpartyclientcore-packageu24ca289073
4.1 match 17 stars 13.89 score 2.7k scripts 102 dependents
wikimedia
WikidataQueryServiceR:API Client Library for 'Wikidata Query Service'
An API client for the 'Wikidata Query Service' <https://query.wikidata.org/>.
Maintained by Mikhail Popov. Last updated 5 years ago.
7.4 match 28 stars 7.67 score 73 scripts 31 dependents
bioc
RCAS:RNA Centric Annotation System
RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.
Maintained by Bora Uyar. Last updated 5 months ago.
softwaregenetargetmotifannotationmotifdiscoverygotranscriptomicsgenomeannotationgenesetenrichmentcoverage
9.0 match 6.32 score 29 scripts 1 dependents
joycekang
symphony:Efficient and Precise Single-Cell Reference Atlas Mapping
Implements the Symphony single-cell reference building and query mapping algorithms and additional functions described in Kang et al <https://www.nature.com/articles/s41467-021-25957-x>.
Maintained by Joyce Kang. Last updated 2 years ago.
14.8 match 3.83 score 134 scripts
openjusticeok
ojodb:Analyze Data from the Open Justice Oklahoma Database
{ojodb} provides convenient functions to query court data from the Open Justice Oklahoma database.
Maintained by Brancen Gregory. Last updated 8 months ago.
courtsjusticeoklahomaopen-data
11.9 match 8 stars 4.74 score 69 scripts
dieghernan
nominatimlite:Interface with 'Nominatim' API Service
Lite interface for getting data from 'OSM' service 'Nominatim' <https://nominatim.org/release-docs/latest/>. Extract coordinates from addresses, find places near a set of coordinates and return spatial objects on 'sf' format.
Maintained by Diego Hernangómez. Last updated 1 month ago.
geocodingopenstreetmapaddressnominatimreverse-geocodingshapefilespatialapi-wrapperapigis
7.0 match 20 stars 8.08 score 41 scripts 1 dependents
epicentre-msf
queryr:Data Validation Queries With Tidy Output
Data validation queries with tidy, stackable output.
Maintained by Patrick Barks. Last updated 8 months ago.
18.1 match 4 stars 3.08 score 8 scripts 2 dependents
camembr
microinverterdata:Collect your Microinverter Data
Collect and normalize local microinverter energy and power production data through off-cloud API requests. Currently supports 'APSystems', 'Enphase', and 'Fronius' microinverters.
Maintained by Christophe Regouby. Last updated 21 days ago.
10.9 match 1 stars 5.08 score 4 scripts
bioc
brendaDb:The BRENDA Enzyme Database
R interface for importing and analyzing enzyme information from the BRENDA database.
Maintained by Yi Zhou. Last updated 5 months ago.
thirdpartyclientannotationdataimportbrendadatabaseenzymehacktoberfestcpp
12.0 match 2 stars 4.60 score 4 scripts
chengjunhou
xgb2sql:Convert Trained 'XGBoost' Model to SQL Query
This tool enables in-database scoring of 'XGBoost' models built in R, by translating trained model objects into SQL query. 'XGBoost' <https://github.com/dmlc/xgboost> provides parallel tree boosting (also known as gradient boosting machine, or GBM) algorithms in a highly efficient, flexible and portable way. GBM algorithm is introduced by Friedman (2001) <doi:10.1214/aos/1013203451>, and more details on 'XGBoost' can be found in Chen & Guestrin (2016) <doi:10.1145/2939672.2939785>.
Maintained by Chengjun Hou. Last updated 3 years ago.
10.9 match 22 stars 5.04 score 7 scripts
eitsupi
neopolars:R Bindings for the 'polars' Rust Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Tatsuya Shima. Last updated 1 days ago.
11.3 match 40 stars 4.86 score 1 scripts
michalovadek
eurlex:Retrieve Data on European Union Law
Access to data on European Union laws and court decisions made easy with pre-defined 'SPARQL' queries and 'GET' requests. See Ovadek (2021) <doi:10.1080/2474736X.2020.1870150>.
Maintained by Michal Ovadek. Last updated 7 months ago.
courtseurlexeuropean-unionlawlegislationsparql
8.9 match 36 stars 6.18 score 21 scripts
bioc
OmnipathR:OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on GitHub).
Maintained by Denes Turei. Last updated 19 days ago.
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteinsquarto
5.5 match 126 stars 9.90 score 226 scripts 2 dependents
bioc
biomaRt:Interface to BioMart databases (i.e. Ensembl)
In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.
Maintained by Mike Smith. Last updated 2 days ago.
annotationbioconductorbiomartensembl
3.4 match 38 stars 15.99 score 13k scripts 230 dependents
bioc
BiocNeighbors:Nearest Neighbor Detection for Bioconductor Packages
Implements exact and approximate methods for nearest neighbor detection, in a framework that allows them to be easily switched within Bioconductor packages or workflows. Exact searches can be performed using the k-means for k-nearest neighbors algorithm or with vantage point trees. Approximate searches can be performed using the Annoy or HNSW libraries. Searching on either Euclidean or Manhattan distances is supported. Parallelization is achieved for all methods by using BiocParallel. Functions are also provided to search for all neighbors within a given distance.
Maintained by Aaron Lun. Last updated 12 days ago.
5.4 match 10.14 score 646 scripts 89 dependents
yonicd
d3Tree:Create Interactive Collapsible Trees with the JavaScript 'D3' Library
Create and customize interactive collapsible 'D3' trees using the 'D3' JavaScript library and the 'htmlwidgets' package. These trees can be used directly from the R console, from 'RStudio', in Shiny apps and R Markdown documents. When in Shiny the tree layout is observed by the server and can be used as a reactive filter of structured data.
Maintained by Jonathan Sidi. Last updated 1 year ago.
d3jshierarchyhtmlwidgetsqueryshiny
10.0 match 87 stars 5.46 score 33 scripts
cran
arcpullr:Pull Data from an 'ArcGIS REST' API
Functions to efficiently query 'ArcGIS REST' APIs <https://developers.arcgis.com/rest/>. Both spatial and SQL queries can be used to retrieve data. Simple Feature (sf) objects are utilized to perform spatial queries. This package was neither produced nor is maintained by Esri.
Maintained by Paul Frater. Last updated 1 month ago.
13.8 match 3.95 score 1 dependents
r-hub
rversions:Query 'R' Versions, Including 'r-release' and 'r-oldrel'
Query the main 'R' 'SVN' repository to find the versions 'r-release' and 'r-oldrel' refer to, and also all previous 'R' versions and their release dates.
Maintained by Gábor Csárdi. Last updated 3 years ago.
5.1 match 40 stars 10.51 score 55 scripts 154 dependents
usaid-mozambique
sonata:Interagir com MozART 2.0
Provides a set of utilities and functions for connecting, querying, and analyzing data from the Mozambique MozART 2.0 database.
Maintained by Joe Lara. Last updated 2 days ago.
13.2 match 4.08 score
coatless-rpkg
searcher:Query Search Interfaces
Provides a search interface to look up terms on 'Google', 'Bing', 'DuckDuckGo', 'Startpage', 'Ecosia', 'rseek', 'Twitter', 'StackOverflow', 'RStudio Community', 'GitHub', and 'BitBucket'. Upon searching, a browser window will open with the aforementioned search results.
Maintained by James Balamuta. Last updated 6 months ago.
automaticerror-handlingerror-messagessearch-enginesearch-portals
7.0 match 72 stars 7.69 score 19 scripts 3 dependents
justincally
VicmapR:Access Victorian Spatial Data Through Web File Services (WFS)
Easily interfaces R to spatial datasets available through the Victorian Government's WFS (Web Feature Service): <https://opendata.maps.vic.gov.au/geoserver/ows?request=GetCapabilities&service=wfs>, which allows users to read in 'sf' data from these sources. VicmapR uses the lazy querying approach and code developed by Teucher et al. (2021) for the 'bcdata' R package <doi:10.21105/joss.02927>.
Maintained by Justin Cally. Last updated 6 months ago.
8.7 match 17 stars 6.14 score 18 scripts
bioc
mygene:Access MyGene.Info_ services
MyGene.Info_ provides simple-to-use REST web services to query/retrieve gene annotation data. It's designed with simplicity and performance emphasized. *mygene* is an easy-to-use R wrapper to access MyGene.Info_ services.
Maintained by Adam Mark, Cyrus Afrasiabi, Chunlei Wu. Last updated 5 months ago.
7.4 match 7.17 score 330 scripts 1 dependents
ts404
WikidataR:Read-Write API Client Library for Wikidata
Read from, interrogate, and write to Wikidata <https://www.wikidata.org> - the multilingual, interdisciplinary, semantic knowledgebase. Includes functions to: read from Wikidata (single items, properties, or properties); query Wikidata (retrieving all items that match a set of criteria via Wikidata SPARQL query service); write to Wikidata (adding new items or statements via QuickStatements); and handle and manipulate Wikidata objects (as lists and tibbles). Uses the Wikidata and QuickStatements APIs.
Maintained by Thomas Shafee. Last updated 2 months ago.
6.1 match 22 stars 8.64 score 109 scripts 25 dependents
keithmcnulty
neo4jshell:Querying and Managing 'Neo4J' Databases in 'R'
Sends queries to a specified 'Neo4J' graph database, capturing results in a dataframe where appropriate. Other useful functions for the importing and management of data on the 'Neo4J' server and basic local server admin.
Maintained by Keith McNulty. Last updated 3 years ago.
8.9 match 18 stars 5.85 score 13 scripts
majerr
sqlhelper:Easier 'SQL' Integration
Execute files of 'SQL' and manage database connections. 'SQL' statements and queries may be interpolated with string literals. Execution of individual statements and queries may be controlled with keywords. Multiple connections may be defined with 'YAML' and accessed by name.
Maintained by Matthew Roberts. Last updated 1 year ago.
10.0 match 2 stars 5.19 score 39 scripts
quandl
Quandl:API Wrapper for Quandl.com
Functions for interacting directly with the Quandl API to offer data in a number of formats usable in R, downloading a zip with all data from a Quandl database, and the ability to search. This R package uses the Quandl API. For more information go to <https://docs.quandl.com>. For more help on the package itself go to <https://www.quandl.com/tools/r>.
Maintained by Dave Dotson. Last updated 3 years ago.
5.3 match 137 stars 9.74 score 980 scripts 3 dependents
pmassicotte
gtrendsR:Perform and Display Google Trends Queries
An interface for retrieving and displaying the information returned online by Google Trends is provided. Trends (number of hits) over the time as well as geographic representation of the results can be displayed.
Maintained by Philippe Massicotte. Last updated 7 months ago.
5.0 match 356 stars 10.35 score 716 scripts 1 dependents
ips-lmu
emuR:Main Package of the EMU Speech Database Management System
Provide the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.
Maintained by Markus Jochim. Last updated 1 year ago.
7.5 match 24 stars 6.89 score 135 scripts 1 dependents
inrae
hubeau:Get Data from the French National Database on Water 'Hub'Eau'
Collection of functions to help retrieve data from 'Hub'Eau', the free and public French National APIs on water <https://hubeau.eaufrance.fr/>.
Maintained by David Dorchies. Last updated 2 months ago.
8.0 match 12 stars 6.49 score 19 scripts
cran
osdatahub:Easier Interaction with the Ordnance Survey Data Hub
Ordnance Survey ('OS') is the national mapping agency for Great Britain and produces a large variety of mapping and geospatial products. Much of OS's data is available via the OS Data Hub <https://osdatahub.os.uk/>, a platform that hosts both free and premium data products. 'osdatahub' provides a user-friendly way to access, query, and download these data.
Maintained by Chris Jochem. Last updated 1 year ago.
14.8 match 3.45 score 14 scripts
krystian8207
queryBuilder:Programmatic Way to Construct Complex Filtering Queries
Syntax for defining complex filtering expressions in a programmatic way. A filtering query, built as a nested list configuration, can be easily stored in other formats like 'YAML' or 'JSON'. What's more, it's possible to convert such configuration to a valid expression that can be applied to popular 'dplyr' package operations.
Maintained by Krystian Igras. Last updated 6 months ago.
16.0 match 3.18 score 8 scripts 1 dependents
vgherard
r2r:R-Object to R-Object Hash Maps
Implementation of hash tables (hash sets and hash maps) in R, featuring arbitrary R objects as keys, arbitrary hash and key-comparison functions, and customizable behaviour upon queries of missing keys.
Maintained by Valerio Gherardi. Last updated 4 months ago.
6.9 match 3 stars 7.36 score 82 scripts 28 dependents
lindbrook
packageRank:Computation and Visualization of Package Download Counts and Percentile Ranks
Compute and visualize package download counts and percentile ranks from Posit/RStudio's CRAN mirror.
Maintained by lindbrook. Last updated 4 days ago.
8.3 match 28 stars 6.13 score 27 scripts
wikihistories
wikkitidy:Tidy Analysis of Wikipedia
Access 'Wikipedia' through the several 'MediaWiki' APIs (<https://www.mediawiki.org/wiki/API>), as well as through the 'XTools' API (<https://www.mediawiki.org/wiki/XTools/API>). Ensure your API calls are correct, and receive results in tidy tibbles.
Maintained by Michael Falk. Last updated 1 month ago.
12.6 match 7 stars 4.02 score 2 scripts
ropensci
restez:Create and Query a Local Copy of 'GenBank' in R
Download large sections of 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and generate a local SQL-based database. A user can then query this database using 'restez' functions or through 'rentrez' <https://CRAN.R-project.org/package=rentrez> wrappers.
Maintained by Joel H. Nitta. Last updated 10 days ago.
7.2 match 26 stars 7.01 score 175 scripts 1 dependents
mgondan
rswipl:Embed 'SWI'-'Prolog'
Interface to 'SWI'-'Prolog', <https://www.swi-prolog.org/>. This package is normally not loaded directly, please refer to package 'rolog' instead. The purpose of this package is to provide the 'Prolog' runtime on systems that do not have a software installation of 'SWI'-'Prolog'.
Maintained by Matthias Gondan. Last updated 8 days ago.
10.4 match 4.84 score 1 scripts 2 dependents
bioc
scmap:A tool for unsupervised projection of single cell RNA-seq data
Single-cell RNA-seq (scRNA-seq) is widely used to investigate the composition of complex tissues since the technology allows researchers to define cell-types using unsupervised clustering of the transcriptome. However, due to differences in experimental methods and computational analyses, it is often challenging to directly compare the cells identified in two different experiments. scmap is a method for projecting cells from a scRNA-seq experiment on to the cell-types or individual cells identified in a different experiment.
Maintained by Vladimir Kiselev. Last updated 5 months ago.
immunooncologysinglecellsoftwareclassificationsupportvectormachinernaseqvisualizationtranscriptomicsdatarepresentationtranscriptionsequencingpreprocessinggeneexpressiondataimportbioconductor-packagehuman-cell-atlasprojection-mappingsingle-cell-rna-seqopenblascpp
5.6 match 95 stars 8.82 score 172 scripts
eblondel
ows4R:Interface to OGC Web-Services (OWS)
Provides an Interface to Web-Services defined as standards by the Open Geospatial Consortium (OGC), including Web Feature Service (WFS) for vector data, Web Coverage Service (WCS), Catalogue Service (CSW) for ISO/OGC metadata, Web Processing Service (WPS) for data processes, and associated standards such as the common web-service specification (OWS) and OGC Filter Encoding. Partial support is provided for the Web Map Service (WMS). The purpose is to add support for additional OGC service standards such as Web Coverage Processing Service (WCPS), the Sensor Observation Service (SOS), or even new standard services emerging such as OGC API or SensorThings.
Maintained by Emmanuel Blondel. Last updated 1 month ago.
catalogue-servicecswdataaccessfesgeospatialisoogcowssdispatialspatial-datastandardwebfeatureservicewfs
5.5 match 38 stars 9.03 score 99 scripts 5 dependents
hrbrmstr
sergeant:Tools to Transform and Query Data with Apache Drill
Apache Drill is a low-latency distributed query engine designed to enable data exploration and analysis on both relational and non-relational data stores, scaling to petabytes of data. Methods are provided that enable working with Apache Drill instances via the REST API, DBI methods and using 'dplyr'/'dbplyr' idioms. Helper functions are included to facilitate using official Drill Docker images/containers.
Maintained by Bob Rudis. Last updated 4 years ago.
14.3 match 3.45 score 56 scripts
johndharrison
seleniumPipes:R Client Implementing the W3C WebDriver Specification
The W3C WebDriver specification defines a way for out-of-process programs to remotely instruct the behaviour of web browsers. It is detailed at <https://w3c.github.io/webdriver/webdriver-spec.html>. This package provides an R client implementing the W3C specification.
Maintained by John Harrison. Last updated 8 years ago.
7.4 match 54 stars 6.66 score 168 scripts
broccolito
bolt4jr:Interface for the 'Neo4j Bolt' Protocol
Querying, extracting, and processing large-scale network data from Neo4j databases using the 'Neo4j Bolt' <https://neo4j.com/docs/bolt/current/bolt/> protocol. This interface supports efficient data retrieval, batch processing for large datasets, and seamless conversion of query results into R data frames, making it ideal for bioinformatics, computational biology, and other graph-based applications.
Maintained by Wanjun Gu. Last updated 4 days ago.
10.5 match 4.65 score 2 scripts
bioc
hca:Exploring the Human Cell Atlas Data Coordinating Platform
This package provides users with the ability to query the Human Cell Atlas data repository for single-cell experiment data. The `projects()`, `files()`, `samples()` and `bundles()` functions retrieve summary information on each of these indexes; corresponding `*_details()` are available for individual entries of each index. File-based resources can be downloaded using `files_download()`. Advanced use of the package allows the user to page through large result sets, and to flexibly query the 'list-of-lists' structure representing query responses.
Maintained by Martin Morgan. Last updated 5 months ago.
14.5 match 3.34 score 55 scripts
reichlab
zoltr:Interface to the 'Zoltar' Forecast Repository API
'Zoltar' <https://www.zoltardata.com/> is a website that provides a repository of model forecast results in a standardized format and a central location. It supports storing, retrieving, comparing, and analyzing time series forecasts for prediction challenges of interest to the modeling community. This package provides functions for working with the 'Zoltar' API, including connecting and authenticating, getting meta information (projects, models, and forecasts, and truth), and uploading, downloading, and deleting forecast and truth data.
Maintained by Matthew Cornell. Last updated 10 days ago.
6.4 match 2 stars 7.58 score 175 scripts 3 dependents
r-arcgis
arcgislayers:An Interface to ArcGIS Data Services
Enables users of 'ArcGIS Enterprise', 'ArcGIS Online', or 'ArcGIS Platform' to read, write, publish, or manage vector and raster data via ArcGIS location services REST API endpoints <https://developers.arcgis.com/rest/>.
Maintained by Josiah Parry. Last updated 18 days ago.
6.0 match 50 stars 8.08 score 38 scripts 4 dependents
bioc
hermes:Preprocessing, analyzing, and reporting of RNA-seq data
Provides classes and functions for quality control, filtering, normalization and differential expression analysis of pre-processed `RNA-seq` data. Data can be imported from `SummarizedExperiment` as well as `matrix` objects and can be annotated from `BioMart`. Filtering for genes without too low expression or containing required annotations, as well as filtering for samples with sufficient correlation to other samples or total number of reads is supported. The standard normalization methods including cpm, rpkm and tpm can be used, and `DESeq2` as well as voom differential expression analyses are available.
Maintained by Daniel Sabanés Bové. Last updated 5 months ago.
rnaseqdifferentialexpressionnormalizationpreprocessingqualitycontrolrna-seqstatistical-engineering
6.2 match 11 stars 7.77 score 48 scripts 1 dependents
ironholds
WikipediR:A MediaWiki API Wrapper
A wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia. It can be used to retrieve page text, information about users or the history of pages, and elements of the category tree.
Maintained by Os Keyes. Last updated 12 months ago.
api-clientapi-wrappermediawiki
5.0 match 70 stars 9.56 score 81 scripts 32 dependents
tudo-r
BatchJobs:Batch Computing with R
Provides Map, Reduce and Filter variants to generate jobs on batch computing systems like PBS/Torque, LSF, SLURM and Sun Grid Engine. Multicore and SSH systems are also supported. For further details see the project web page.
Maintained by Bernd Bischl. Last updated 3 years ago.
5.5 match 85 stars 8.57 score 616 scripts 3 dependents
crunch-io
crunch:Crunch.io Data Tools
The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.
Maintained by Greg Freedman Ellis. Last updated 11 days ago.
4.5 match 9 stars 10.53 score 200 scripts 2 dependents
nikolaus77
rocker:Database Interface Class
'R6' class interface for handling relational database connections using 'DBI' package as backend. The class allows handling of connections to e.g. PostgreSQL, MariaDB and SQLite. The purpose is having an intuitive object allowing straightforward handling of SQL databases.
Maintained by Nikolaus Pawlowski. Last updated 3 years ago.
databasedbimariadbmysqlpostgrespostgresqlr6sqlsqlite
9.0 match 5 stars 5.24 score 7 scripts
sckott
request:High Level 'HTTP' Client
High level and easy 'HTTP' client for 'R' that makes assumptions that should work in most cases. Provides functions for building 'HTTP' queries, including query parameters, body requests, headers, authentication, and more.
Maintained by Scott Chamberlain. Last updated 5 years ago.
7.6 match 36 stars 6.16 score 812 scripts
tidyverse
dplyr:A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 13 days ago.
1.9 match 4.8k stars 24.68 score 659k scripts 7.8k dependents
ropensci
geonames:Interface to the "Geonames" Spatial Query Web Service
The web service at <https://www.geonames.org/> provides a number of spatial data queries, including administrative area hierarchies, city locations and some country postal code queries. A (free) username is required and rate limits exist.
Maintained by Barry Rowlingson. Last updated 6 years ago.
5.5 match 37 stars 8.45 score 165 scripts 21 dependents
bioc
drugTargetInteractions:Drug-Target Interactions
Provides utilities for identifying drug-target interactions for sets of small molecule or gene/protein identifiers. The required drug-target interaction information is obtained from a local SQLite instance of the ChEMBL database. ChEMBL has been chosen for this purpose because it provides one of the most comprehensive and best annotated knowledge resources for drug-target information available in the public domain.
Maintained by Thomas Girke. Last updated 5 months ago.
cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics proteomics metabolomics
10.5 match 1 stars 4.34 score 11 scripts
edjnet
tidywikidatar:Explore 'Wikidata' Through Tidy Data Frames
Query 'Wikidata' API <https://www.wikidata.org/wiki/Wikidata:Main_Page> with ease, get tidy data frames in response, and cache data in a local database.
Maintained by Giorgio Comai. Last updated 8 months ago.
5.8 match 26 stars 7.86 score 46 scripts 2 dependents
bioc
celaref:Single-cell RNAseq cell cluster labelling by reference
After the clustering step of a single-cell RNAseq experiment, this package aims to suggest labels/cell types for the clusters on the basis of similarity to a reference dataset. It requires a table of read counts per cell per gene and a list of the cells belonging to each of the clusters (for both test and reference data).
Maintained by Sarah Williams. Last updated 5 months ago.
11.3 match 4.00 score 5 scripts
bioc
SpectraQL:MassQL support for Spectra
The Mass Spec Query Language (MassQL) is a domain-specific language for expressing queries and retrieving mass spectrometry (MS) data in a way that is more natural and understandable for MS users. It is inspired by SQL and is, by design, programming-language agnostic. The SpectraQL package adds support for the MassQL query language to R, in particular to MS data represented by Spectra objects. Users can thus apply MassQL expressions to analyze and retrieve specific data from Spectra objects.
Maintained by Johannes Rainer. Last updated 5 months ago.
infrastructure proteomics massspectrometry metabolomics
8.6 match 7 stars 5.24 score 2 scripts
bioc
GenomicFeatures:Query the gene models of a given organism/assembly
Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.
Maintained by H. Pagès. Last updated 4 months ago.
genetics infrastructure annotation sequencing genomeannotation bioconductor-package core-package
2.9 match 26 stars 15.34 score 5.3k scripts 339 dependents
yihui
knitr:A General-Purpose Package for Dynamic Report Generation in R
Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.
Maintained by Yihui Xie. Last updated 2 days ago.
dynamic-documents knitr literate-programming rmarkdown sweave
1.9 match 2.4k stars 23.62 score 116k scripts 4.2k dependents
bioc
signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis
This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.
Maintained by Brendan Gongol. Last updated 5 months ago.
software geneexpression go kegg networkenrichment sequencing coverage differentialexpression cpp
6.2 match 17 stars 7.18 score 74 scripts 1 dependents
paws-r
paws:Amazon Web Services Software Development Kit
Interface to Amazon Web Services <https://aws.amazon.com>, including storage, database, and compute services, such as 'Simple Storage Service' ('S3'), 'DynamoDB' 'NoSQL' database, and 'Lambda' functions-as-a-service.
Maintained by Dyfan Jones. Last updated 4 days ago.
3.9 match 332 stars 11.25 score 177 scripts 12 dependents
cran
ODataQuery:Querying on 'OData'
Make querying on 'OData' easier. It exposes an 'ODataQuery' object that can be manipulated and provides features such as selection, filtering and ordering.
Maintained by Laurent Verweijen. Last updated 4 years ago.
12.5 match 3.48 score 1 dependents
athammad
pbox:Exploring Multivariate Spaces with Probability Boxes
Advanced statistical library offering a method to encapsulate and query the probability space of a dataset effortlessly using Probability Boxes (p-boxes). Its distinctive feature lies in the ease with which users can navigate and analyze marginal, joint, and conditional probabilities while taking into account the underlying correlation structure inherent in the data using copula theory and models. A comprehensive explanation is available in the paper "pbox: Exploring Multivariate Spaces with Probability Boxes" to be published in the Journal of Statistical Software.
Maintained by Ahmed T. Hammad. Last updated 8 months ago.
climate-change copula environmental-monitoring financial-analysis probability risk-assessment risk-management statistics
8.6 match 2 stars 5.04 score 4 scripts
r-hub
pkgsearch:Search and Query CRAN R Packages
Search CRAN metadata about packages by keyword, popularity, recent activity, package name and more. Uses the 'R-hub' search server (see <https://r-pkg.org>) and the CRAN metadata database, which contains information about CRAN packages. Note that this is _not_ a CRAN project.
Maintained by Gábor Csárdi. Last updated 2 months ago.
5.0 match 109 stars 8.62 score 64 scripts 10 dependents
satijalab
SeuratObject:Data Structures for Single Cell Data
Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.
Maintained by Paul Hoffman. Last updated 1 years ago.
3.7 match 25 stars 11.69 score 1.2k scripts 88 dependents
bioc
hiAnnotator:Functions for annotating GRanges objects
hiAnnotator contains a set of functions which allow users to annotate a GRanges object with a custom set of annotations. The basic philosophy of this package is to take two GRanges objects (query & subject) with a common set of seqnames (i.e. chromosomes) and return associated annotation per seqnames and rows from the query matching seqnames and rows from the subject (i.e. genes or cpg islands). The package comes with three types of annotation functions, which calculate whether a position from the query is within a feature or near a feature, or count features in defined window sizes. Moreover, each function is equipped with a parallel backend to utilize the foreach package. In addition, the package is equipped with wrapper functions which find the appropriate columns needed to make a GRanges object from a common data frame.
Maintained by Nirav V Malani. Last updated 5 months ago.
9.2 match 4.65 score 15 scripts 1 dependents
ropensci
jqr:Client for 'jq', a 'JSON' Processor
Client for 'jq', a 'JSON' processor (<https://jqlang.github.io/jq/>), written in C. 'jq' allows the following with 'JSON' data: index into, parse, do calculations, cut up and filter, change key names and values, perform conditionals and comparisons, and more.
Maintained by Jeroen Ooms. Last updated 3 months ago.
4.3 match 144 stars 10.04 score 95 scripts 28 dependents
bnosac
udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit
This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser, namely functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
Maintained by Jan Wijffels. Last updated 2 years ago.
conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r-pkg rcpp text-mining tokenizer udpipe cpp
3.6 match 215 stars 11.83 score 1.2k scripts 9 dependents
openbiox
UCSCXenaShiny:Interactive Analysis of UCSC Xena Data
Provides functions and a Shiny application for downloading, analyzing and visualizing datasets from UCSC Xena (<http://xena.ucsc.edu/>), which is a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others.
Maintained by Shixiang Wang. Last updated 4 months ago.
cancer-dataset shiny-apps ucsc-xena
5.0 match 96 stars 8.54 score 35 scripts
inlabru-org
fmesher:Triangle Meshes and Related Geometry Tools
Generate planar and spherical triangle meshes, compute finite element calculations for 1- and 2-dimensional flat and curved manifolds with associated basis function spaces, methods for lines and polygons, and transparent handling of coordinate reference systems and coordinate transformation, including 'sf' and 'sp' geometries. The core 'fmesher' library code was originally part of the 'INLA' package, and implements parts of "Triangulations and Applications" by Hjelle and Daehlen (2006) <doi:10.1007/3-540-33261-8>.
Maintained by Finn Lindgren. Last updated 2 days ago.
3.8 match 16 stars 11.18 score 261 scripts 26 dependents
bioc
SRAdb:A compilation of metadata from NCBI SRA and tools
The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, and others. However, finding data of interest can be challenging using current tools. SRAdb is an attempt to make access to the metadata associated with submission, study, sample, experiment and run much more feasible. This is accomplished by parsing all the NCBI SRA metadata into a SQLite database that can be stored and queried locally. Full-text search in the package makes querying metadata very flexible and powerful. fastq and sra files can be downloaded for doing alignment locally. Besides the ftp protocol, SRAdb has functions supporting the fasp protocol (ascp from Aspera Connect) for faster downloading of large data files over long distances. The SQLite database is updated regularly as new data is added to SRA and can be downloaded at will for the most up-to-date metadata.
Maintained by Jack Zhu. Last updated 3 months ago.
infrastructure sequencing dataimport
5.3 match 2 stars 7.81 score 200 scripts
bioc
orthos:`orthos` is an R package for variance decomposition using conditional variational auto-encoders
`orthos` decomposes RNA-seq contrasts, for example obtained from a gene knock-out or compound treatment experiment, into unspecific and experiment-specific components. Original and decomposed contrasts can be efficiently queried against a large database of contrasts (derived from ARCHS4, https://maayanlab.cloud/archs4/) to identify similar experiments. `orthos` furthermore provides plotting functions to visualize the results of such a search for similar contrasts.
Maintained by Panagiotis Papasaikas. Last updated 5 days ago.
rnaseq differentialexpression geneexpression
9.9 match 4.18 score 2 scripts
bioc
MotifDb:An Annotated Collection of Protein-DNA Binding Sequence Motifs
More than 9900 annotated position frequency matrices from 14 public sources, for multiple organisms.
Maintained by Paul Shannon. Last updated 4 days ago.
6.0 match 6.86 score 404 scripts 2 dependents