R-universe search: databases

r-dbi

DBI:R Database Interface

A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.

Maintained by Kirill Müller. Last updated 3 months ago.

database interface

50.4 match 302 stars 20.88 score 19k scripts 2.9k dependents

maialba3

LipidMS:Lipid Annotation for LC-MS/MS DDA or DIA Data

Lipid annotation in untargeted LC-MS lipidomics based on fragmentation rules. Alcoriza-Balaguer MI, Garcia-Canaveras JC, Lopez A, Conde I, Juan O, Carretero J, Lahoz A (2019) <doi:10.1021/acs.analchem.8b03409>.

Maintained by M Isabel Alcoriza-Balaguer. Last updated 7 months ago.

cpp

163.5 match 2 stars 5.33 score 12 scripts 1 dependent

ncss-tech

soilDB:Soil Database Interface

A collection of functions for reading soil data from U.S. Department of Agriculture Natural Resources Conservation Service (USDA-NRCS) and National Cooperative Soil Survey (NCSS) databases.

Maintained by Andrew Brown. Last updated 6 days ago.

kssl nasis nrcs soil soil-data-access soil-survey soilweb sql usda

53.1 match 87 stars 11.34 score 1.0k scripts 1 dependent

tidyverse

dbplyr:A 'dplyr' Back End for Databases

A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.

Maintained by Hadley Wickham. Last updated 3 months ago.

database

27.3 match 481 stars 19.72 score 5.2k scripts 736 dependents

ropensci

tidyhydat:Extract and Tidy Canadian 'Hydrometric' Data

Provides functions to access historical and real-time national 'hydrometric' data from Water Survey of Canada data sources (<https://dd.weather.gc.ca/hydrometric/csv/> and <https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>) and then applies tidy data principles.

Maintained by Sam Albers. Last updated 4 days ago.

citz government-data hydrology hydrometrics tidy-data water-resources

50.8 match 71 stars 9.59 score 202 scripts 3 dependents

bioc

ginmappeR:Gene Identifier Mapper

Provides functionalities to translate gene or protein identifiers between state-of-the-art biological databases: CARD (<https://card.mcmaster.ca/>), NCBI Protein, Nucleotide and Gene (<https://www.ncbi.nlm.nih.gov/>), UniProt (<https://www.uniprot.org/>) and KEGG (<https://www.kegg.jp>). Also offers complementary functionality like NCBI identical proteins or UniProt similar genes clusters retrieval.

Maintained by Fernando Sola. Last updated 3 months ago.

annotation kegg genetics thirdpartyclient software

99.8 match 4.88 score 7 scripts

cynkra

dm:Relational Data Models

Provides tools for working with multiple related tables, stored as data frames or in a relational database. Multiple tables (data and metadata) are stored in a compound object, which can then be manipulated with a pipe-friendly syntax.

Maintained by Kirill Müller. Last updated 2 months ago.

data-model data-warehousing datawarehousing dbi dbplyr relational-databases

31.5 match 511 stars 14.81 score 410 scripts 8 dependents

ohdsi

DatabaseConnector:Connecting to Various Database Platforms

An R 'DataBase Interface' ('DBI') compatible interface to various database platforms ('PostgreSQL', 'Oracle', 'Microsoft SQL Server', 'Amazon Redshift', 'Microsoft Parallel Database Warehouse', 'IBM Netezza', 'Apache Impala', 'Google BigQuery', 'Snowflake', 'Spark', 'SQLite', and 'InterSystems IRIS'). Also includes support for fetching data as 'Andromeda' objects. Uses either 'Java Database Connectivity' ('JDBC') or other 'DBI' drivers to connect to databases.

Maintained by Martijn Schuemie. Last updated 1 month ago.

hades openjdk

35.4 match 56 stars 12.63 score 772 scripts 11 dependents

r-dbi

RSQLite:SQLite Interface for R

Embeds the SQLite database engine in R and provides an interface compliant with the DBI package. The source for the SQLite engine and for various extensions in a recent version is included. System libraries will never be consulted because this package relies on static linking for the plugins it includes; this also ensures a consistent experience across all installations.

Maintained by Kirill Müller. Last updated 24 days ago.

database sqlite3 cpp

23.5 match 327 stars 18.73 score 8.1k scripts 1.1k dependents

usdaforestservice

FIESTA:Forest Inventory Estimation and Analysis

A research estimation tool for analysts that work with sample-based inventory data from the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program.

Maintained by Grayson White. Last updated 2 days ago.

55.2 match 30 stars 7.24 score 62 scripts

paws-r

paws.database:'Amazon Web Services' Database Services

Interface to 'Amazon Web Services' database services, including 'Relational Database Service' ('RDS'), 'DynamoDB' 'NoSQL' database, and more <https://aws.amazon.com/>.

Maintained by Dyfan Jones. Last updated 3 days ago.

aws aws-sdk

43.4 match 332 stars 9.07 score 3 scripts 13 dependents

bioc

OmnipathR:OmniPath web service client and more

A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on GitHub).

Maintained by Denes Turei. Last updated 18 days ago.

graphandnetwork network pathways software thirdpartyclient dataimport datarepresentation genesignaling generegulation systemsbiology transcriptomics singlecell annotation kegg complexes enzyme-ptm networks networks-biology omnipath proteins quarto

36.9 match 126 stars 9.90 score 226 scripts 2 dependents

bioc

biodb:biodb, a library and a development framework for connecting to chemical and biological databases

The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.

Maintained by Pierrick Roger. Last updated 5 months ago.

software infrastructure dataimport kegg biology cheminformatics chemistry databases cpp

45.1 match 11 stars 7.85 score 24 scripts 6 dependents

ropensci

biomartr:Genomic Data Retrieval

Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.

Maintained by Hajk-Georg Drost. Last updated 1 month ago.

biomart genomic-data-retrieval annotation-retrieval database-retrieval ncbi ensembl biological-data-retrieval ensembl-servers genome genome-annotation genome-retrieval genomics meta-analysis metagenomics ncbi-genbank peer-reviewed proteome sequenced-genomes

30.7 match 218 stars 11.35 score 129 scripts 3 dependents

josesamos

rolap:Obtaining Star Databases from Flat Tables

Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a ROLAP (Relational On-Line Analytical Processing) star database. The main objective of the package is to allow the definition of these transformations easily. The implementation of the multidimensional database obtained can be exported to work with multidimensional analysis tools on spreadsheets or relational databases.

Maintained by Jose Samos. Last updated 1 year ago.

openjdk

56.2 match 5 stars 6.12 score 25 scripts 1 dependent

r-dbi

RMySQL:Database Interface and 'MySQL' Driver for R

Legacy 'DBI' interface to 'MySQL' / 'MariaDB' based on old code ported from S-PLUS. A modern 'MySQL' client written in 'C++' is available from the 'RMariaDB' package.

Maintained by Jeroen Ooms. Last updated 1 month ago.

database mysql

24.1 match 209 stars 13.68 score 3.7k scripts 15 dependents

ropensci

sofa:Connector to 'CouchDB'

Provides an interface to the 'NoSQL' database 'CouchDB' (<http://couchdb.apache.org>). Methods are provided for managing databases within 'CouchDB', including creating/deleting/updating/transferring, and managing documents within databases. One can connect with a local 'CouchDB' instance, or a remote 'CouchDB' database such as 'Cloudant'. Documents can be inserted directly from vectors, lists, data.frames, and 'JSON'. Targeted at 'CouchDB' v2 or greater.

Maintained by Yaoxiang Li. Last updated 1 month ago.

couchdb database nosql documents cloudant couchdb-client

43.1 match 33 stars 7.51 score 54 scripts

jonesor

Rcompadre:Utilities for using the 'COM(P)ADRE' Matrix Model Database

Utility functions for interacting with the 'COMPADRE' and 'COMADRE' databases of matrix population models. Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.

Maintained by Owen Jones. Last updated 5 months ago.

40.7 match 11 stars 7.74 score 55 scripts 2 dependents

bioc

biomaRt:Interface to BioMart databases (i.e. Ensembl)

In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.

Maintained by Mike Smith. Last updated 1 day ago.

annotation bioconductor biomart ensembl

19.0 match 38 stars 15.99 score 13k scripts 230 dependents

giocomai

castarter:Content Analysis Starter Toolkit

Consistent approaches for basic web scraping, text mining and word frequency analysis of textual datasets.

Maintained by Giorgio Comai. Last updated 11 hours ago.

tada text-mining

66.8 match 3 stars 4.52 score 2 scripts

ropensci

nodbi:'NoSQL' Database Connector

Simplified JSON document database access and manipulation, providing a common API across supported 'NoSQL' databases 'Elasticsearch', 'CouchDB', 'MongoDB' as well as 'SQLite/JSON1', 'PostgreSQL', and 'DuckDB'.

Maintained by Ralf Herold. Last updated 4 months ago.

database mongodb elasticsearch couchdb sqlite postgresql duckdb nosql json documents

35.2 match 78 stars 8.36 score 28 scripts 1 dependent

nikolaus77

rocker:Database Interface Class

'R6' class interface for handling relational database connections using the 'DBI' package as backend. The class allows handling of connections to e.g. PostgreSQL, MariaDB and SQLite. The purpose is to provide an intuitive object that allows straightforward handling of SQL databases.

Maintained by Nikolaus Pawlowski. Last updated 3 years ago.

database dbi mariadb mysql postgres postgresql r6 sql sqlite

53.4 match 5 stars 5.24 score 7 scripts

r-dbi

odbc:Connect to ODBC Compatible Databases (using the DBI Interface)

A DBI-compatible interface to ODBC databases.

Maintained by Hadley Wickham. Last updated 12 days ago.

database odbc unixodbc cpp

17.3 match 396 stars 16.22 score 2.9k scripts 22 dependents

scharlton2

phreeqc:R Interface to Geochemical Modeling Software

A geochemical modeling program developed by the US Geological Survey that is designed to perform a wide variety of aqueous geochemical calculations, including speciation, batch-reaction, one-dimensional reactive-transport, and inverse geochemical calculations.

Maintained by S.R. Charlton. Last updated 17 days ago.

cpp

78.0 match 9 stars 3.37 score 60 scripts

r-dbi

RMariaDB:Database Interface and MariaDB Driver

Implements a DBI-compliant interface to MariaDB (<https://mariadb.org/>) and MySQL (<https://www.mysql.com/>) databases.

Maintained by Kirill Müller. Last updated 19 days ago.

database mariadb mysql cpp

21.0 match 134 stars 12.36 score 792 scripts 11 dependents

bioc

ensembldb:Utilities to create and use Ensembl-based annotation databases

The package provides functions to create and use transcript-centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries, such as genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.

Maintained by Johannes Rainer. Last updated 5 months ago.

genetics annotationdata sequencing coverage annotation bioconductor bioconductor-packages ensembl

17.3 match 35 stars 14.08 score 892 scripts 108 dependents

ips-lmu

emuR:Main Package of the EMU Speech Database Management System

Provide the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.

Maintained by Markus Jochim. Last updated 1 year ago.

35.4 match 24 stars 6.89 score 135 scripts 1 dependent

ropensci

rentrez:'Entrez' in R

Provides an R interface to the NCBI's 'EUtils' API, allowing users to search databases like 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and 'PubMed' <https://pubmed.ncbi.nlm.nih.gov/>, process the results of those searches and pull data into their R sessions.

Maintained by David Winter. Last updated 4 years ago.

17.9 match 199 stars 13.60 score 784 scripts 95 dependents

pecanproject

PEcAn.DB:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 1 day ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

20.4 match 216 stars 11.88 score 127 scripts 27 dependents

adeckmyn

maps:Draw Geographical Maps

Display of maps. Projection code and larger maps are in separate packages ('mapproj' and 'mapdata').

Maintained by Alex Deckmyn. Last updated 2 months ago.

16.0 match 24 stars 14.70 score 19k scripts 490 dependents

pepijn-devries

ECOTOXr:Download and Extract Data from US EPA's ECOTOX Database

The US EPA ECOTOX database is a freely available database with a treasure of aquatic and terrestrial ecotoxicological data. As the online search interface doesn't come with an API, this package provides the means to easily access and search the database in R. To this end, all raw tables are downloaded from the EPA website and stored in a local SQLite database <doi:10.1016/j.chemosphere.2024.143078>.

Maintained by Pepijn de Vries. Last updated 4 days ago.

36.5 match 10 stars 6.20 score 6 scripts

bioc

LOBSTAHS:Lipid and Oxylipin Biomarker Screening through Adduct Hierarchy Sequences

LOBSTAHS is a multifunction package for screening, annotation, and putative identification of mass spectral features in large, HPLC-MS lipid datasets. In silico data for a wide range of lipids, oxidized lipids, and oxylipins can be generated from user-supplied structural criteria with a database generation function. LOBSTAHS then applies these databases to assign putative compound identities to features in any high-mass accuracy dataset that has been processed using xcms and CAMERA. Users can then apply a series of orthogonal screening criteria based on adduct ion formation patterns, chromatographic retention time, and other properties, to evaluate and assign confidence scores to this list of preliminary assignments. During the screening routine, LOBSTAHS rejects assignments that do not meet the specified criteria, identifies potential isomers and isobars, and assigns a variety of annotation codes to assist the user in evaluating the accuracy of each assignment.

Maintained by Henry Holm. Last updated 5 months ago.

immunooncology massspectrometry metabolomics lipidomics dataimport adduct algae bioconductor hplc-esi-ms lipid mass-spectrometry oxidative-stress-biomarkers oxidized-lipids oxylipins plankton

34.1 match 8 stars 6.56 score 9 scripts

civisanalytics

civis:R Client for the 'Civis Platform API'

A convenient interface for making requests directly to the 'Civis Platform API' <https://www.civisanalytics.com/platform/>. Full documentation available 'here' <https://civisanalytics.github.io/civis-r/>.

Maintained by Peter Cooman. Last updated 2 months ago.

28.3 match 16 stars 7.84 score 144 scripts

bioc

Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery

A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.

Maintained by Nan Xiao. Last updated 5 months ago.

software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences

28.3 match 37 stars 7.81 score 29 scripts

r-spatial

sf:Simple Features for R

Support for simple feature access, a standardized way to encode and analyze spatial vector data. Binds to 'GDAL' <doi: 10.5281/zenodo.5884351> for reading and writing data, to 'GEOS' <doi: 10.5281/zenodo.11396894> for geometrical operations, and to 'PROJ' <doi: 10.5281/zenodo.5884394> for projection conversions and datum transformations. Uses by default the 's2' package for geometry operations on geodetic (long/lat degree) coordinates.

Maintained by Edzer Pebesma. Last updated 15 days ago.

gdal geos proj spatial cpp

9.7 match 1.4k stars 22.42 score 117k scripts 1.2k dependents

bioc

BiGGR:Constraint based modeling in R using metabolic reconstruction databases

This package provides an interface to simulate metabolic reconstruction from the BiGG database (http://bigg.ucsd.edu/) and other metabolic reconstruction databases. The package facilitates flux balance analysis (FBA) and the sampling of feasible flux distributions. Metabolic networks and estimated fluxes can be visualized with hypergraphs.

Maintained by Anand K. Gavai. Last updated 5 months ago.

systems biology pathway network graphandnetwork visualization metabolomics

45.9 match 4.67 score 58 scripts

duckdb

duckdb:DBI Package for the DuckDB Database Management System

The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and an R Database Interface (DBI) connector.

Maintained by Kirill Müller. Last updated 2 days ago.

database duckdb olap cpp

15.3 match 157 stars 13.80 score 1.7k scripts 46 dependents

r-dbi

RPostgres:C++ Interface to PostgreSQL

Fully DBI-compliant C++-backed interface to PostgreSQL <https://www.postgresql.org/>, an open-source relational database.

Maintained by Kirill Müller. Last updated 19 days ago.

database postgres postgresql cpp

14.2 match 338 stars 14.78 score 1.6k scripts 31 dependents

inbo

inbodb:Connect to and Retrieve Data from Databases on the INBO Server

A bundle of functions to connect to and retrieve data from databases on the INBO server, with dedicated functions to query some of these databases.

Maintained by Els Lommelen. Last updated 24 days ago.

database

34.1 match 6.16 score 114 scripts 1 dependent

kurthornik

mlbench:Machine Learning Benchmark Problems

A collection of artificial and real-world machine learning benchmark problems, including, e.g., several data sets from the UCI repository.

Maintained by Kurt Hornik. Last updated 3 months ago.

23.5 match 2 stars 8.93 score 5.0k scripts 55 dependents

cboettig

neonstore:NEON Data Store

The National Ecological Observatory Network (NEON) provides access to its numerous data products through its REST API, <https://data.neonscience.org/data-api/>. This package provides a high-level user interface for downloading and storing NEON data products. Unlike 'neonUtilities', this package will avoid repeated downloading, provides persistent storage, and improves performance. 'neonstore' can also construct a local 'duckdb' database of stacked tables, making it possible to work with tables that are far too big to fit into memory.

Maintained by Carl Boettiger. Last updated 11 months ago.

database ecology neon-data provenance

30.6 match 9 stars 6.67 score 143 scripts 11 dependents

kwb-r

kwb.db:Functions supporting data base access

This package contains some useful functions, especially for simplifying data transfer between MS Access databases and R. With the functions of this package it is no longer necessary to open and close a database connection explicitly; this is done 'behind the scenes' in the functions. Instead of a database connection, the path to the database file needs to be passed to the functions as an argument. The main functions are hsGetTable and hsPutTable, which transfer data from an MS Access database to a data frame in R and save data from a data frame in R into a table in an MS Access database, respectively. Take care when getting time series data from an MS Access database; see hsMdbTimeSeries. Use hsTables to get a list of tables that are available in a database and hsFields to get a list of table fields that are contained in a database table.

Maintained by Hauke Sonnenberg. Last updated 1 year ago.

data-import database-access database-connection rodbc

57.6 match 3.52 score 5 scripts 22 dependents

poissonconsulting

readwritesqlite:Enhanced Reading and Writing for 'SQLite' Databases

Reads and writes data frames to 'SQLite' databases while preserving time zones (for POSIXct columns), projections (for 'sfc' columns), units (for 'units' columns), levels (for factors and ordered factors) and classes for logical, Date and 'hms' columns. It also logs changes to tables and provides more informative error messages.

Maintained by Joe Thorley. Last updated 2 months ago.

dbi log metadata posixct read sfc sqlite units write

31.2 match 38 stars 6.42 score 11 scripts 1 dependent

bioc

CompoundDb:Creating and Using (Chemical) Compound Annotation Databases

CompoundDb provides functionality to create and use (chemical) compound annotation databases from a variety of different sources such as LipidMaps, HMDB, ChEBI or MassBank. The database format also allows storing MS/MS spectra along with compound information. The package also provides a backend for Bioconductor's Spectra package and thus allows matching experimental MS/MS spectra against MS/MS spectra in the database. Databases can be stored in SQLite format and are thus portable.

Maintained by Johannes Rainer. Last updated 2 months ago.

massspectrometry metabolomics annotation databases mass-spectrometry

23.6 match 17 stars 8.40 score 69 scripts 1 dependent

ropensci

lingtypology:Linguistic Typology and Mapping

Provides R with the Glottolog database <https://glottolog.org/> and some more abilities for purposes of linguistic mapping. The Glottolog database contains the catalogue of languages of the world. This package helps researchers to make linguistic maps, using the philosophy of the Cross-Linguistic Linked Data project <https://clld.org/>, which allows for while at the same time facilitating uniform access to the data across publications. A tutorial for this package is available on GitHub pages <https://docs.ropensci.org/lingtypology/> and in the package vignette. Maps created by this package can be used both for investigation and for linguistic teaching. In addition, the package provides the ability to download data from typological databases such as WALS, AUTOTYP and some others and to create your own database website.

Maintained by George Moroz. Last updated 5 months ago.

abvd afbo atlas autotype bivaltyp clld glottolog-database linguistic-maps linguistics phoible sails typology wals

20.6 match 51 stars 9.58 score 694 scripts

darwin-eu

CDMConnector:Connect to an OMOP Common Data Model

Provides tools for working with observational health data in the Observational Medical Outcomes Partnership (OMOP) Common Data Model format with a pipe friendly syntax. Common data model database table references are stored in a single compound object along with metadata.

Maintained by Adam Black. Last updated 18 days ago.

17.3 match 12 stars 11.39 score 502 scripts 12 dependents

winvector

rquery:Relational Query Generator for Data Manipulation at Scale

A piped query generator based on Edgar F. Codd's relational algebra, and on production experience using 'SQL' and 'dplyr' at big data scale. The design represents an attempt to make 'SQL' more teachable by denoting composition by a sequential pipeline notation instead of nested queries or functions. The implementation delivers reliable high performance data processing on large data systems such as 'Spark', databases, and 'data.table'. Package features include: data processing trees or pipelines as observable objects (able to report both columns produced and columns used), optimized 'SQL' generation as an explicit user visible table modeling step, plus explicit query reasoning and checking.

Maintained by John Mount. Last updated 2 years ago.

20.2 match 110 stars 9.53 score 126 scripts 3 dependents

sjmack

HLAtools:Toolkit for HLA Immunogenomics

A toolkit for the analysis and management of data for genes in the so-called "Human Leukocyte Antigen" (HLA) region. Functions extract reference data from the Anthony Nolan HLA Informatics Group/ImmunoGeneTics HLA 'GitHub' repository (ANHIG/IMGTHLA) <https://github.com/ANHIG/IMGTHLA>, validate Genotype List (GL) Strings, convert between UNIFORMAT and GL String Code (GLSC) formats, translate HLA alleles and GLSCs across ImmunoPolymorphism Database (IPD) IMGT/HLA Database release versions, identify differences between pairs of alleles at a locus, generate customized, multi-position sequence alignments, trim and convert allele-names across nomenclature epochs, and extend existing data-analysis methods.

Maintained by Steven Mack. Last updated 12 days ago.

30.6 match 4 stars 6.21 score 7 scripts 1 dependent

bioc

MetMashR:Metabolite Mashing with R

A package to merge, filter, sort, organise and otherwise mash together metabolite annotation tables. Metabolite annotations can be imported from multiple sources (software) and combined using workflow steps based on S4 class templates derived from the `struct` package. Other modular workflow steps such as filtering, merging, splitting, normalisation and REST API queries are included.

Maintained by Gavin Rhys Lloyd. Last updated 5 months ago.

workflowstep metabolomics kegg

32.7 match 2 stars 5.81 score 5 scripts

r-forge

CHNOSZ:Thermodynamic Calculations and Diagrams for Geochemistry

An integrated set of tools for thermodynamic calculations in aqueous geochemistry and geobiochemistry. Functions are provided for writing balanced reactions to form species from user-selected basis species and for calculating the standard molal properties of species and reactions, including the standard Gibbs energy and equilibrium constant. Calculations of the non-equilibrium chemical affinity and equilibrium chemical activity of species can be portrayed on diagrams as a function of temperature, pressure, or activity of basis species; in two dimensions, this gives a maximum affinity or predominance diagram. The diagrams have formatted chemical formulas and axis labels, and water stability limits can be added to Eh-pH, oxygen fugacity-temperature, and other diagrams with a redox variable. The package has been developed to handle common calculations in aqueous geochemistry, such as solubility due to complexation of metal ions, mineral buffers of redox or pH, and changing the basis species across a diagram ("mosaic diagrams"). CHNOSZ also implements a group additivity algorithm for the standard thermodynamic properties of proteins.

Maintained by Jeffrey Dick. Last updated 8 days ago.

fortran

20.0 match 9.46 score 238 scripts 4 dependents

bioc

KEGGREST:Client-side REST access to the Kyoto Encyclopedia of Genes and Genomes (KEGG)

A package that provides a client interface to the Kyoto Encyclopedia of Genes and Genomes (KEGG) REST API. Only for academic use by academic users belonging to academic institutions (see <https://www.kegg.jp/kegg/rest/>). Note that KEGGREST is based on KEGGSOAP by J. Zhang, R. Gentleman, and Marc Carlson, and KEGG (python package) by Aurelien Mazurie.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation pathways thirdpartyclient kegg bioconductor-package core-package

12.7 match 9 stars 14.46 score 688 scripts 775 dependents

bioc

LOLA:Locus overlap analysis for enrichment of genomic ranges

Provides functions for testing overlap of sets of genomic regions with public and custom region set (genomic ranges) databases. This makes it possible to do automated enrichment analysis for genomic region sets, thus facilitating interpretation of functional genomics and epigenomics data.

Maintained by Nathan Sheffield. Last updated 5 months ago.

genesetenrichment generegulation genomeannotation systemsbiology functionalgenomics chipseq methylseq sequencing

19.4 match 76 stars 9.34 score 160 scripts

marc-girondot

embryogrowth:Tools to Analyze the Thermal Reaction Norm of Embryo Growth

Tools to analyze the embryo growth and the sexualisation thermal reaction norms. See <doi:10.7717/peerj.8451> for tsd functions; see <doi:10.1016/j.jtherbio.2014.08.005> for thermal reaction norm of embryo growth.

Maintained by Marc Girondot. Last updated 7 months ago.

75.3 match 1 star 2.40 score 252 scripts

patzaw

TKCat:Tailored Knowledge Catalog

Facilitate the management of data from knowledge resources that are frequently used alone or together in research environments. In 'TKCat', knowledge resources are manipulated as modeled database (MDB) objects. These objects provide access to the data tables along with a general description of the resource and a detailed data model documenting the tables, their fields and their relationships. These MDBs are then gathered in catalogs that can be easily explored and shared. Finally, 'TKCat' provides tools to easily subset, filter and combine MDBs and create new catalogs suited for specific needs.

Maintained by Patrice Godard. Last updated 15 hours ago.

29.6 match 5 stars 6.08 score 27 scripts

ropensci

arkdb:Archive and Unarchive Databases Using Flat Files

Flat text files provide a robust, compressible, and portable way to store tables from databases. This package provides convenient functions for exporting tables from relational database connections into compressed text files and streaming those text files back into a database without requiring the whole table to fit in working memory.

Maintained by Carl Boettiger. Last updated 1 year ago.

archiving database dbi peer-reviewed

26.1 match 79 stars 6.86 score 37 scripts

patzaw

BED:Biological Entity Dictionary (BED)

An interface for the 'Neo4j' database providing mapping between different identifiers of biological entities. This Biological Entity Dictionary (BED) has been developed to address three main challenges. The first one is related to the completeness of identifier mappings. Indeed, direct mapping information provided by the different systems is not always complete and can be enriched by mappings provided by other resources. More interestingly, direct mappings not identified by any of these resources can be indirectly inferred by using mappings to a third reference. For example, many human Ensembl gene IDs are not directly mapped to any Entrez gene ID but such mappings can be inferred using respective mappings to HGNC ID. The second challenge is related to the mapping of deprecated identifiers. Indeed, entity identifiers can change from one resource release to another. The identifier history is provided by some resources, such as Ensembl or the NCBI, but it is generally not used by mapping tools. The third challenge is related to the automation of the mapping process according to the relationships between the biological entities of interest. Indeed, mapping between gene and protein ID scopes should not be done the same way as between two scopes regarding gene ID. Also, converting identifiers from different organisms should be possible using gene ortholog information. The method has been published by Godard and van Eyll (2018) <doi:10.12688/f1000research.13925.3>.

Maintained by Patrice Godard. Last updated 3 months ago.

26.0 match 8 stars 6.85 score 25 scripts

openbiox

UCSCXenaShiny:Interactive Analysis of UCSC Xena Data

Provides functions and a Shiny application for downloading, analyzing and visualizing datasets from UCSC Xena (<http://xena.ucsc.edu/>), which is a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others.

Maintained by Shixiang Wang. Last updated 4 months ago.

cancer-dataset shiny-apps ucsc-xena

20.5 match 96 stars 8.54 score 35 scripts

framverse

framrsquared:FRAM Database Interface

A convenient tool for interfacing with FRAM access databases in R environments.

Maintained by Ty Garber. Last updated 2 months ago.

34.5 match 6 stars 5.06 score 9 scripts

luckinet

arealDB:Harmonise and Integrate Heterogeneous Areal Data

Many relevant applications in the environmental and socioeconomic sciences use areal data, such as biodiversity checklists, agricultural statistics, or socioeconomic surveys. For applications that surpass the spatial, temporal or thematic scope of any single data source, data must be integrated from several heterogeneous sources. Inconsistent concepts, definitions, or messy data tables make this a tedious and error-prone process. 'arealDB' tackles those problems and helps the user to integrate a harmonised database of areal data. Read the paper at Ehrmann, Seppelt & Meyer (2020) <doi:10.1016/j.envsoft.2020.104799>.

Maintained by Steffen Ehrmann. Last updated 1 month ago.

areal-data database

32.2 match 2 stars 5.41 score 15 scripts

ropensci

bikedata:Download and Aggregate Data from Public Hire Bicycle Systems

Download and aggregate data from all public hire bicycle systems which provide open data, currently including 'Santander' Cycles in London, U.K.; from the U.S.A., 'Ford GoBike' in San Francisco CA, 'citibike' in New York City NY, 'Divvy' in Chicago IL, 'Capital Bikeshare' in Washington DC, 'Hubway' in Boston MA, 'Metro' in Los Angeles CA, 'Indego' in Philadelphia PA, and 'Nice Ride' in Minnesota; 'Bixi' from Montreal, Canada; and 'mibici' from Guadalajara, Mexico.

Maintained by Mark Padgham. Last updated 1 year ago.

bicycle-hire-systems bike-hire-systems bike-hire bicycle-hire database bike-data peer-reviewed cpp

28.9 match 83 stars 5.97 score 28 scripts

ropensci

restez:Create and Query a Local Copy of 'GenBank' in R

Download large sections of 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and generate a local SQL-based database. A user can then query this database using 'restez' functions or through 'rentrez' <https://CRAN.R-project.org/package=rentrez> wrappers.

Maintained by Joel H. Nitta. Last updated 9 days ago.

dna entrez genbank sequence

24.4 match 26 stars 7.01 score 175 scripts 1 dependents

thewileylab

ReviewR:A Light-Weight, Portable Tool for Reviewing Individual Patient Records

A portable Shiny tool to explore patient-level electronic health record data and perform chart review in a single integrated framework. This tool supports browsing clinical data in many different formats including multiple versions of the 'OMOP' common data model as well as the 'MIMIC-III' data model. In addition, chart review information is captured and stored securely via the Shiny interface in a 'REDCap' (Research Electronic Data Capture) project using the 'REDCap' API. See the 'ReviewR' website for additional information, documentation, and examples.

Maintained by David Mayer. Last updated 2 years ago.

27.0 match 24 stars 6.33 score 6 scripts

bioc

ChemmineR:Cheminformatics Toolkit for R

ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics cpp

18.1 match 14 stars 9.42 score 253 scripts 12 dependents

rfhb

ctrdata:Retrieve and Analyze Clinical Trials in Public Registers

A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', <https://www.clinicaltrialsregister.eu/>), 'ClinicalTrials.gov' (<https://clinicaltrials.gov/> and also translating queries from the retired classic interface), the 'ISRCTN' (<http://www.isrctn.com/>) and the 'European Union Clinical Trials Information System' ('CTIS', <https://euclinicaltrials.eu/>). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for meta-analysis and trend-analysis of the design and conduct as well as of the results of clinical trials across registers.

Maintained by Ralf Herold. Last updated 4 hours ago.

clinical-data clinical-research clinical-studies clinical-trials ctgov database duckdb mongodb nodbi postgresql register sqlite studies trial

21.5 match 45 stars 7.92 score 32 scripts

poissonconsulting

subfoldr2:Save and Load R Objects

Facilitates saving and loading R objects, data frames, tables, plots, text blocks and numbers to subfolders.

Maintained by Joe Thorley. Last updated 13 days ago.

45.5 match 2 stars 3.70 score 5 scripts

bioc

bioassayR:Cross-target analysis of small molecule bioactivity

bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.

Maintained by Thomas Girke. Last updated 5 months ago.

immunooncology microtitreplateassay cellbasedassays visualization infrastructure dataimport bioinformatics proteomics metabolomics

24.9 match 5 stars 6.70 score 46 scripts

ropensci

elastic:General Purpose Interface to 'Elasticsearch'

Connect to 'Elasticsearch', a 'NoSQL' database built on the 'Java' Virtual Machine. Interacts with the 'Elasticsearch' 'HTTP' API (<https://www.elastic.co/elasticsearch/>), including functions for setting connection details to 'Elasticsearch' instances, loading bulk data, searching for documents with both 'HTTP' query variables and 'JSON' based body requests. In addition, 'elastic' provides functions for interacting with APIs for 'indices', documents, nodes, clusters, an interface to the cat API, and more.

Maintained by Scott Chamberlain. Last updated 2 years ago.

database elasticsearch http api search nosql java json documents data-science database-wrapper etl

18.0 match 247 stars 8.98 score 151 scripts 1 dependents

ropensci

dittodb:A Test Environment for Database Requests

Testing and documenting code that communicates with remote databases can be painful. Although the interaction with R is usually relatively simple (e.g. data(frames) passed to and from a database), because they rely on a separate service and the data there, testing them can be difficult to set up, unsustainable in a continuous integration environment, or impossible without replicating an entire production cluster. This package addresses that by allowing you to make recordings from your database interactions and then play them back while testing (or in other contexts) all without needing to spin up or have access to the database your code would typically connect to.

Maintained by Jonathan Keane. Last updated 11 months ago.

19.7 match 82 stars 8.04 score 49 scripts

ohdsi

PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model

A user-friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.

Maintained by Egill Fridgeirsson. Last updated 8 days ago.

hades openjdk

14.2 match 190 stars 10.85 score 297 scripts

chiliubio

microeco:Microbial Community Ecology Data Analysis

A series of statistical and plotting approaches in microbial community ecology based on the R6 class. The classes are designed for data preprocessing, taxa abundance plotting, alpha diversity analysis, beta diversity analysis, differential abundance test, null model analysis, network analysis, machine learning, environmental data analysis and functional analysis.

Maintained by Chi Liu. Last updated 4 days ago.

14.9 match 219 stars 10.11 score 211 scripts 3 dependents

skranz

dbmisc:Tools for working with SQLite in R

Tools for working with SQLite in R, in particular support for simple YAML schemas.

Maintained by Sebastian Kranz. Last updated 2 years ago.

database schema sql sqlite

34.1 match 27 stars 4.41 score 16 scripts 4 dependents

ropensci

openalexR:Getting Bibliographic Records from 'OpenAlex' Database Using 'DSL' API

A set of tools to extract bibliographic content from the 'OpenAlex' database using its API <https://docs.openalex.org>.

Maintained by Massimo Aria. Last updated 26 days ago.

bibliographic-data bibliographic-database bibliometrics bibliometrix science-mapping

14.5 match 107 stars 10.24 score 194 scripts 5 dependents

inbo

forrescalc:Calculation of Aggregated Values on Dendrometry, Regeneration and Vegetation of Forests, Starting from Individual Tree Measures from Fieldmap

A collection of functions to load and aggregate measurements related to dendrometry, rejuvenation and vegetation, and to access plot level results from Flemish forest reserves in data package forresdat.

Maintained by Els Lommelen. Last updated 6 months ago.

38.2 match 3.79 score 123 scripts

bioc

methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results

methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.

Maintained by Altuna Akalin. Last updated 15 days ago.

dnamethylation sequencing methylseq genome-biology methylation statistical-analysis visualization curl bzip2 xz-utils zlib cpp

11.9 match 220 stars 11.80 score 578 scripts 3 dependents

ropensci

rebird:R Client for the eBird Database of Bird Observations

A programmatic client for the eBird database (<https://ebird.org/home>), including functions for searching for bird observations by geographic location (latitude, longitude), eBird hotspots, location identifiers, by notable sightings, by region, and by taxonomic name.

Maintained by Sebastian Pardo. Last updated 1 month ago.

birds birding ebird database data biology observations sightings ornithology ebird-api ebird-webservices spocc

13.4 match 90 stars 10.43 score 73 scripts 6 dependents

tidyverse

blob:A Simple S3 Class for Representing Vectors of Binary Data ('BLOBS')

R's raw vector is useful for storing a single binary object. What if you want to put a vector of them in a data frame? The 'blob' package provides the blob object, a list of raw vectors, suitable for use as a column in a data frame.

Maintained by Kirill Müller. Last updated 3 months ago.

database

10.0 match 45 stars 13.82 score 157 scripts 1.4k dependents

malaria-atlas-project

malariaAtlas:An R Interface to Open-Access Malaria Data, Hosted by the 'Malaria Atlas Project'

A suite of tools to allow you to download all publicly available parasite rate survey points, mosquito occurrence points and raster surfaces from the 'Malaria Atlas Project' <https://malariaatlas.org/> servers as well as utility functions for plotting the downloaded data.

Maintained by Mauricio van den Berg. Last updated 8 months ago.

database malaria opendata raster

15.1 match 44 stars 9.10 score 118 scripts 3 dependents

bioc

TFBSTools:Software Package for Transcription Factor Binding Site (TFBS) Analysis

TFBSTools is a package for the analysis and manipulation of transcription factor binding sites. It includes matrix conversion between Position Frequency Matrix (PFM), Position Weight Matrix (PWM) and Information Content Matrix (ICM). It can also scan putative TFBS from sequence/alignment, query the JASPAR database and provides a wrapper of de novo motif discovery software.

Maintained by Ge Tan. Last updated 3 days ago.

motifannotation generegulation motifdiscovery transcription alignment

10.8 match 28 stars 12.36 score 1.1k scripts 18 dependents

r-dbi

bigrquery:An Interface to Google's 'BigQuery' 'API'

Easily talk to Google's 'BigQuery' database from R.

Maintained by Hadley Wickham. Last updated 19 days ago.

bigquery database cpp

10.6 match 520 stars 12.55 score 1.8k scripts 4 dependents

ropensci

neotoma:Access to the Neotoma Paleoecological Database Through R

NOTE: This package is deprecated. Please use the neotoma2 package described at https://github.com/NeotomaDB/neotoma2. Access paleoecological datasets from the Neotoma Paleoecological Database using the published API (<http://wnapi.neotomadb.org/>), only containing datasets uploaded prior to June 2020. The functions in this package access various pre-built API functions and attempt to return the results from Neotoma in a usable format for researchers and the public.

Maintained by Simon J. Goring. Last updated 2 years ago.

neotoma neotoma-apis neotoma-database nsf paleoecology

26.3 match 30 stars 5.04 score 145 scripts

bioc

multiMiR:Integration of multiple microRNA-target databases with their disease and drug associations

A collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).

Maintained by Spencer Mahaffey. Last updated 5 months ago.

mirnadata homo_sapiens_data mus_musculus_data rattus_norvegicus_data organismdata microrna-sequence sql

15.6 match 20 stars 8.45 score 141 scripts

bioc

RcisTarget:RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions

RcisTarget identifies transcription factor binding motifs (TFBS) over-represented on a gene list. In a first step, RcisTarget selects DNA motifs that are significantly over-represented in the surroundings of the transcription start site (TSS) of the genes in the gene-set. This is achieved by using a database that contains genome-wide cross-species rankings for each motif. The motifs are then annotated to TFs, and those that have a high Normalized Enrichment Score (NES) are retained. Finally, for each motif and gene-set, RcisTarget predicts the candidate target genes (i.e. genes in the gene-set that are ranked above the leading edge).

Maintained by Gert Hulselmans. Last updated 5 months ago.

generegulation motifannotation transcriptomics transcription genesetenrichment genetarget

13.8 match 37 stars 9.47 score 191 scripts

jiang-junyao

CACIMAR:cross-species analysis of cell identities, markers and regulations

A toolkit to perform cross-species analysis based on scRNA-seq data. CACIMAR contains 5 main features: (1) identify markers in each cluster; (2) cell type annotation; (3) identify conserved markers; (4) identify conserved cell types; (5) identify conserved modules of regulatory networks.

Maintained by Junyao Jiang. Last updated 3 months ago.

cross-species-analysis scrna-seq

24.5 match 12 stars 5.26 score 6 scripts

dwbapst

paleotree:Paleontological and Phylogenetic Analyses of Evolution

Provides tools for transforming, a posteriori time-scaling, and modifying phylogenies containing extinct (i.e. fossil) lineages. In particular, most users are interested in the functions timePaleoPhy, bin_timePaleoPhy, cal3TimePaleoPhy and bin_cal3TimePaleoPhy, which date cladograms of fossil taxa using stratigraphic data. This package also contains a large number of likelihood functions for estimating sampling and diversification rates from different types of data available from the fossil record (e.g. range data, occurrence data, etc). paleotree users can also simulate diversification and sampling in the fossil record using the function simFossilRecord, which is a detailed simulator for branching birth-death-sampling processes composed of discrete taxonomic units arranged in ancestor-descendant relationships. Users can use simFossilRecord to simulate diversification in incompletely sampled fossil records, under various models of morphological differentiation (i.e. the various patterns by which morphotaxa originate from one another), and with time-dependent, longevity-dependent and/or diversity-dependent rates of diversification, extinction and sampling. Additional functions allow users to translate simulated ancestor-descendant data from simFossilRecord into standard time-scaled phylogenies or unscaled cladograms that reflect the relationships among taxon units.

Maintained by David W. Bapst. Last updated 8 months ago.

17.0 match 21 stars 7.53 score 216 scripts 2 dependents

tidymodels

modeldb:Fits Models Inside the Database

Uses 'dplyr' and 'tidyeval' to fit statistical models inside the database. It currently supports KMeans and linear regression models.

Maintained by Max Kuhn. Last updated 1 year ago.

database dbplyr dplyr ggplot2 modeling rlang sql tidyeval visualization

16.8 match 79 stars 7.59 score 62 scripts

tomoakin

RPostgreSQL:R Interface to the 'PostgreSQL' Database System

Database interface and 'PostgreSQL' driver for 'R'. This package provides a Database Interface 'DBI' compliant driver for 'R' to access 'PostgreSQL' database systems. In order to build and install this package from source, 'PostgreSQL' itself must be present on your system to provide 'PostgreSQL' functionality via its libraries and header files. These files are provided as 'postgresql-devel' package under some Linux distributions. On 'macOS' and 'Microsoft Windows' systems the attached 'libpq' library source will be used.

Maintained by Tomoaki Nishiyama. Last updated 1 year ago.

postgresql

11.1 match 65 stars 11.52 score 4.5k scripts 19 dependents

cidree

rpostgis:R Interface to a 'PostGIS' Database

Provides an interface between R and 'PostGIS'-enabled 'PostgreSQL' databases to transparently transfer spatial data. Both vector (points, lines, polygons) and raster data are supported in read and write modes. Also provides convenience functions to execute common procedures in 'PostgreSQL/PostGIS'.

Maintained by Adrian Cidre Gonzalez. Last updated 1 month ago.

postgis postgresql-database

16.5 match 79 stars 7.67 score 244 scripts 2 dependents

pascalcrepey

HospitalNetwork:Building Networks of Hospitals Through Patients Transfers

Set of tools to help interested researchers to build hospital networks from data on hospitalized patients transferred between hospitals. Methods provided have been used in Donker T, Wallinga J, Grundmann H. (2010) <doi:10.1371/journal.pcbi.1000715>, and Nekkab N, Crépey P, Astagneau P, Opatowski L, Temime L. (2020) <doi:10.1038/s41598-020-71212-6>.

Maintained by Pascal Crépey. Last updated 3 months ago.

hospital-networks patient-database patients-transfers

24.6 match 7 stars 5.10 score 12 scripts

klausvigo

kknn:Weighted k-Nearest Neighbors

Weighted k-Nearest Neighbors for Classification, Regression and Clustering.

Maintained by Klaus Schliep. Last updated 4 years ago.

nearest-neighbor

11.2 match 23 stars 11.08 score 4.6k scripts 41 dependents

bioc

DECIPHER:Tools for curating, analyzing, and manipulating biological sequences

A toolset for deciphering and managing biological sequences.

Maintained by Erik Wright. Last updated 5 days ago.

clustering genetics sequencing dataimport visualization microarray qualitycontrol qpcr alignment wholegenome microbiome immunooncology geneprediction openmp

14.7 match 8.40 score 1.1k scripts 14 dependents

ouhscbbmc

REDCapR:Interaction Between R and REDCap

Encapsulates functions to streamline calls from R to the REDCap API. REDCap (Research Electronic Data CAPture) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The Application Programming Interface (API) offers an avenue to access and modify data programmatically, improving the capacity for literate and reproducible programming.

Maintained by Will Beasley. Last updated 2 months ago.

redcap redcap-api

9.9 match 118 stars 12.36 score 438 scripts 6 dependents

r-forge

survey:Analysis of Complex Survey Samples

Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.

Maintained by Thomas Lumley. Last updated 6 months ago.

cpp

8.8 match 1 stars 13.94 score 13k scripts 232 dependents

traitecoevo

austraits:Helpful functions to access the AusTraits database and wrangle data from other traits.build databases

`austraits` allows users to **access, explore and wrangle data** from traits.build relational databases. It is also an R interface to AusTraits, the Australian plant trait database. This package contains functions for joining data from various tables, filtering to specific records, combining multiple databases and visualising the distribution of the data. We expect this package will assist users in working with `traits.build` databases.

Maintained by Fonti Kar. Last updated 2 months ago.

australia database plants traits

20.5 match 22 stars 5.93 score 43 scripts 1 dependents

ropensci

dwctaxon:Edit and Validate Darwin Core Taxon Data

Edit and validate taxonomic data in compliance with Darwin Core standards (Darwin Core 'Taxon' class <https://dwc.tdwg.org/terms/#taxon>).

Maintained by Joel H. Nitta. Last updated 8 months ago.

database

19.3 match 6 stars 6.13 score 28 scripts

rdpeng

filehash:Simple Key-Value Database

Implements a simple key-value style database where character string keys are associated with data values that are stored on the disk. A simple interface is provided for inserting, retrieving, and deleting data from the database. Utilities are provided that allow 'filehash' databases to be treated much as environments and lists are already used in R. These utilities are provided to encourage interactive and exploratory analysis on large datasets. Three different file formats for representing the database are currently available and new formats can easily be incorporated by third parties for use in the 'filehash' framework.

Maintained by Roger D. Peng. Last updated 2 years ago.

14.4 match 24 stars 8.16 score 78 scripts 11 dependents

alexpate30

rcprd:Extraction and Management of Clinical Practice Research Datalink Data

Simplify the process of extracting and processing Clinical Practice Research Datalink (CPRD) data in order to build datasets ready for statistical analysis. This process is difficult in 'R', as the raw data is very large and cannot be read into the R workspace. 'rcprd' utilises 'RSQLite' to create 'SQLite' databases which are stored on the hard disk. These are then queried to extract the required information for a cohort of interest, and create datasets ready for statistical analysis. The process closely follows that of the 'rEHR' package, see Springate et al., (2017) <doi:10.1371/journal.pone.0171784>.

Maintained by Alexander Pate. Last updated 18 days ago.

21.4 match 2 stars 5.48 score 5 scripts

mrcieu

TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database

A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.

Maintained by Gibran Hemani. Last updated 10 days ago.

10.4 match 467 stars 11.23 score 1.7k scripts 1 dependents

ropenspain

mapSpain:Administrative Boundaries of Spain

Administrative Boundaries of Spain at several levels (Autonomous Communities, Provinces, Municipalities) based on the 'GISCO' 'Eurostat' database <https://ec.europa.eu/eurostat/web/gisco> and 'CartoBase SIANE' from 'Instituto Geografico Nacional' <https://www.ign.es/>. It also provides a 'leaflet' plugin and the ability to download and process static tiles.

Maintained by Diego Hernangómez. Last updated 1 month ago.

ropenspain tiles maps spatial municipalities spain gisco provinces ign administrative-boundaries ccaa static-tiles ggplot2 gis

13.1 match 42 stars 8.83 score 244 scripts 2 dependents

ropensci

taxizedb:Tools for Working with 'Taxonomic' Databases

Tools for working with 'taxonomic' databases, including utilities for downloading databases, loading them into various 'SQL' databases, cleaning up files, and providing a 'SQL' connection that can be used to do 'SQL' queries directly or used in 'dplyr'.

Maintained by Tamás Stirling. Last updated 1 month ago.

itis taxize taxonomic-databases taxonomy

19.7 match 31 stars 5.86 score 86 scripts 1 dependents

koenniem

mpathsenser:Process and Analyse Data from m-Path Sense

Overcomes one of the major challenges in mobile (passive) sensing, namely being able to pre-process the raw data that comes from a mobile sensing app, specifically 'm-Path Sense' <https://m-path.io>. The main task of 'mpathsenser' is therefore to read 'm-Path Sense' JSON files into a database and provide several convenience functions to aid in data processing.

Maintained by Koen Niemeijer. Last updated 19 days ago.

mobile-sensing

25.7 match 1 stars 4.48 score 6 scripts

prestodb

RPresto:DBI Connector to Presto

Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: <https://prestodb.io/>.

Maintained by Jarod G.R. Meng. Last updated 30 days ago.

11.5 match 131 stars 9.86 score 25 scripts 4 dependents

cran

RODBC:ODBC Database Access

An ODBC database interface.

Maintained by Brian Ripley. Last updated 3 months ago.

unixodbc

15.2 match 10 stars 7.41 score 38 dependents

ropensci

Rpolyhedra:Polyhedra Database

A polyhedra database scraped from various sources as R6 objects, with 'rgl' visualizing capabilities.

Maintained by Alejandro Baranek. Last updated 5 months ago.

geometry polyhedra-database rgl

18.1 match 12 stars 6.21 score 30 scripts

openanalytics

editbl:'DT' Extension for CRUD (Create, Read, Update, Delete) Applications in 'shiny'

The core of this package is a function eDT() which enhances DT::datatable() such that it can be used to interactively modify data in 'shiny'. By the use of generic 'dplyr' methods it supports many types of data storage, with relational databases ('dbplyr') being the main use case.

Maintained by Jasper Schelfhout. Last updated 1 month ago.

17.1 match 23 stars 6.52 score 12 scripts

grunwaldlab

metacoder:Tools for Parsing, Manipulating, and Graphing Taxonomic Abundance Data

Reads, plots, and manipulates large taxonomic data sets, like those generated from modern high-throughput sequencing, such as metabarcoding (i.e. amplification metagenomics, 16S metagenomics, etc). It provides a tree-based visualization called "heat trees" used to depict statistics for every taxon in a taxonomy using color and size. It also provides various functions to do common tasks in microbiome bioinformatics on data in the 'taxmap' format defined by the 'taxa' package. The 'metacoder' package is described in the publication by Foster et al. (2017) <doi:10.1371/journal.pcbi.1005404>.

Maintained by Zachary Foster. Last updated 1 month ago.

community-diversity hierarchical metabarcoding pcr taxonomy trees cpp

11.5 match 140 stars 9.64 score 328 scripts

openml

OpenML:Open Machine Learning and Open Data Platform

We provide an R interface to 'OpenML.org' which is an online machine learning platform where researchers can access open data, download and upload data sets, share their machine learning tasks and experiments and organize them online to work and collaborate with other researchers. The R interface allows querying for data sets with specific properties, and allows the downloading and uploading of data sets, tasks, flows and runs. See <https://www.openml.org/guide/api> for more information.

Maintained by Giuseppe Casalicchio. Last updated 10 months ago.

arff benchmarking benchmarking-suite classification data-science database dataset datasets machine-learning machine-learning-algorithms open-data open-science opendata openml openscience regression reproducible-research statistics

10.0 match 97 stars 11.04 score 7.1k scripts

mikeasilva

simplegraphdb:A Simple Graph Database

This is a graph database in 'SQLite'. It is inspired by Denis Papathanasiou's Python simple-graph project on 'GitHub'.

Maintained by Michael Silva. Last updated 4 years ago.

graph sqlite sqlite-database

29.4 match 7 stars 3.75 score 16 scripts

equitable-equations

fqar:Floristic Quality Assessment Tools for R

Tools for downloading and analyzing floristic quality assessment data. See Freyman et al. (2015) <doi:10.1111/2041-210X.12491> for more information about floristic quality assessment and the associated database.

Maintained by Andrew Gard. Last updated 2 months ago.

18.6 match 5 stars 5.88 score 5 scripts

dieghernan

tidyterra:'tidyverse' Methods and 'ggplot2' Helpers for 'terra' Objects

Extension of the 'tidyverse' for 'SpatRaster' and 'SpatVector' objects of the 'terra' package. It also includes new 'geom_' functions that provide a convenient way of visualizing 'terra' objects with 'ggplot2'.

Maintained by Diego Hernangómez. Last updated 4 hours ago.

terra ggplot-extension r-spatial rspatial

8.0 match 191 stars 13.62 score 1.9k scripts 25 dependents

bioc

signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis

This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.

Maintained by Brendan Gongol. Last updated 5 months ago.

software geneexpression go kegg networkenrichment sequencing coverage differentialexpression cpp

15.1 match 17 stars 7.18 score 74 scripts 1 dependents

moosa-r

rbioapi:User-Friendly R Interface to Biologic Web Services' API

Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, in a way that insulates the user from the technicalities of using web service APIs and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; new databases and services will be added periodically. Feel free to suggest any databases or services you often use.

Maintained by Moosa Rezwani. Last updated 1 month ago.

api-client bioinformatics biology enrichment enrichment-analysis enrichr jaspar mieaa over-representation-analysis panther reactome string uniprot

14.1 match 20 stars 7.60 score 55 scripts

sumtxt

wiesbaden:Access Databases from the Federal Statistical Office of Germany

Retrieve and import data from different databases of the Federal Statistical Office of Germany (DESTATIS) using their SOAP XML web service <https://www-genesis.destatis.de/>.

Maintained by Moritz Marbach. Last updated 8 months ago.

api-client regionalstatistik

16.4 match 52 stars 6.55 score 17 scripts

sbgraves237

Ecdat:Data Sets for Econometrics

Data sets for econometrics, including political science.

Maintained by Spencer Graves. Last updated 4 months ago.

14.7 match 2 stars 7.25 score 740 scripts 3 dependents

rstudio

pointblank:Data Validation and Organization of Metadata for Local and Remote Tables

Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.

Maintained by Richard Iannone. Last updated 9 days ago.

data-assertions data-checker data-dictionaries data-frames data-inference data-management data-profiler data-quality data-validation data-verification database-tables easy-to-understand reporting-tool schema-validation testing-tools yaml-configuration

10.0 match 932 stars 10.59 score 284 scripts

divdyn

divDyn:Diversity Dynamics using Fossil Sampling Data

Functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple bin-framework. The plotting of time series and the occurrence data can be simplified by the functions incorporated in the package, as well as other calculations, such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.

Maintained by Adam T. Kocsis. Last updated 4 months ago.

diversity extinction fossil-data occurrences origination paleobiology cpp

16.1 match 11 stars 6.48 score 137 scripts

cost-fp1304-profound

ProfoundData:Downloading and Exploring Data from the PROFOUND Database

Provides an R interface for the PROFOUND database <doi:10.5880/PIK.2019.008>. The PROFOUND database contains a wide range of data to evaluate vegetation models and simulate climate impacts at the forest stand scale. It includes 9 forest sites across Europe and provides, for each of them, a site description as well as soil, climate, CO2, nitrogen deposition, tree-level, forest stand-level and remote sensing data. Moreover, time series of carbon fluxes, energy balances and soil water are available for a subset of 5 sites.

Maintained by Florian Hartig. Last updated 5 years ago.

18.7 match 9 stars 5.58 score 14 scripts

usepa

tcpl:ToxCast Data Analysis Pipeline

The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.

Maintained by Jason Brown. Last updated 2 days ago.

ccte comptox ord

10.9 match 36 stars 9.41 score 90 scripts

tanaylab

misha:Toolkit for Analysis of Genomic Data

A toolkit for analysis of genomic data. The 'misha' package implements an efficient data structure for storing genomic data, and provides a set of functions for data extraction, manipulation and analysis. Some of the 2D genome algorithms were described in Yaffe and Tanay (2011) <doi:10.1038/ng.947>.

Maintained by Aviezer Lifshitz. Last updated 5 days ago.

genomic-data-analysis cpp

17.6 match 4 stars 5.86 score

ropensci

taxize:Taxonomic Information from Around the Web

Interacts with a suite of web application programming interfaces (API) for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more. Some of the services supported include 'NCBI E-utilities' (<https://www.ncbi.nlm.nih.gov/books/NBK25501/>), 'Encyclopedia of Life' (<https://eol.org/docs/what-is-eol/data-services>), 'Global Biodiversity Information Facility' (<https://techdocs.gbif.org/en/openapi/>), and many more. Links to the API documentation for other supported services are available in the documentation for their respective functions in this package.

Maintained by Zachary Foster. Last updated 11 days ago.

taxonomy biology nomenclature json api web api-client identifiers species names api-wrapper biodiversity darwincore data taxize

7.5 match 274 stars 13.63 score 1.6k scripts 23 dependents

phuse-org

sendigR:Enable Cross-Study Analysis of 'CDISC' 'SEND' Datasets

A system that enables cross-study analysis by extracting and filtering study data for control animals from a 'CDISC' 'SEND' study repository. These data types are supported: body weights, laboratory test results and microscopic findings. These database types are supported: 'SQLite' and 'Oracle'.

Maintained by Wenxian Wang. Last updated 9 days ago.

16.2 match 12 stars 6.28 score 6 scripts

danchaltiel

EDCimport:Import Data from EDC Software

A convenient toolbox to import data exported from Electronic Data Capture (EDC) software 'TrialMaster'.

Maintained by Dan Chaltiel. Last updated 5 days ago.

16.9 match 6.01 score 12 scripts

bioc

AlpsNMR:Automated spectraL Processing System for NMR

Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.

Maintained by Sergio Oller Moreno. Last updated 5 months ago.

software preprocessing visualization classification cheminformatics metabolomics dataimport

13.3 match 15 stars 7.59 score 12 scripts 1 dependents

poissonconsulting

dbflobr:Read and Write Files to SQLite Databases

Reads and writes files to SQLite databases <https://www.sqlite.org/index.html> as flobs (a flob is a blob that preserves the file extension).

Maintained by Evan Amies-Galonski. Last updated 2 months ago.

blob databases flob sqlite

17.3 match 6 stars 5.86 score 5 scripts

ssi-dk

SCDB:Easily Access and Maintain Time-Based Versioned Data (Slowly-Changing-Dimension)

A collection of functions that enable easy access and updating of a database of data over time. More specifically, the package facilitates type-2 history for data warehouses and provides a number of quality-of-life improvements for working on SQL databases with R. For reference see Ralph Kimball and Margy Ross (2013, ISBN 9781118530801).

Maintained by Rasmus Skytte Randløv. Last updated 17 days ago.

13.7 match 6 stars 7.38 score 11 scripts 1 dependents

r-lib

tzdb:Time Zone Database Information

Provides an up-to-date copy of the Internet Assigned Numbers Authority (IANA) Time Zone Database. It is updated periodically to reflect changes made by political bodies to time zone boundaries, UTC offsets, and daylight saving time rules. Additionally, this package provides a C++ interface for working with the 'date' library. 'date' provides comprehensive support for working with dates and date-times, which this package exposes to make it easier for other R packages to utilize. Headers are provided for calendar specific calculations, along with a limited interface for time zone manipulations.

Maintained by Davis Vaughan. Last updated 1 day ago.

cpp

9.2 match 7 stars 10.90 score 38 scripts 2.4k dependents

ropensci

taxadb:A High-Performance Local Taxonomic Database Interface

Creates a local database of many commonly used taxonomic authorities and provides functions that can quickly query this data.

Maintained by Carl Boettiger. Last updated 11 months ago.

13.0 match 43 stars 7.68 score 53 scripts 1 dependents

gergness

srvyr:'dplyr'-Like Syntax for Summary Statistics of Survey Data

Use piping, verbs like 'group_by' and 'summarize', and other 'dplyr' inspired syntactic style when calculating summary statistics on survey data using functions from the 'survey' package.

Maintained by Greg Freedman Ellis. Last updated 1 month ago.

survey

7.1 match 215 stars 13.88 score 1.8k scripts 15 dependents

bioc

TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data

The aim of TCGAbiolinks is : i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) to easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.

Maintained by Tiago Chedraoui Silva. Last updated 26 days ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network sequencing survival software bioc bioconductor gdc integrative-analysis tcga tcga-data tcgabiolinks

6.8 match 305 stars 14.45 score 1.6k scripts 6 dependents

quandl

Quandl:API Wrapper for Quandl.com

Functions for interacting directly with the Quandl API to offer data in a number of formats usable in R, downloading a zip with all data from a Quandl database, and the ability to search. This R package uses the Quandl API. For more information go to <https://docs.quandl.com>. For more help on the package itself go to <https://www.quandl.com/tools/r>.

Maintained by Dave Dotson. Last updated 3 years ago.

api-client quandl-api

10.1 match 137 stars 9.74 score 980 scripts 3 dependents

cran

ibmdbR:IBM in-Database Analytics for R

Functionality required to efficiently use R with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows full use of parallel processing in the underlying database. For executing R functions in parallel in a multi-node environment, the idaTApply() function requires the 'SparkR' package (<https://spark.apache.org/docs/latest/sparkr.html>). The optional 'ggplot2' package is needed for the plot.idaLm() function only.

Maintained by Shaikh Quader. Last updated 1 year ago.

25.5 match 2 stars 3.82 score 66 scripts

ralmond

mongo:Higher level interface to Mongo database

This is a wrapper for the jsonlite and mongolite packages which offers both an R6 object for managing the connection as well as some mechanisms for saving and restoring S4 objects to a Mongo database.

Maintained by Russell Almond. Last updated 10 months ago.

23.5 match 4.13 score 3 dependents

andriyprotsak5

UAHDataScienceSC:Learn Supervised Classification Methods Through Examples and Code

Supervised classification methods, which (if asked) can provide step-by-step explanations of the algorithms used, as described in PK Josephine et al. (2021) <doi:10.59176/kjcs.v1i1.1259>; and datasets to test them on, which highlight the strengths and weaknesses of each technique.

Maintained by Andriy Protsak Protsak. Last updated 1 month ago.

32.0 match 3.00 score

pik-piam

mrremind:MadRat REMIND Input Data Package

The mrremind package contains data preprocessing for the REMIND model.

Maintained by Lavinia Baumstark. Last updated 3 days ago.

15.2 match 4 stars 6.25 score 15 scripts 1 dependents

epiverse-trace

epiparameter:Classes and Helper Functions for Working with Epidemiological Parameters

Classes and helper functions for loading, extracting, converting, manipulating, plotting and aggregating epidemiological parameters for infectious diseases. Epidemiological parameters extracted from the literature are loaded from the 'epiparameterDB' R package.

Maintained by Joshua W. Lambert. Last updated 2 months ago.

data-access data-package epidemiology epiverse probability-distribution

9.6 match 33 stars 9.84 score 102 scripts 1 dependents

eikeluedeling

chillR:Statistical Methods for Phenology Analysis in Temperate Fruit Trees

The phenology of plants (i.e. the timing of their annual life phases) depends on climatic cues. For temperate trees and many other plants, spring phases, such as leaf emergence and flowering, have been found to result from the effects of both cool (chilling) conditions and heat. Fruit tree scientists (pomologists) have developed some metrics to quantify chilling and heat (e.g. see Luedeling (2012) <doi:10.1016/j.scienta.2012.07.011>). 'chillR' contains functions for processing temperature records into chilling (Chilling Hours, Utah Chill Units and Chill Portions) and heat units (Growing Degree Hours). Regarding chilling metrics, Chill Portions are often considered the most promising, but they are difficult to calculate. This package makes it easy. 'chillR' also contains procedures for conducting a PLS analysis relating phenological dates (e.g. bloom dates) to either mean temperatures or mean chill and heat accumulation rates, based on long-term weather and phenology records (Luedeling and Gassner (2012) <doi:10.1016/j.agrformet.2011.10.020>). As of version 0.65, it also includes functions for generating weather scenarios with a weather generator, for conducting climate change analyses for temperature-based climatic metrics and for plotting results from such analyses. Since version 0.70, 'chillR' contains a function for interpolating hourly temperature records.

Maintained by Eike Luedeling. Last updated 4 months ago.

cpp

15.4 match 3 stars 6.13 score 346 scripts 1 dependents

statismike

shiny.reglog:Optional Login and Registration Module System for ShinyApps

The RegLog system provides a set of shiny modules to handle the registration procedure for your users, alongside login, credential editing and password reset functionality. It provides support for popular SQL databases and, optionally, a googlesheet-based database for easy setup. For email sending it provides support for 'emayili' and 'gmailr' backends. The architecture makes customizing usability pretty straightforward. The authentication system created with shiny.reglog is designed to be optional: users don't need to be logged in to access your application, but when logged in, the user data can be used to read from and write to relational databases.

Maintained by Michal Kosinski. Last updated 3 years ago.

googlesheet register-ui shiny-applications sqlite

14.5 match 14 stars 6.45 score 20 scripts

ohdsi

FeatureExtraction:Generating Features for a Cohort

An R interface for generating features for a cohort using data in the Common Data Model. Features can be constructed using default or custom-made feature definitions. Furthermore, it's possible to aggregate features and get the summary statistics.

Maintained by Ger Inberg. Last updated 5 months ago.

hades openjdk

9.1 match 62 stars 10.30 score 209 scripts 1 dependents

openjusticeok

ojodb:Analyze Data from the Open Justice Oklahoma Database

{ojodb} provides convenient functions to query court data from the Open Justice Oklahoma database.

Maintained by Brancen Gregory. Last updated 8 months ago.

courts justice oklahoma open-data

19.8 match 8 stars 4.74 score 69 scripts

psychbruce

ChineseNames:Chinese Name Database 1930-2008

A database of Chinese surnames and Chinese given names (1930-2008). This database contains nationwide frequency statistics of 1,806 Chinese surnames and 2,614 Chinese characters used in given names, covering about 1.2 billion Han Chinese population (96.8% of the Han Chinese household-registered population born from 1930 to 2008 and still alive in 2008). This package also contains a function for computing multiple features of Chinese surnames and Chinese given names for scientific research (e.g., name uniqueness, name gender, name valence, and name warmth/competence).

Maintained by Han-Wu-Shuang Bao. Last updated 1 year ago.

big-data chinese chinese-name chinese-names database name names

19.1 match 147 stars 4.87 score 6 scripts

bioc

customCMPdb:Customize and Query Compound Annotation Database

This package serves as a query interface for important community collections of small molecules, while also allowing users to include custom compound collections.

Maintained by Yuzhu Duan. Last updated 5 months ago.

software cheminformatics annotationhubsoftware

19.3 match 2 stars 4.78 score 4 scripts

eurostat

restatapi:Search and Retrieve Data from Eurostat Database

Eurostat is the statistical office of the European Union and provides high quality statistics for Europe. A large set of the data is disseminated through the Eurostat database (<https://ec.europa.eu/eurostat/web/main/data/database>). The tools use the REST API with the Statistical Data and Metadata eXchange (SDMX) Web Services (<https://wikis.ec.europa.eu/pages/viewpage.action?pageId=44165555>) to search and download data from the Eurostat database using the SDMX standard.

Maintained by Mátyás Mészáros. Last updated 2 months ago.

database eurostat filter open-data retrieve-data sdmx

13.9 match 25 stars 6.61 score 146 scripts 1 dependents

frbcesab

forcis:An R Client to Access the FORCIS Database

Provides an interface to the FORCIS database (<https://zenodo.org/doi/10.5281/zenodo.7390791>) on global foraminifera distribution. This package allows downloading and handling FORCIS data. It is part of the FRB-CESAB working group FORCIS. <https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/forcis/>.

Maintained by Nicolas Casajus. Last updated 11 days ago.

15.9 match 4 stars 5.76 score 5 scripts

kbhoehn

dowser:B Cell Receptor Phylogenetics Toolkit

Provides a set of functions for inferring, visualizing, and analyzing B cell phylogenetic trees. Provides methods to 1) reconstruct unmutated ancestral sequences, 2) build B cell phylogenetic trees using multiple methods, 3) visualize trees with metadata at the tips, 4) reconstruct intermediate sequences, 5) detect biased ancestor-descendant relationships among metadata types. Workflow examples are available at the documentation site (see URL). Citations: Hoehn et al (2022) <doi:10.1371/journal.pcbi.1009885>, Hoehn et al (2021) <doi:10.1101/2021.01.06.425648>.

Maintained by Kenneth Hoehn. Last updated 2 months ago.

13.4 match 6.81 score 84 scripts

welch-lab

cytosignal:What the Package Does (One Line, Title Case)

What the package does (one paragraph).

Maintained by Jialin Liu. Last updated 5 days ago.

openblas cpp

15.3 match 16 stars 5.95 score 6 scripts

wactbprot

R4CouchDB:A R Convenience Layer for CouchDB 2.0

Provides a collection of functions for basic database and document management operations such as add (cdbAddDoc()), get (cdbGetDoc()), list access (cdbGetList()) or delete (cdbDeleteDoc()). Every cdbFunction() gets and returns a list() containing the connection setup. Such a list (named 'cdb' in the documentation) can be generated by cdb <- cdbIni(). Moreover, cdb provides some additional functions and functionality, e.g. cdb$baseUrl() or cdb$getDocRev().

Maintained by Thomas Bock. Last updated 8 years ago.

couchdb database

20.2 match 31 stars 4.45 score 18 scripts

gschofl

reutils:Talk to the NCBI EUtils

An interface to NCBI databases such as PubMed, GenBank, or GEO powered by the Entrez Programming Utilities (EUtils). The nine EUtils provide programmatic access to the NCBI Entrez query and database system for searching and retrieving biological data.

Maintained by Gerhard Schöfl. Last updated 4 years ago.

12.9 match 22 stars 6.95 score 135 scripts 1 dependents

ropensci

rcites:R Interface to the Species+ Database

A programmatic interface to the Species+ <https://speciesplus.net/> database via the Species+/CITES Checklist API <https://api.speciesplus.net/>.

Maintained by Kevin Cazelles. Last updated 2 years ago.

api-client cites database endangered-species trade

13.7 match 14 stars 6.52 score 26 scripts

jokergoo

pkgndep:Analyze Dependency Heaviness of R Packages

A new metric named 'dependency heaviness' is proposed that measures the number of additional dependency packages that a parent package brings to its child package and are unique to the dependency packages imported by all other parents. The dependency heaviness analysis is visualized by a customized heatmap. The package is described in <doi:10.1093/bioinformatics/btac449>. We have also performed the dependency heaviness analysis on the CRAN/Bioconductor package ecosystem and the results are implemented as a web-based database which provides comprehensive tools for querying dependencies of individual R packages. The systematic analysis on the CRAN/Bioconductor ecosystem is described in <doi:10.1016/j.jss.2023.111610>. From 'pkgndep' version 2.0.0, the heaviness database includes snapshots of the CRAN/Bioconductor ecosystems for many old R versions.

Maintained by Zuguang Gu. Last updated 13 days ago.

12.9 match 47 stars 6.75 score 30 scripts

mrcieu

ieugwasr:Interface to the 'OpenGWAS' Database API

Interface to the 'OpenGWAS' database API <https://api.opengwas.io/api/>. Includes a wrapper to make generic calls to the API, plus convenience functions for specific queries.

Maintained by Gibran Hemani. Last updated 2 days ago.

8.1 match 89 stars 10.71 score 404 scripts 6 dependents

ropensci

ckanr:Client for the Comprehensive Knowledge Archive Network ('CKAN') API

Client for 'CKAN' API (<https://ckan.org/>). Includes interface to 'CKAN' 'APIs' for search, list, show for packages, organizations, and resources. In addition, provides an interface to the 'datastore' API.

Maintained by Francisco Alves. Last updated 2 years ago.

database open-data ckan api data dataset api-wrapper ckan-api

10.0 match 100 stars 8.67 score 448 scripts 4 dependents

comiseng

LearnSL:Learn Supervised Classification Methods Through Examples and Code

Supervised classification methods, which (if asked) can provide step-by-step explanations of the algorithms used, as described in PK Josephine et al. (2021) <doi:10.59176/kjcs.v1i1.1259>; and datasets to test them on, which highlight the strengths and weaknesses of each technique.

Maintained by Víctor Amador Padilla. Last updated 1 year ago.

32.0 match 2.70 score 1 scripts

bioc

RImmPort:RImmPort: Enabling Ready-for-analysis Immunology Research Data

The RImmPort package simplifies access to ImmPort data for analysis in the R environment. It provides a standards-based interface to the ImmPort study data that is in a proprietary format.

Maintained by Zicheng Hu. Last updated 5 months ago.

biomedicalinformatics dataimport datarepresentation

19.8 match 4.33 score 27 scripts

ddediu

AdhereR:Adherence to Medications

Computation of adherence to medications from Electronic Health care Data and visualization of individual medication histories and adherence patterns. The package implements a set of S3 classes and functions consistent with current adherence guidelines and definitions. It allows the computation of different measures of adherence (as defined in the literature, but also several original ones), their publication-quality plotting, the estimation of event duration and time to initiation, the interactive exploration of patient medication history and the real-time estimation of adherence given various parameter settings. It scales from very small datasets stored in flat CSV files to very large databases and from single-thread processing on mid-range consumer laptops to parallel processing on large heterogeneous computing clusters. It exposes a standardized interface allowing it to be used from other programming languages and platforms, such as Python.

Maintained by Dan Dediu. Last updated 1 year ago.

adherence-to-medications electronic-healthcare-data hadoop medical-databases medication-histories python sql visualisation

12.2 match 28 stars 7.07 score 47 scripts 1 dependents

mikldk

DNAtools:Tools for Analysing Forensic Genetic DNA Data

Computationally efficient tools for comparing all pairs of profiles in a DNA database. The expectation and covariance of the summary statistic are implemented for fast computing. Routines for estimating proportions of closely related individuals are available. The use of wildcards (also called F-designation) is implemented. Dedicated functions ease plotting the results. See Tvedebrink et al. (2012) <doi:10.1016/j.fsigen.2011.08.001>. Compute the distribution of the numbers of alleles in DNA mixtures. See Tvedebrink (2013) <doi:10.1016/j.fsigss.2013.10.142>.

Maintained by Mikkel Meyer Andersen. Last updated 2 years ago.

cpp

14.2 match 6.00 score 28 scripts

pecanproject

PEcAn.settings:PEcAn Settings package

Contains functions to read PEcAn settings files.

Maintained by David LeBauer. Last updated 1 day ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

8.5 match 216 stars 10.00 score 54 scripts 17 dependents

mcodrescu

octopus:A Database Management Tool

A database management tool built as a 'shiny' application. Connect to various databases to send queries, upload files, preview tables, and more.

Maintained by Marcus Codrescu. Last updated 12 months ago.

data-science database rshiny

17.9 match 11 stars 4.74 score 4 scripts

bioc

goProfiles:goProfiles: an R package for the statistical analysis of functional profiles

The package implements methods to compare lists of genes based on comparing the corresponding 'functional profiles'.

Maintained by Alex Sanchez. Last updated 5 months ago.

annotation go geneexpression genesetenrichment graphandnetwork microarray multiplecomparison pathways software

15.4 match 5.48 score 6 scripts 1 dependents

yihui

knitr:A General-Purpose Package for Dynamic Report Generation in R

Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.

Maintained by Yihui Xie. Last updated 1 day ago.

dynamic-documents knitr literate-programming rmarkdown sweave

3.5 match 2.4k stars 23.62 score 116k scripts 4.2k dependents

leef-uzh

LEEF.analysis:Access Functions, Tests and Basic Analysis of the RRD Data from the LEEF Project

Provides simple access functions to read data out of the sqlite RRD database. SQL queries can be configured in a yaml config file and used.

Maintained by Rainer M. Krug. Last updated 1 month ago.

34.0 match 2.44 score 23 scripts

gadenbuie

starwarsdb:Relational Data from the 'Star Wars' API for Learning and Teaching

Provides data about the 'Star Wars' movie franchise in a set of relational tables or as a complete 'DuckDB' database. All data was collected from the open source 'Star Wars' API <https://swapi.dev/>.

Maintained by Garrick Aden-Buie. Last updated 2 years ago.

dplyr duckdb-database learning relational-database sql star-wars-data teaching

17.4 match 37 stars 4.76 score 31 scripts

r-dbi

DBItest:Testing DBI Backends

A helper that tests DBI back ends for conformity to the interface.

Maintained by Kirill Müller. Last updated 3 months ago.

database testing

10.0 match 24 stars 8.18 score 11 scripts

darwin-eu

CodelistGenerator:Identify Relevant Clinical Codes and Evaluate Their Use

Generate a candidate code list for the Observational Medical Outcomes Partnership (OMOP) common data model based on string matching. For a given search strategy, a candidate code list will be returned.

Maintained by Edward Burn. Last updated 25 days ago.

8.3 match 13 stars 9.87 score 165 scripts 4 dependents

iembry

chem.databases:Collection of 3 Chemical Databases from Public Sources

Contains the Multi-Species Acute Toxicity Database (CAS & SMILES columns only) [United States (US) Department of Health and Human Services (DHHS) National Institutes of Health (NIH) National Cancer Institute (NCI), "Multi-Species Acute Toxicity Database", <https://cactus.nci.nih.gov/download/acute-toxicity-db/>] combined with the Toxic Substances Control Act (TSCA) Inventory [United States Environmental Protection Agency (US EPA), "Toxic Substances Control Act (TSCA) Chemical Substance Inventory", <https://www.epa.gov/tsca-inventory/how-access-tsca-inventory> and <https://cdxapps.epa.gov/oms-substance-registry-services/substance-list-details/169>] and the Agency for Toxic Substances and Disease Registry (ATSDR) Database [United States (US) Department of Health and Human Services (DHHS) Centers for Disease Control and Prevention (CDC)/Agency for Toxic Substances and Disease Registry (ATSDR), "Agency for Toxic Substances and Disease Registry (ATSDR) Database", <https://cdxapps.epa.gov/oms-substance-registry-services/substance-list-details/105>] in 2 data sets. One data set has a focus on the latter 2 databases and one data set focuses on the former database. Also contains the collection of chemical data from Wikipedia compiled in the US EPA CompTox Chemicals Dashboard [United States Environmental Protection Agency (US EPA) / Wikimedia Foundation, Inc. "CompTox Chemicals Dashboard v2.2.1", <https://comptox.epa.gov/dashboard/chemical-lists/WIKIPEDIA>].

Maintained by Irucka Embry. Last updated 1 year ago.

48.0 match 1.70 score

ropensci

rdataretriever:R Interface to the Data Retriever

Provides an R interface to the Data Retriever <https://retriever.readthedocs.io/en/latest/> via the Data Retriever's command line interface. The Data Retriever automates the tasks of finding, downloading, and cleaning public datasets, and then stores them in a local database.

Maintained by Henry Senyondo. Last updated 8 months ago.

data data-science database datasets science

10.5 match 46 stars 7.70 score 36 scripts

ngiangre

kidsides:Download, Cache, and Connect to 'KidSIDES'

Caches and then connects to a 'sqlite' database containing half a million pediatric drug safety signals. The database is part of a family of resources catalogued at <https://nsides.io>. The database contains 17 tables where the description table provides a map between the fields and the field's details. The database was created by Nicholas Giangreco during his PhD thesis which you can read in Giangreco (2022) <doi:10.7916/d8-5d9b-6738>. The observations are from the Food and Drug Administration's Adverse Event Reporting System. Generalized additive models estimated drug effects across child development stages for the occurrence of an adverse event when exposed to a drug compared to other drugs. Read more at the methods detailed in Giangreco (2022) <doi:10.1016/j.medj.2022.06.001>.

Maintained by Nicholas Giangreco. Last updated 2 years ago.

database drug generalized-additive-models informatics pediatrics pharmacovigilance pkgdown safety

18.4 match 5 stars 4.40 score 5 scripts

gavinsimpson

analogue:Analogue and Weighted Averaging Methods for Palaeoecology

Fits Modern Analogue Technique and Weighted Averaging transfer function models for prediction of environmental data from species data, and related methods used in palaeoecology.

Maintained by Gavin L. Simpson. Last updated 6 months ago.

9.0 match 14 stars 8.96 score 185 scripts 4 dependents

kwb-r

qmra.db:Database Backend for Quantitative Microbiological Risk Assessment (QMRA) within AquaNES project

This package contains an MS Access database with data required for performing Quantitative Microbiological Risk Assessment (QMRA). In addition, it also provides R functions for exporting these data into .csv files or a single ZIP file (together with the MS Access database).

Maintained by Michael Rustler. Last updated 6 years ago.

backend mc-access-database project-aquanes qmra

26.7 match 3.00 score 2 scripts

shikokuchuo

mirai:Minimalist Async Evaluation Framework for R

Designed for simplicity, a 'mirai' evaluates an R expression asynchronously in a parallel process, locally or distributed over the network. The result is automatically available upon completion. Modern networking and concurrency, built on 'nanonext' and 'NNG' (Nanomsg Next Gen), ensures reliable and efficient scheduling over fast inter-process communications or TCP/IP secured by TLS. Distributed computing can launch remote resources via SSH or cluster managers. An inherently queued architecture handles many more tasks than available processes, and requires no storage on the file system. Innovative features include support for otherwise non-exportable reference objects, event-driven promises, and asynchronous parallel map.

Maintained by Charlie Gao. Last updated 2 days ago.

async asynchronous-tasks concurrency distributed-computing high-performance-computing parallel-computing

6.7 match 217 stars 11.94 score 130 scripts 7 dependents

stan-dev

posteriordb:R functionality for posteriordb

R functionality for easy handling of the posteriordb posteriors.

Maintained by Mans Magnusson. Last updated 1 year ago.

23.6 match 8 stars 3.37 score 59 scripts

mi2-warsaw

sejmRP:An Information About Deputies and Votings in Polish Diet from Seventh to Eighth Term of Office

A set of functions that access information about deputies and votings in the Polish Diet from the webpage <http://www.sejm.gov.pl>. The package was developed as a result of an internship in MI2 Group - <http://mi2.mini.pw.edu.pl>, Faculty of Mathematics and Information Science, Warsaw University of Technology.

Maintained by Piotr Smuda. Last updated 8 years ago.

15.8 match 21 stars 5.04 score 35 scripts

bioc

msPurity:Automated Evaluation of Precursor Ion Purity for Mass Spectrometry Based Fragmentation in Metabolomics

The msPurity R package was developed to: 1) Assess the spectral quality of fragmentation spectra by evaluating the "precursor ion purity". 2) Process fragmentation spectra. 3) Perform spectral matching. What is precursor ion purity? What we call "precursor ion purity" is a measure of the contribution of a selected precursor peak in an isolation window used for fragmentation. The simple calculation involves dividing the intensity of the selected precursor peak by the total intensity of the isolation window. When assessing MS/MS spectra this calculation is done before and after the MS/MS scan of interest and the purity is interpolated at the recorded time of the MS/MS acquisition. Additionally, isotopic peaks can be removed, low abundance peaks that are thought to have limited contribution to the resulting MS/MS spectra are removed, and the isolation efficiency of the mass spectrometer can be used to normalise the intensities used for the calculation.

Maintained by Thomas N. Lawson. Last updated 5 months ago.

massspectrometry metabolomics software bioconductor-package dims fragmentation lc-ms lc-msms mass-spectrometry precursor-ion-purity

11.3 match 15 stars 7.03 score 44 scripts

dyfanjones

noctua:Connect to 'AWS Athena' using R 'AWS SDK' 'paws' ('DBI' Interface)

Designed to be compatible with the 'R' package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this the 'R' 'AWS' Software Development Kit ('SDK') 'paws' <https://github.com/paws-r/paws> is used as a driver.

Maintained by Dyfan Jones. Last updated 11 months ago.

athena aws database

10.5 match 46 stars 7.48 score 58 scripts

apache

adbcdrivermanager:'Arrow' Database Connectivity ('ADBC') Driver Manager

Provides a developer-facing interface to 'Arrow' Database Connectivity ('ADBC') for the purposes of driver development, driver testing, and building high-level database interfaces for users. 'ADBC' <https://arrow.apache.org/adbc/> is an API standard for database access libraries that uses 'Arrow' for result sets and query parameters.

Maintained by Dewey Dunnington. Last updated 1 day ago.

cpp

6.9 match 417 stars 11.44 score 73 scripts 6 dependents

ropensci

ramlegacy:Download and Read RAM Legacy Stock Assessment Database

Contains functions to download, cache and read in the 'Excel' version of the RAM Legacy Stock Assessment Data Base, an online compilation of stock assessment results for commercially exploited marine populations from around the world. The database is named after Dr. Ransom A. Myers, whose original stock-recruitment database is no longer being updated. More information about the database can be found at <https://ramlegacy.org/>. Ricard, D., Minto, C., Jensen, O.P. and Baum, J.K. (2012) <doi:10.1111/j.1467-2979.2011.00435.x>.

Maintained by Kshitiz Gupta. Last updated 5 years ago.

fisheries marine-biology ramlegacy ropensci stock-assessment

15.3 match 5 stars 5.11 score 26 scripts

ropensci

ruODK:An R Client for the ODK Central API

Access and tidy up data from the 'ODK Central' API. 'ODK Central' is a clearinghouse for digitally captured data using ODK <https://docs.getodk.org/central-intro/>. It manages user accounts and permissions, stores form definitions, and allows data collection clients like 'ODK Collect' to connect to it for form download and submission upload. The 'ODK Central' API is documented at <https://docs.getodk.org/central-api/>.

Maintained by Florian W. Mayer. Last updated 4 months ago.

database open-data odk api data dataset odata odata-client odk-central opendatakit

10.0 match 42 stars 7.73 score 57 scripts 1 dependents

doserjef

rFIA:Estimation of Forest Variables using the FIA Database

The goal of 'rFIA' is to increase the accessibility and use of the United States Forest Services (USFS) Forest Inventory and Analysis (FIA) Database by providing a user-friendly, open source toolkit to easily query and analyze FIA Data. Designed to accommodate a wide range of potential user objectives, 'rFIA' simplifies the estimation of forest variables from the FIA Database and allows all R users (experts and newcomers alike) to unlock the flexibility inherent to the Enhanced FIA design. Specifically, 'rFIA' improves accessibility to the spatial-temporal estimation capacity of the FIA Database by producing space-time indexed summaries of forest variables within user-defined population boundaries. Direct integration with other popular R packages (e.g., 'dplyr', 'tidyr', and 'sf') facilitates efficient space-time query and data summary, and supports common data representations and API design. The package implements design-based estimation procedures outlined by Bechtold & Patterson (2005) <doi:10.2737/SRS-GTR-80>, and has been validated against estimates and sampling errors produced by FIA 'EVALIDator'. Current development is focused on the implementation of spatially-enabled model-assisted and model-based estimators to improve population, change, and ratio estimates.

Maintained by Jeffrey Doser. Last updated 8 days ago.

compute-estimates fia fia-database fia-datamart forest-inventory forest-variables inventories space-time spatial

13.0 match 49 stars 5.93 score

pecanproject

PEcAn.data.land:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Mike Dietze. Last updated 1 day ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

8.3 match 216 stars 9.32 score 19 scripts 10 dependents

bioc

sitadela:An R package for the easy provision of simple but complete tab-delimited genomic annotation from a variety of sources and organisms

Provides an interface to build a unified database of genomic annotations and their coordinates (gene, transcript and exon levels). It is aimed to be used when simple tab-delimited annotations (or simple GRanges objects) are required instead of the more complex annotation Bioconductor packages. Also useful when combinatorial annotation elements are required, such as RefSeq coordinates with Ensembl biotypes. Finally, it can download, construct and handle annotations with versioned genes and transcripts (where available, e.g. RefSeq and latest Ensembl). This is particularly useful in precision medicine applications where the latter must be reported.

Maintained by Panagiotis Moulos. Last updated 5 months ago.

software workflowstep rnaseq transcription sequencing transcriptomics biomedicalinformatics functionalgenomics systemsbiology alternativesplicing dataimport chipseq

16.8 match 4.60 score 2 scripts

vimc

orderly:Lightweight Reproducible Reporting

Order, create and store reports from R. By defining a lightweight interface around the inputs and outputs of an analysis, a lot of the repetitive work for reproducible research can be automated. We define a simple format for organising and describing work that facilitates collaborative reproducible research and acknowledges that all analyses are run multiple times over their lifespans.

Maintained by Rich FitzJohn. Last updated 2 years ago.

8.0 match 117 stars 9.63 score 94 scripts 4 dependents

tidymodels

tidypredict:Run Predictions Inside the Database

It parses a fitted 'R' model object, and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several database back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(), cubist(), and ctree() models.

Maintained by Emil Hvitfeldt. Last updated 3 months ago.

dbplyr dplyr purrr rlang

7.0 match 261 stars 11.03 score 241 scripts 2 dependents

bioc

mosdef:MOSt frequently used and useful Differential Expression Functions

This package provides functionality to run a number of tasks in the differential expression analysis workflow. This encompasses the most widely used steps, from running various enrichment analysis tools with a unified interface to creating plots and beautifying table components linking to external websites and databases. This streamlines the generation of comprehensive analysis reports.

Maintained by Federico Marini. Last updated 3 months ago.

geneexpression software transcription transcriptomics differentialexpression visualization reportwriting genesetenrichment go

12.5 match 6.12 score 4 dependents

sebkrantz

collapse:Advanced and Fast Data Transformation

A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.

Maintained by Sebastian Krantz. Last updated 5 days ago.

data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data scientific-computing statistics time-series weighted weights cpp openmp

4.6 match 672 stars 16.63 score 708 scripts 97 dependents

kadyb

rgugik:Search and Retrieve Spatial Data from 'GUGiK'

Automatic open data acquisition from resources of Polish Head Office of Geodesy and Cartography ('Główny Urząd Geodezji i Kartografii') (<https://www.gov.pl/web/gugik>). Available datasets include various types of numeric, raster and vector data, such as orthophotomaps, digital elevation models (digital terrain models, digital surface model, point clouds), state register of borders, spatial databases, geometries of cadastral parcels, 3D models of buildings, and more. It is also possible to geocode addresses or objects using the geocodePL_get() function.

Maintained by Krzysztof Dyba. Last updated 6 days ago.

cartography geodesy gis open-data poland

9.9 match 34 stars 7.69 score 30 scripts

ohdsi

ResultModelManager:Result Model Manager

Database data model management utilities for R packages in the Observational Health Data Sciences and Informatics program <https://ohdsi.org>. 'ResultModelManager' provides utility functions to allow package maintainers to migrate existing SQL database models, export and import results in consistent patterns.

Maintained by Jamie Gilbert. Last updated 6 months ago.

openjdk

10.3 match 4 stars 7.38 score 9 scripts 3 dependents

dleutnant

influxdbr:R Interface to InfluxDB

An R interface to the InfluxDB time series database <https://www.influxdata.com>. This package allows you to fetch and write time series data from/to an InfluxDB server. Additionally, handy wrappers for the Influx Query Language (IQL) to manage and explore a remote database are provided.

Maintained by Dominik Leutnant. Last updated 5 years ago.

database influxdb tidyverse timeseries

12.8 match 94 stars 5.90 score 28 scripts

dyfanjones

RAthena:Connect to 'AWS Athena' using 'Boto3' ('DBI' Interface)

Designed to be compatible with the R package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this 'Python' 'Boto3' Software Development Kit ('SDK') <https://boto3.amazonaws.com/v1/documentation/api/latest/index.html> is used as a driver.

Maintained by Dyfan Jones. Last updated 1 year ago.

athena aws boto3 database

10.5 match 37 stars 7.10 score 38 scripts

keithmcnulty

neo4jshell:Querying and Managing 'Neo4J' Databases in 'R'

Sends queries to a specified 'Neo4J' graph database, capturing results in a dataframe where appropriate. Also provides other useful functions for importing and managing data on the 'Neo4J' server and for basic local server admin.

Maintained by Keith McNulty. Last updated 3 years ago.

12.7 match 18 stars 5.85 score 13 scripts

wch

extrafont:Tools for Using Fonts

Tools for using fonts other than the standard PostScript fonts. This package makes it easy to use system TrueType fonts with PDF or PostScript output files, and with bitmap output files in Windows. extrafont can also be used with fonts packaged specifically to be used with it, such as the fontcm package, which has Computer Modern PostScript fonts with math symbols.

Maintained by Winston Chang. Last updated 2 years ago.

5.3 match 324 stars 14.10 score 13k scripts 51 dependents

nowosad

rcartocolor:'CARTOColors' Palettes

Provides color schemes for maps and other graphics designed by 'CARTO' as described at <https://carto.com/carto-colors/>. It includes four types of palettes: aggregation, diverging, qualitative, and quantitative.

Maintained by Jakub Nowosad. Last updated 5 months ago.

color-palette ggplot2 visualization

8.5 match 111 stars 8.64 score 1.4k scripts 1 dependents

bmaitner

BIEN:Tools for Accessing the Botanical Information and Ecology Network Database

Provides tools for accessing the Botanical Information and Ecology Network Database. The BIEN database contains cleaned and standardized botanical data including occurrence, trait, plot and taxonomic data (see <https://bien.nceas.ucsb.edu/bien/> for more information). This package provides functions that query the BIEN database by constructing and executing optimized SQL queries.

Maintained by Brian Maitner. Last updated 1 month ago.

12.2 match 6.04 score 205 scripts 5 dependents

bioc

recountmethylation:Access and analyze public DNA methylation array data compilations

Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.

Maintained by Sean K Maden. Last updated 5 months ago.

dnamethylation epigenetics microarray methylationarray experimenthub

11.6 match 9 stars 6.28 score 9 scripts

bioc

brendaDb:The BRENDA Enzyme Database

R interface for importing and analyzing enzyme information from the BRENDA database.

Maintained by Yi Zhou. Last updated 5 months ago.

thirdpartyclient annotation dataimport brenda database enzyme hacktoberfest cpp

15.8 match 2 stars 4.60 score 4 scripts

geoffjentry

twitteR:R Based Twitter Client

Provides an interface to the Twitter web API.

Maintained by Jeff Gentry. Last updated 9 years ago.

7.1 match 254 stars 10.18 score 2.0k scripts 1 dependents

ohdsi

PhenotypeR:Assess Study Cohorts Using a Common Data Model

Phenotype study cohorts in data mapped to the Observational Medical Outcomes Partnership Common Data Model. Diagnostics are run at the database, code list, cohort, and population level to assess whether study cohorts are ready for research.

Maintained by Edward Burn. Last updated 4 days ago.

9.8 match 2 stars 7.40 score 57 scripts

rstudio

connections:Integrates with the 'RStudio' Connections Pane and 'pins'

Enables 'DBI' compliant packages to integrate with the 'RStudio' connections pane, and the 'pins' package. It automates the display of schemata, tables, views, as well as the preview of the table's top 1000 records.

Maintained by Edgar Ruiz. Last updated 1 year ago.

connection-pane database-connection pins rstudio

11.1 match 57 stars 6.50 score 124 scripts 1 dependents

wenjie1991

ggfigdone:Manage & Modify 'ggplot' Figures using 'ggfigdone'

When you prepare a presentation or a report, you often need to manage a large number of 'ggplot' figures. You need to change the figure size, modify the title, label, themes, etc. It is inconvenient to go back to the original code to make these changes. This package provides a simple way to manage 'ggplot' figures. You can easily add the figure to the database and update them later using CLI (command line interface) or GUI (graphical user interface).

Maintained by Wenjie SUN. Last updated 3 months ago.

ggplot2

13.9 match 29 stars 5.20 score 4 scripts

maxgreil

JamendoR:Access to 'Jamendo' API

Provides an interface to the 'Jamendo' API <https://developer.jamendo.com/v3.0>. Pull audio, features and other information for a given 'Jamendo' user (including yourself!) or enter an artist's, album's, or track's name and retrieve the available information in seconds.

Maintained by Maximilian Greil. Last updated 1 year ago.

api-wrapper jamendo music

24.0 match 2 stars 3.00 score 1 scripts

karissawhiting

cbioportalR:Browse and Query Clinical and Genomic Data from cBioPortal

Provides R users with direct access to genomic and clinical data from the 'cBioPortal' web resource via user-friendly functions that wrap 'cBioPortal's' existing API endpoints <https://www.cbioportal.org/api/swagger-ui/index.html>. Users can browse and query genomic data on mutations, copy number alterations and fusions, as well as data on tumor mutational burden ('TMB'), microsatellite instability status ('MSI'), 'FACETS' and select clinical data points (depending on the study). See <https://www.cbioportal.org/> and Gao et al., (2013) <doi:10.1126/scisignal.2004088> for more information on the cBioPortal web resource.

Maintained by Karissa Whiting. Last updated 4 months ago.

10.6 match 21 stars 6.70 score 20 scripts

aidanmorales

rTwig:Realistic Quantitative Structure Models

Real Twig is a method to correct branch overestimation in quantitative structure models. Overestimated cylinders are correctly tapered using measured twig diameters of corresponding tree species. Supported quantitative structure modeling software includes 'TreeQSM', 'SimpleForest', 'Treegraph', and 'aRchi'. Also included is a novel database of twig diameters and tools for fractal analysis of point clouds.

Maintained by Aidan Morales. Last updated 12 days ago.

forestry lidar modeling qsm rcpp cpp

10.0 match 8 stars 7.10 score 13 scripts

dtkaplan

LSTbook:Data and Software for "Lessons in Statistical Thinking"

"Lessons in Statistical Thinking" D.T. Kaplan (2014) <https://dtkaplan.github.io/Lessons-in-statistical-thinking/> is a textbook for a first or second course in statistics that embraces data wrangling, causal reasoning, modeling, statistical adjustment, and simulation. 'LSTbook' supports the student-centered, tidy, pipeline-oriented computing style featured in the book.

Maintained by Daniel Kaplan. Last updated 10 hours ago.

11.3 match 4 stars 6.29 score 27 scripts