R-universe search: backend

tidyverse

dbplyr:A 'dplyr' Back End for Databases

A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.

Maintained by Hadley Wickham. Last updated 3 months ago.

database

38.0 match 481 stars 19.72 score 5.2k scripts 736 dependents

r-lib

keyring:Access the System Credential Store from R

Platform independent 'API' to access the operating system's credential store. Currently supports: 'Keychain' on 'macOS', Credential Store on 'Windows', the Secret Service 'API' on 'Linux', and simple, platform independent stores implemented with environment variables or encrypted files. Additional storage back-ends can be added easily.

Maintained by Gábor Csárdi. Last updated 15 days ago.

keyring security libsecret glib

17.9 match 198 stars 12.29 score 976 scripts 56 dependents

mihaiconstantin

parabar:Progress Bar for Parallel Tasks

A simple interface in the form of R6 classes for executing tasks in parallel, tracking their progress, and displaying accurate progress bars.

Maintained by Mihai Constantin. Last updated 3 months ago.

parallel-computing progress-bar

21.6 match 20 stars 7.56 score 20 scripts 5 dependents

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

13.4 match 10.82 score 10k scripts 54 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 4 days ago.

9.3 match 845 stars 13.57 score 264 scripts 2 dependents

dyfanjones

noctua:Connect to 'AWS Athena' using R 'AWS SDK' 'paws' ('DBI' Interface)

Designed to be compatible with the 'R' package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this the 'R' 'AWS' Software Development Kit ('SDK') 'paws' <https://github.com/paws-r/paws> is used as a driver.

Maintained by Dyfan Jones. Last updated 11 months ago.

athena aws database

15.0 match 46 stars 7.48 score 58 scripts

dyfanjones

RAthena:Connect to 'AWS Athena' using 'Boto3' ('DBI' Interface)

Designed to be compatible with the R package 'DBI' (Database Interface) when connecting to Amazon Web Service ('AWS') Athena <https://aws.amazon.com/athena/>. To do this 'Python' 'Boto3' Software Development Kit ('SDK') <https://boto3.amazonaws.com/v1/documentation/api/latest/index.html> is used as a driver.

Maintained by Dyfan Jones. Last updated 1 year ago.

athena aws boto3 database

15.0 match 37 stars 7.10 score 38 scripts

bioc

Spectra:Spectra Infrastructure for Mass Spectrometry Data

The Spectra package defines an efficient infrastructure for storing and handling mass spectrometry spectra and functionality to subset, process, visualize and compare spectra data. It provides different implementations (backends) to store mass spectrometry data. These comprise backends tuned for fast data access and processing and backends for very large data sets ensuring a small memory footprint.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 9 days ago.

infrastructure proteomics massspectrometry metabolomics bioconductor hacktoberfest mass-spectrometry

7.6 match 41 stars 13.01 score 254 scripts 35 dependents

r-dbi

DBItest:Testing DBI Backends

A helper that tests DBI back ends for conformity to the interface.

Maintained by Kirill Müller. Last updated 11 hours ago.

database testing

11.2 match 24 stars 8.20 score 11 scripts

renozao

doRNG:Generic Reproducible Parallel Backend for 'foreach' Loops

Provides functions to perform reproducible parallel foreach loops, using independent random streams as generated by L'Ecuyer's combined multiple-recursive generator [L'Ecuyer (1999), <DOI:10.1287/opre.47.1.159>]. It makes it easy to convert standard '%dopar%' loops into fully reproducible loops, independently of the number of workers, the task scheduling strategy, or the chosen parallel environment and associated foreach backend.

Maintained by Renaud Gaujoux. Last updated 2 years ago.

7.1 match 20 stars 12.63 score 4.3k scripts 183 dependents

mlr-org

mlr3db:Data Base Backend for 'mlr3'

Extends the 'mlr3' package with a backend to transparently work with databases such as 'SQLite', 'DuckDB', 'MySQL', 'MariaDB', or 'PostgreSQL'. The package provides two additional backends: 'DataBackendDplyr' relies on the abstraction of package 'dbplyr' to interact with most DBMS. 'DataBackendDuckDB' operates on 'DuckDB' data bases and also on Apache Parquet files.

Maintained by Michel Lang. Last updated 1 year ago.

bigquery data-backend database duckdb machine-learning mariadb mlr3 mysql odbc postgresql spark sqlite

17.2 match 21 stars 4.77 score 17 scripts

ropensci

drake:A Pipeline Toolkit for Reproducible Computation at Scale

A general-purpose computational engine for data analysis, drake rebuilds intermediate data objects when their dependencies change, and it skips work when the results are already up to date. Not every execution starts from scratch; there is native support for parallel and distributed computing, and completed projects have tangible evidence that they are reproducible. Extensive documentation, from beginner-friendly tutorials to practical examples and more, is available at the reference website <https://docs.ropensci.org/drake/> and the online manual <https://books.ropensci.org/drake/>.

Maintained by William Michael Landau. Last updated 3 months ago.

data-science drake high-performance-computing makefile peer-reviewed pipeline reproducibility reproducible-research ropensci workflow

6.6 match 1.3k stars 11.49 score 1.7k scripts 1 dependent

doubleml

DoubleML:Double Machine Learning in R

Implementation of the double/debiased machine learning framework of Chernozhukov et al. (2018) <doi:10.1111/ectj.12097> for partially linear regression models, partially linear instrumental variable regression models, interactive regression models and interactive instrumental variable regression models. 'DoubleML' allows estimation of the nuisance parts in these models by machine learning methods and computation of the Neyman orthogonal score functions. 'DoubleML' is built on top of 'mlr3' and the 'mlr3' ecosystem. The object-oriented implementation of 'DoubleML' based on the 'R6' package is very flexible. More information available in the publication in the Journal of Statistical Software: <doi:10.18637/jss.v108.i03>.

Maintained by Philipp Bach. Last updated 4 months ago.

causal-inference data-science double-machine-learning econometrics machine-learning mlr3 statistics

8.2 match 137 stars 9.17 score 267 scripts 1 dependent

openanalytics

editbl:'DT' Extension for CRUD (Create, Read, Update, Delete) Applications in 'shiny'

The core of this package is a function eDT() which enhances DT::datatable() such that it can be used to interactively modify data in 'shiny'. By the use of generic 'dplyr' methods it supports many types of data storage, with relational databases ('dbplyr') being the main use case.

Maintained by Jasper Schelfhout. Last updated 2 months ago.

10.9 match 23 stars 6.52 score 12 scripts

bioc

DelayedArray:A unified framework for working transparently with on-disk and in-memory array-like datasets

Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package u24ca289073

4.5 match 27 stars 15.59 score 538 scripts 1.2k dependents

bioc

ensembldb:Utilities to create and use Ensembl-based annotation databases

The package provides functions to create and use transcript-centric annotation databases/packages. The annotation for the databases is fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries such as genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.

Maintained by Johannes Rainer. Last updated 5 months ago.

genetics annotationdata sequencing coverage annotation bioconductor bioconductor-packages ensembl

5.0 match 35 stars 14.08 score 892 scripts 108 dependents

r-dbi

DBI:R Database Interface

A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.

Maintained by Kirill Müller. Last updated 3 months ago.

database interface

3.3 match 302 stars 20.88 score 19k scripts 2.9k dependents

revolutionanalytics

foreach:Provides Foreach Looping Construct

Support for the foreach looping construct. Foreach is an idiom that allows for iterating over elements in a collection, without the use of an explicit loop counter. This package in particular is intended to be used for its return value, rather than for its side effects. In that sense, it is similar to the standard lapply function, but doesn't require the evaluation of a function. Using foreach without side effects also facilitates executing the loop in parallel.

Maintained by Folashade Daniel. Last updated 3 years ago.

foreach parallel-computing

3.6 match 54 stars 17.16 score 43k scripts 2.8k dependents

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 10 days ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

3.8 match 959 stars 15.16 score 4.0k scripts 21 dependents

bioc

TileDBArray:Using TileDB as a DelayedArray Backend

Implements a DelayedArray backend for reading and writing dense or sparse arrays in the TileDB format. The resulting TileDBArrays are compatible with all Bioconductor pipelines that can accept DelayedArray instances.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation infrastructure software

8.3 match 10 stars 6.89 score 26 scripts 1 dependent

rexyai

RestRserve:A Framework for Building HTTP API

Makes it easy to create high-performance, full-featured HTTP APIs from R functions. Provides high-level classes such as 'Request', 'Response', 'Application', 'Middleware' in order to streamline server-side application development. Out of the box, it serves requests using the 'Rserve' package, but it is flexible enough to integrate with other HTTP servers such as 'httpuv'.

Maintained by Dmitry Selivanov. Last updated 4 days ago.

http-server openapi rest-api swagger-ui cpp

5.6 match 283 stars 9.56 score 95 scripts 1 dependent

ropensci

taxadb:A High-Performance Local Taxonomic Database Interface

Creates a local database of many commonly used taxonomic authorities and provides functions that can quickly query this data.

Maintained by Carl Boettiger. Last updated 11 months ago.

7.0 match 43 stars 7.68 score 53 scripts 1 dependent

bioc

MsBackendRawFileReader:Mass Spectrometry Backend for Reading Thermo Fisher Scientific raw Files

Implements a MsBackend for the Spectra package using Thermo Fisher Scientific's NewRawFileReader .Net libraries. The package generalizes the functionality introduced by the rawrr package. Methods defined in this package are supposed to extend the Spectra Bioconductor package.

Maintained by Christian Panse. Last updated 4 months ago.

massspectrometry proteomics metabolomics

7.9 match 5 stars 5.94 score 5 scripts

taylor-arnold

cleanNLP:A Tidy Data Model for Natural Language Processing

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may make use of the 'udpipe' back end with no external dependencies, or a Python back end with 'spaCy' <https://spacy.io>. Exposed annotation tasks include tokenization, part of speech tagging, named entity recognition, and dependency parsing.

Maintained by Taylor B. Arnold. Last updated 10 months ago.

corenlp natural-language-processing spacy

5.5 match 214 stars 8.39 score 229 scripts

bioc

MsBackendSql:SQL-based Mass Spectrometry Data Backend

SQL-based mass spectrometry (MS) data backend that also supports storage and handling of very large data sets. Objects from this package are supposed to be used with the Spectra Bioconductor package. Through the MsBackendSql with its minimal memory footprint, this package thus provides an alternative MS data representation for very large or remote MS data sets.

Maintained by Johannes Rainer. Last updated 2 months ago.

infrastructure massspectrometry metabolomics dataimport proteomics

8.2 match 4 stars 5.46 score 16 scripts

bioc

Cardinal:A mass spectrometry imaging toolbox for statistical analysis

Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.

Maintained by Kylie Ariel Bemis. Last updated 3 months ago.

software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression

4.2 match 47 stars 10.34 score 200 scripts

bioc

MsBackendMassbank:Mass Spectrometry Data Backend for MassBank record Files

Mass spectrometry (MS) data backend supporting import and export of MS/MS library spectra from MassBank record files. Different backends are available that allow handling of data in plain MassBank text file format or also allow direct interaction with MassBank SQL databases. Objects from this package are supposed to be used with the Spectra Bioconductor package. This package thus adds MassBank support to the Spectra package.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 18 days ago.

infrastructure massspectrometry metabolomics dataimport massbank spectra

7.2 match 3 stars 5.91 score 27 scripts

kwb-r

qmra.db:Database Backend for Quantitative Microbiological Risk Assessment (QMRA) within AquaNES project

This package contains an MS Access database with data required for performing Quantitative Microbiological Risk Assessment (QMRA). In addition, it provides R functions for exporting these data into .csv files or a single ZIP file (together with the MS Access database).

Maintained by Michael Rustler. Last updated 6 years ago.

backend mc-access-database project-aquanes qmra

13.8 match 3.00 score 2 scripts

rpolars

polarssql:A 'polars' backend for 'DBI'.

DBI-compliant interface to 'polars'.

Maintained by Tatsuya Shima. Last updated 7 months ago.

dplyr dplyr-sql-backends polars

12.0 match 26 stars 3.41 score 3 scripts

imsmwu

RClickhouse:'Yandex Clickhouse' Interface for R with Basic 'dplyr' Support

'Yandex Clickhouse' (<https://clickhouse.com/>) is a high-performance relational column-store database to enable big data exploration and 'analytics' scaling to petabytes of data. Methods are provided that enable working with 'Yandex Clickhouse' databases via 'DBI' methods and using 'dplyr'/'dbplyr' idioms.

Maintained by Christian Hotz-Behofsits. Last updated 18 days ago.

clickhouse clickhouse-database dbi-interface dplyr dplyr-sql-backends cpp

6.7 match 94 stars 5.98 score 11 scripts

bioc

MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

2.6 match 71 stars 14.95 score 670 scripts 127 dependents

ianmcook

implyr:R Interface for Apache Impala

'SQL' back-end to 'dplyr' for Apache Impala, the massively parallel processing query engine for Apache 'Hadoop'. Impala enables low-latency 'SQL' queries on data stored in the 'Hadoop' Distributed File System '(HDFS)', Apache 'HBase', Apache 'Kudu', Amazon Simple Storage Service '(S3)', Microsoft Azure Data Lake Store '(ADLS)', and Dell 'EMC' 'Isilon'. See <https://impala.apache.org> for more information about Impala.

Maintained by Ian Cook. Last updated 1 year ago.

apache dplyr dplyr-sql-backends hadoop impala jdbc odbc sql tidyverse

6.7 match 81 stars 5.71 score 42 scripts

rformassspectrometry

Chromatograms:Infrastructure for Chromatographic Mass Spectrometry Data

The Chromatograms package defines an efficient infrastructure for storing and handling chromatographic mass spectrometry data. It provides different implementations of *backends* to store and represent the data. Such backends can be optimized for small memory footprint or fast data access/processing. A lazy evaluation queue and chunk-wise processing capabilities ensure efficient analysis of even very large data sets.

Maintained by Philippine Louail. Last updated 27 days ago.

infrastructure proteomics massspectrometry metabolomics

9.0 match 1 star 4.15 score 3 scripts

mlr-org

mlr3:Machine Learning in R - Next Generation

Efficient, object-oriented programming on the building blocks of machine learning. Provides 'R6' objects for tasks, learners, resamplings, and measures. The package is geared towards scalability and larger datasets by supporting parallelization and out-of-memory data-backends like databases. While 'mlr3' focuses on the core computational operations, add-on packages provide additional functionality.

Maintained by Marc Becker. Last updated 5 days ago.

classification data-science machine-learning mlr3 regression

2.5 match 972 stars 14.86 score 2.3k scripts 35 dependents

bioc

MsBackendMsp:Mass Spectrometry Data Backend for NIST msp Files

Mass spectrometry (MS) data backend supporting import and handling of MS/MS spectra from NIST MSP Format (msp) files. Import of data from files with different MSP *flavours* is supported. Objects from this package add support for MSP files to Bioconductor's Spectra package. This package is thus not supposed to be used without the Spectra package that provides a complete infrastructure for MS data handling.

Maintained by Johannes Rainer. Last updated 2 months ago.

infrastructure proteomics massspectrometry metabolomics dataimport mass-spectrometry

5.2 match 5 stars 7.22 score 37 scripts 3 dependents

bioc

MsBackendMgf:Mass Spectrometry Data Backend for Mascot Generic Format (mgf) Files

Mass spectrometry (MS) data backend supporting import and export of MS/MS spectra data from Mascot Generic Format (mgf) files. Objects defined in this package are supposed to be used with the Spectra Bioconductor package. This package thus adds mgf file support to the Spectra package.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 2 months ago.

infrastructure proteomics massspectrometry metabolomics dataimport

5.1 match 5 stars 7.32 score 35 scripts 4 dependents

tidyverse

multidplyr:A Multi-Process 'dplyr' Backend

Partition a data frame across multiple worker processes to provide simple multicore parallelism.

Maintained by Hadley Wickham. Last updated 8 months ago.

dplyr multiprocess

3.1 match 644 stars 10.82 score 460 scripts 5 dependents

bioc

variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments

Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.

Maintained by Gabriel E. Hoffman. Last updated 2 months ago.

rnaseq geneexpression genesetenrichment differentialexpression batcheffect qualitycontrol regression epigenetics functionalgenomics transcriptomics normalization preprocessing microarray immunooncology software

2.8 match 7 stars 11.69 score 1.1k scripts 3 dependents

bioc

rhdf5client:Access HDF5 content from HDF Scalable Data Service

This package provides functionality for reading data from HDF Scalable Data Service from within R. The HSDSArray function bridges from HSDS to the user via the DelayedArray interface. Bioconductor manages an open HSDS instance graciously provided by John Readey of the HDF Group.

Maintained by Vincent Carey. Last updated 5 months ago.

dataimport software infrastructure

6.6 match 4.82 score 37 scripts 2 dependents

s-u

Cairo:R Graphics Device using Cairo Graphics Library for Creating High-Quality Bitmap (PNG, JPEG, TIFF), Vector (PDF, SVG, PostScript) and Display (X11 and Win32) Output

R graphics device using cairographics library that can be used to create high-quality vector (PDF, PostScript and SVG) and bitmap output (PNG, JPEG, TIFF), and high-quality rendering in displays (X11 and Win32). Since it uses the same back-end for all output, copying across formats is WYSIWYG. Files are created without the dependence on X11 or other external programs. This device supports alpha channel (semi-transparent drawing) and resulting images can contain transparent and semi-transparent regions. It is ideal for use in server environments (file output) and as a replacement for other devices that don't have Cairo's capabilities such as alpha support or anti-aliasing. Backends are modular such that any subset of backends is supported.

Maintained by Simon Urbanek. Last updated 7 months ago.

freetype cairo libx11 libjpeg-turbo harfbuzz icu tiff

2.5 match 14 stars 12.52 score 3.9k scripts 71 dependents

mllg

checkmate:Fast and Versatile Argument Checks

Tests and assertions to perform frequent argument checks. A substantial part of the package was written in C to minimize any worries about execution time overhead.

Maintained by Michel Lang. Last updated 8 months ago.

assertions testthat

1.9 match 276 stars 16.28 score 1.5k scripts 1.9k dependents

bioc

BiocParallel:Bioconductor facilities for parallel evaluation

This package provides modified versions and novel implementations of functions for parallel evaluation, tailored to use with Bioconductor objects.

Maintained by Martin Morgan. Last updated 26 days ago.

infrastructure bioconductor-package core-package u24ca289073 cpp

1.7 match 67 stars 17.40 score 7.3k scripts 1.1k dependents

kwb-r

kwb.qmra:QMRA (quantitative microbial risk assessment)

QMRA for water supply systems.

Maintained by Michael Rustler. Last updated 4 years ago.

project-aquanes project-demoware project-smartcontrol qmra qmra-webapp-backend-engine

6.3 match 4 stars 4.53 score 21 scripts

bioc

VariantExperiment:A RangedSummarizedExperiment Container for VCF/GDS Data with GDS Backend

VariantExperiment is a Bioconductor package for saving data in VCF/GDS format into a RangedSummarizedExperiment object. The high-throughput genetic/genomic data are saved in GDSArray objects. The annotation data for features/samples are saved in DelayedDataFrame format with mono-dimensional GDSArray in each column. The on-disk representation of both assay data and annotation data achieves on-disk reading and processing and saves memory space significantly. The interface of the RangedSummarizedExperiment data format enables easy and common manipulations for high-throughput genetic/genomic data with the common SummarizedExperiment metaphor in R and Bioconductor.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation sequencing annotation genomeannotation genotypingarray

5.6 match 1 star 5.00 score 2 scripts

bioc

flowWorkspace:Infrastructure for representing and interacting with gated and ungated cytometry data sets.

This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into Bioconductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.

Maintained by Greg Finak. Last updated 10 days ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation zlib openblas cpp

3.5 match 7.89 score 576 scripts 10 dependents

bioc

RVS:Computes estimates of the probability of related individuals sharing a rare variant

Rare Variant Sharing (RVS) implements tests of association and linkage between rare genetic variant genotypes and a dichotomous phenotype, e.g. a disease status, in family samples. The tests are based on probabilities of rare variant sharing by relatives under the null hypothesis of absence of linkage and association between the rare variants and the phenotype and apply to single variants or multiple variants in a region (e.g. gene-based test).

Maintained by Alexandre Bureau. Last updated 5 months ago.

immunooncology genetics genomewideassociation variantdetection exomeseq wholegenome

5.6 match 4.78 score 9 scripts

jeroen

curl:A Modern and Flexible Web Client for R

Bindings to 'libcurl' <https://curl.se/libcurl/> for performing fully configurable HTTP/FTP requests where responses can be processed in memory, on disk, or streaming via the callback or connection interfaces. Some knowledge of 'libcurl' is recommended; for a more user-friendly web client see the 'httr2' package, which builds on this package with HTTP-specific tools and logic.

Maintained by Jeroen Ooms. Last updated 23 days ago.

curl

1.3 match 224 stars 19.98 score 4.0k scripts 5.9k dependents

christophsax

seasonal:R Interface to X-13-ARIMA-SEATS

Easy-to-use interface to X-13-ARIMA-SEATS, the seasonal adjustment software by the US Census Bureau. It offers full access to almost all options and outputs of X-13, including X-11 and SEATS, automatic ARIMA model search, outlier detection and support for user-defined holiday variables, such as Chinese New Year or Indian Diwali. A graphical user interface can be used through the 'seasonalview' package. Uses the X-13-binaries from the 'x13binary' package.

Maintained by Christoph Sax. Last updated 16 days ago.

seasonal-adjustment time-series

2.2 match 120 stars 12.03 score 1.1k scripts 8 dependents

duckdb

duckdb:DBI Package for the DuckDB Database Management System

The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and an R Database Interface (DBI) connector.

Maintained by Kirill Müller. Last updated 3 days ago.

database duckdb olap cpp

1.9 match 158 stars 13.79 score 1.7k scripts 46 dependents

glin

reactable:Interactive Data Tables for R

Interactive data tables for R, based on the 'React Table' JavaScript library. Provides an HTML widget that can be used in 'R Markdown' or 'Quarto' documents, 'Shiny' applications, or viewed from an R console.

Maintained by Greg Lin. Last updated 2 months ago.

htmlwidgets react shiny table

1.7 match 645 stars 14.52 score 3.3k scripts 151 dependents

epiforecasts

EpiNow2:Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters

Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.

Maintained by Sebastian Funk. Last updated 25 days ago.

backcalculation covid-19 gaussian-processes open-source reproduction-number stan cpp

2.0 match 120 stars 11.88 score 210 scripts

bioc

mzR:parser for netCDF, mzXML and mzML and mzIdentML files (mass spectrometry data)

mzR provides a unified API to the common file formats and parsers available for mass spectrometry data. It comes with a subset of the proteowizard library for mzXML, mzML and mzIdentML. The netCDF reading code has previously been used in XCMS.

Maintained by Steffen Neumann. Last updated 1 month ago.

immunooncology infrastructure dataimport proteomics metabolomics massspectrometry zlib cpp

1.8 match 45 stars 12.77 score 204 scripts 44 dependents

tiledb-inc

tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays

The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-value stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.

Maintained by Isaiah Norton. Last updated 4 days ago.

array hdfs s3 storage-manager tiledb cpp

1.9 match 107 stars 11.96 score 306 scripts 4 dependents

rformassspectrometry

MsBackendTimsTof:Mass spectrometry Data Backend for Bruker TimsTOF Files

Mass spectrometry (MS) data backend supporting import and export of (ion mobility) MS data from Bruker TimsTOF files. The backend uses the opentimsr package which relies on the proprietary vendor C++ library for raw data access. The backend thus supports import of MS data from the original raw data files.

Maintained by Johannes Rainer. Last updated 1 year ago.

infrastructure massspectrometry metabolomics dataimport mass-spectrometry

5.8 match 7 stars 3.85 score

indrajeetpatil

ggstatsplot:'ggplot2' Based Plots with Statistical Details

Extension of 'ggplot2', 'ggstatsplot' creates graphics with details from statistical tests included in the plots themselves. It provides an easier syntax to generate information-rich plots for statistical analysis of continuous (violin plots, scatterplots, histograms, dot plots, dot-and-whisker plots) or categorical (pie and bar charts) data. Currently, it supports the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian versions of t-test/ANOVA, correlation analyses, contingency table analysis, meta-analysis, and regression analyses. References: Patil (2021) <doi:10.21105/joss.03236>.

Maintained by Indrajeet Patil. Last updated 20 days ago.

bayes-factors datascience dataviz effect-size ggplot-extension hypothesis-testing non-parametric-statistics regression-models statistical-analysis

1.5 match 2.1k stars 14.49 score 3.0k scripts 1 dependents

ssi-dk

diseasystore:Feature Stores for the 'diseasy' Framework

Simple feature stores and tools for creating personalised feature stores. 'diseasystore' powers feature stores which can automatically link and aggregate features to a given stratification level. These feature stores are automatically time-versioned (powered by the 'SCDB' package) and allow you to easily and dynamically compute features as part of your continuous integration.

Maintained by Rasmus Skytte Randløv. Last updated 16 days ago.

3.0 match 4 stars 6.96 score 16 scripts

bioc

h5vc:Managing alignment tallies using a hdf5 backend

This package contains functions to interact with tally data from NGS experiments that is stored in HDF5 files.

Maintained by Paul Theodor Pyl. Last updated 2 months ago.

curl bzip2 xz-utils zlib cpp

4.7 match 4.48 score 2 scripts

hadley

cubelyr:A Data Cube 'dplyr' Backend

An implementation of a data cube extracted out of 'dplyr' for backward compatibility.

Maintained by Hadley Wickham. Last updated 2 years ago.

3.1 match 38 stars 6.44 score 59 scripts 4 dependents

bioc

CompoundDb:Creating and Using (Chemical) Compound Annotation Databases

CompoundDb provides functionality to create and use (chemical) compound annotation databases from a variety of different sources such as LipidMaps, HMDB, ChEBI or MassBank. The database format also allows storing MS/MS spectra along with compound information. The package also provides a backend for Bioconductor's Spectra package and thus allows matching experimental MS/MS spectra against MS/MS spectra in the database. Databases can be stored in SQLite format and are thus portable.

Maintained by Johannes Rainer. Last updated 2 months ago.

massspectrometry metabolomics annotation databases mass-spectrometry

2.3 match 17 stars 8.40 score 69 scripts 1 dependents

nx10

unigd:Universal Graphics Device

A unified R graphics backend. Render R graphics fast and easy to many common file formats. Provides a thread safe 'C' interface for asynchronous rendering of R graphics.

Maintained by Florian Rupprecht. Last updated 1 day ago.

cairo tiff libpng zlib cpp

2.4 match 23 stars 8.07 score 6 scripts 2 dependents

bioc

MouseFM:In-silico methods for genetic finemapping in inbred mice

This package provides methods for genetic finemapping in inbred mice by taking advantage of their very high homozygosity rate (>95%).

Maintained by Matthias Munz. Last updated 5 months ago.

genetics snp genetarget variantannotation genomicvariation multiplecomparison systemsbiology mathematicalbiology patternlogic geneprediction biomedicalinformatics functionalgenomics finemap gene-candidates inbred-mice inbred-strains mouse qtl qtl-mapping

3.8 match 5.13 score 5 scripts

bioc

ProtGenerics:Generic infrastructure for Bioconductor mass spectrometry packages

S4 generic functions and classes needed by Bioconductor proteomics packages.

Maintained by Laurent Gatto. Last updated 2 months ago.

infrastructure proteomics massspectrometry bioconductor mass-spectrometry metabolomics

2.0 match 8 stars 9.43 score 4 scripts 188 dependents

ropensci

gittargets:Data Version Control for the Targets Package

In computationally demanding data analysis pipelines, the 'targets' R package (2021, <doi:10.21105/joss.02959>) maintains an up-to-date set of results while skipping tasks that do not need to rerun. This process increases speed and increases trust in the final end product. However, it also overwrites old output with new output, and past results disappear by default. To preserve historical output, the 'gittargets' package captures version-controlled snapshots of the data store, and each snapshot links to the underlying commit of the source code. That way, when the user rolls back the code to a previous branch or commit, 'gittargets' can recover the data contemporaneous with that commit so that all targets remain up to date.

Maintained by William Michael Landau. Last updated 8 months ago.

data-science data-version-control data-versioning reproducibility reproducible-research targets workflow

3.1 match 88 stars 5.99 score 11 scripts

bioc

DFplyr:A `DataFrame` (`S4Vectors`) backend for `dplyr`

Provides `dplyr` verbs (`mutate`, `select`, `filter`, etc.) supporting `S4Vectors::DataFrame` objects. Importantly, this is achieved without conversion to an intermediate `tibble`. Adds grouping infrastructure to `DataFrame` which is respected by the transformation verbs.

Maintained by Jonathan Carroll. Last updated 5 months ago.

datarepresentation infrastructure software

3.1 match 21 stars 5.87 score 5 scripts

geoffjentry

twitteR:R Based Twitter Client

Provides an interface to the Twitter web API.

Maintained by Jeff Gentry. Last updated 9 years ago.

1.8 match 254 stars 10.18 score 2.0k scripts 1 dependents

appsilon

shiny.telemetry:'Shiny' App Usage Telemetry

Enables instrumentation of 'Shiny' apps for tracking user session events such as input changes, browser type, and session duration. These events can be sent to any of the available storage backends and analyzed using the included 'Shiny' app to gain insights about app usage and adoption.

Maintained by André Veríssimo. Last updated 3 months ago.

analytics rhinoverse shiny

1.8 match 67 stars 9.69 score 29 scripts

bioc

matter:Out-of-core statistical computing and signal processing

Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.

Maintained by Kylie A. Bemis. Last updated 3 months ago.

infrastructure datarepresentation dataimport dimensionreduction preprocessing cpp

1.7 match 57 stars 9.52 score 64 scripts 2 dependents

azure

azuremlsdk:Interface to the 'Azure Machine Learning' 'SDK'

Interface to the 'Azure Machine Learning' Software Development Kit ('SDK'). Data scientists can use the 'SDK' to train, deploy, automate, and manage machine learning models on the 'Azure Machine Learning' service. To learn more about 'Azure Machine Learning' visit the website: <https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml>.

Maintained by Diondra Peck. Last updated 3 years ago.

amlcompute azure azure-machine-learning azureml dsi machine-learning rstudio sdk-r

1.8 match 106 stars 8.91 score 221 scripts

richfitz

storr:Simple Key Value Stores

Creates and manages simple key-value stores. These can use a variety of approaches for storing the data. This package implements the base methods and support for file system, in-memory and DBI-based database stores.

Maintained by Rich FitzJohn. Last updated 4 years ago.

1.5 match 117 stars 10.21 score 57 scripts 33 dependents

mlr-org

mlr3torch:Deep Learning with 'mlr3'

Deep Learning library that extends the mlr3 framework by building upon the 'torch' package. It allows users to conveniently build, train, and evaluate deep learning models without having to worry about low-level details. Custom architectures can be created using the graph language defined in 'mlr3pipelines'.

Maintained by Sebastian Fischer. Last updated 1 month ago.

data-science deep-learning machine-learning mlr3 torch

2.0 match 42 stars 7.63 score 78 scripts

assaforon

cir:Centered Isotonic Regression and Dose-Response Utilities

Isotonic regression (IR) and its improvement: centered isotonic regression (CIR). CIR is recommended in particular with small samples. Also, interval estimates for both, and additional utilities such as plotting dose-response data. For dev version and change history, see GitHub assaforon/cir.

Maintained by Assaf P. Oron. Last updated 1 month ago.

3.3 match 4.45 score 19 scripts 1 dependents

distancedevelopment

mrds:Mark-Recapture Distance Sampling

Animal abundance estimation via conventional, multiple covariate and mark-recapture distance sampling (CDS/MCDS/MRDS). Detection function fitting is performed via maximum likelihood. Also included are diagnostics and plotting for fitted detection functions. Abundance estimation is via a Horvitz-Thompson-like estimator.

Maintained by Laura Marshall. Last updated 2 months ago.

1.8 match 4 stars 8.05 score 78 scripts 7 dependents

predictiveecology

reproducible:Enhance Reproducibility of R Code

A collection of high-level, machine- and OS-independent tools for making reproducible and reusable content in R. The two workhorse functions are Cache() and prepInputs(). Cache() allows for nested caching, is robust to environments and objects with environments (like functions), and deals with some classes of file-backed R objects e.g., from terra and raster packages. Both functions have been developed to be foundational components of data retrieval and processing in continuous workflow situations. In both functions, efforts are made to make the first and subsequent calls of functions have the same result, but faster at subsequent times by way of checksums and digesting. Several features are still under development, including cloud storage of cached objects allowing for sharing between users. Several advanced options are available, see ?reproducibleOptions().

Maintained by Eliot J B McIntire. Last updated 1 month ago.

reproducibility reproducible-research

1.3 match 41 stars 10.52 score 122 scripts 15 dependents

mschubert

clustermq:Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)

Evaluate arbitrary function calls using workers on HPC schedulers in a single line of code. All processing is done on the network without accessing the file system. Remote schedulers are supported via SSH.

Maintained by Michael Schubert. Last updated 25 days ago.

cluster high-performance-computing lsf sge slurm ssh zeromq3 cpp

1.3 match 149 stars 10.23 score 253 scripts

r-simmer

simmer.mon:Monitoring Backends for 'simmer'

Provides additional monitoring backends for 'simmer', e.g., the ability to connect the simulation output to a database.

Maintained by Iñaki Ucar. Last updated 2 years ago.

cpp

5.7 match 5 stars 2.40 score 2 scripts

clewerenz

ilabelled:Simple Handling of Labelled Data

Simple handling of survey data. Smart handling of meta-information such as variable labels, value labels, and scale levels. Easy access and validation of meta-information. Usage of value labels and values, respectively, for subsetting and recoding data.

Maintained by Christof Lewerenz. Last updated 2 months ago.

2.3 match 2 stars 6.02 score 13 scripts

ambiorix-web

eburones:User Session for 'Ambiorix'

Track user sessions in 'Ambiorix' applications.

Maintained by John Coene. Last updated 3 years ago.

ambiorix session-management

6.8 match 2 stars 2.00 score

nathaneastwood

poorman:A Poor Man's Dependency Free Recreation of 'dplyr'

A replication of key functionality from 'dplyr' and the wider 'tidyverse' using only 'base'.

Maintained by Nathan Eastwood. Last updated 1 year ago.

base-r data-manipulation grammar

1.3 match 341 stars 10.79 score 156 scripts 27 dependents

bioc

VCFArray:Representing on-disk / remote VCF files as array-like objects

VCFArray extends the DelayedArray to represent VCF data entries as array-like objects with an on-disk / remote VCF file as backend. Data entries from VCF files, including info fields, FORMAT fields, and the fixed columns (REF, ALT, QUAL, FILTER), can be converted into VCFArray instances with different dimensions.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation sequencing variantannotation

3.4 match 1 stars 4.00 score 3 scripts

open-eo

openeo:Client Interface for 'openEO' Servers

Access data and processing functionalities of 'openEO' compliant back-ends in R.

Maintained by Florian Lahn. Last updated 2 months ago.

openeo openeo-user

1.5 match 64 stars 8.65 score 128 scripts

mihaiconstantin

doParabar:'foreach' Parallel Adapter for 'parabar' Backends

Provides a 'foreach' parallel adapter for 'parabar' backends. This package offers a minimal implementation of the '%dopar%' operator, enabling users to run 'foreach' loops in parallel, leveraging the parallel and progress-tracking capabilities of the 'parabar' package. Learn more about 'parabar' and 'doParabar' at <https://parabar.mihaiconstantin.com>.

Maintained by Mihai Constantin. Last updated 2 months ago.

foreach parallel-computing

3.5 match 1 stars 3.65 score 5 scripts 1 dependents

merck

simtrial:Clinical Trial Simulation

Provides some basic routines for simulating a clinical trial. The primary intent is to provide some tools to generate trial simulations for trials with time to event outcomes. Piecewise exponential failure rates and piecewise constant enrollment rates are the underlying mechanism used to simulate a broad range of scenarios such as those presented in Lin et al. (2020) <doi:10.1080/19466315.2019.1697738>. However, the basic generation of data is done using pipes to allow maximum flexibility for users to meet different needs.

Maintained by Yujie Zhao. Last updated 3 days ago.

cpp

1.3 match 21 stars 9.16 score 52 scripts

prestodb

RPresto:DBI Connector to Presto

Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: <https://prestodb.io/>.

Maintained by Jarod G.R. Meng. Last updated 1 month ago.

1.3 match 132 stars 9.73 score 25 scripts 4 dependents

enchufa2

flexiblas:'FlexiBLAS' API Interface

Provides functions to switch the 'BLAS'/'LAPACK' optimized backend and change the number of threads without leaving the R session, which needs to be linked against the 'FlexiBLAS' wrapper library <https://www.mpi-magdeburg.mpg.de/projects/flexiblas>.

Maintained by Iñaki Ucar. Last updated 1 year ago.

2.8 match 14 stars 4.35 score 16 scripts

bioc

gypsum:Interface to the gypsum REST API

Client for the gypsum REST API (https://gypsum.artifactdb.com), a cloud-based file store in the ArtifactDB ecosystem. This package provides functions for uploads, downloads, and various administrative and management tasks. Check out the documentation at https://github.com/ArtifactDB/gypsum-worker for more details.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport

1.9 match 1 stars 6.32 score 20 scripts 2 dependents

mohuangx

SAVER:Single-Cell RNA-Seq Gene Expression Recovery

An implementation of a regularized regression prediction and empirical Bayes method to recover the true gene expression profile in noisy and sparse single-cell RNA-seq data. See Huang M, et al (2018) <doi:10.1038/s41592-018-0033-z> for more details.

Maintained by Mo Huang. Last updated 4 months ago.

1.3 match 110 stars 8.88 score 231 scripts 2 dependents

ropensci

c14bazAAR:Download and Prepare C14 Dates from Different Source Databases

Query different C14 date databases and apply basic data cleaning, merging and calibration steps. Currently available databases: 14cpalaeolithic, 14sea, adrac, agrichange, aida, austarch, bda, calpal, caribbean, eubar, euroevol, irdd, jomon, katsianis, kiteeastafrica, medafricarbon, mesorad, neonet, neonetatl, nerd, p3k14c, pacea, palmisano, rado.nb, rxpand, sard, xronos.

Maintained by Clemens Schmid. Last updated 1 month ago.

archaeology radiocarbon-dates

1.9 match 31 stars 6.01 score 47 scripts

mmedl94

lionfish:Interactive 'tourr' Using 'python'

Extends the functionality of the 'tourr' package by an interactive graphical user interface. The interactivity allows users to effortlessly refine their 'tourr' results by manual intervention, which allows for integration of expert knowledge and aids the interpretation of results. For more information on 'tourr' see Wickham et al. (2011) <doi:10.18637/jss.v040.i02> or <https://github.com/ggobi/tourr>.

Maintained by Matthias Medl. Last updated 4 days ago.

data-sience data-visualization dimensionality-reduction exploratory-data-analysis interactive interactive-visualizations tourr

1.9 match 1 stars 5.96 score

jpfitzinger

tidyfit:Regularized Linear Modeling with Tidy Data

An extension to the 'R' tidy data environment for automated machine learning. The package allows fitting and cross validation of linear regression and classification algorithms on grouped data.

Maintained by Johann Pfitzinger. Last updated 2 months ago.

auto-ml classification machine-learning regression tidyverse

1.5 match 16 stars 7.22 score 26 scripts

dynverse

babelwhale:Talking to 'Docker' and 'Singularity' Containers

Provides a unified interface to interact with 'docker' and 'singularity' containers. You can execute a command inside a container, mount a volume or copy a file.

Maintained by Robrecht Cannoodt. Last updated 2 years ago.

2.0 match 24 stars 5.33 score 15 scripts 2 dependents

edjnet

tidywikidatar:Explore 'Wikidata' Through Tidy Data Frames

Query 'Wikidata' API <https://www.wikidata.org/wiki/Wikidata:Main_Page> with ease, get tidy data frames in response, and cache data in a local database.

Maintained by Giorgio Comai. Last updated 8 months ago.

wikidata

1.3 match 26 stars 7.86 score 46 scripts 2 dependents

mrc-ide

orderly2:Orderly Next Generation

Distributed reproducible computing framework, adopting ideas from git, docker and other software. By defining a lightweight interface around the inputs and outputs of an analysis, a lot of the repetitive work for reproducible research can be automated. We define a simple format for organising and describing work that facilitates collaborative reproducible research and acknowledges that all analyses are run multiple times over their lifespans.

Maintained by Rich FitzJohn. Last updated 2 months ago.

1.3 match 8 stars 8.30 score 49 scripts 2 dependents

mclements

ascii:Export R Objects to Several Markup Languages

Coerce R object to 'asciidoc', 'txt2tags', 'restructuredText', 'org', 'textile' or 'pandoc' syntax. Package comes with a set of drivers for 'Sweave'.

Maintained by Mark Clements. Last updated 1 year ago.

1.9 match 8 stars 5.31 score 161 scripts 2 dependents

r-lib

tidyselect:Select from a Set of Strings

A backend for the selecting functions of the 'tidyverse'. It makes it easy to implement select-like functions in your own packages in a way that is consistent with other 'tidyverse' interfaces for selection.

Maintained by Lionel Henry. Last updated 4 months ago.

0.5 match 130 stars 18.31 score 1.9k scripts 8.2k dependents

ruthkr

deepredeff:Deep Learning Prediction of Effectors

A tool that contains trained deep learning models for predicting effector proteins. 'deepredeff' has been trained to identify effector proteins using a set of known experimentally validated effectors from either bacteria, fungi, or oomycetes. Documentation is available via several vignettes, and the paper by Kristianingsih and MacLean (2020) <doi:10.1101/2020.07.08.193250>.

Maintained by Ruth Kristianingsih. Last updated 2 years ago.

2.0 match 4 stars 4.86 score 18 scripts

r-tensorflow

autokeras:R Interface to 'AutoKeras'

R Interface to 'AutoKeras' <https://autokeras.com/>. 'AutoKeras' is an open source software library for Automated Machine Learning (AutoML). The ultimate goal of AutoML is to provide easily accessible deep learning tools to domain experts with limited data science or machine learning background. 'AutoKeras' provides functions to automatically search for architecture and hyperparameters of deep learning models.

Maintained by Juan Cruz Rodriguez. Last updated 4 years ago.

autodl automatic-machine-learning automl deep-learning keras machine-learning tensorflow

1.8 match 73 stars 5.34 score

markvanderloo

simputation:Simple Imputation

Easy to use interfaces to a number of imputation methods that fit in the not-a-pipe operator of the 'magrittr' package.

Maintained by Mark van der Loo. Last updated 8 months ago.

data-science imputation officialstatistics

1.1 match 92 stars 8.38 score 350 scripts

bioc

spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.

Maintained by Jianhai Zhang. Last updated 4 months ago.

spatial visualization microarray sequencing geneexpression datarepresentation network clustering graphandnetwork cellbasedassays atacseq dnaseq tissuemicroarray singlecell cellbiology genetarget

1.5 match 5 stars 6.26 score 12 scripts

danforthcenter

pcvr:Plant Phenotyping and Bayesian Statistics

Analyse common types of plant phenotyping data, provide a simplified interface to longitudinal growth modeling and select Bayesian statistics, and streamline use of 'PlantCV' output. Several Bayesian methods and reporting guidelines for Bayesian methods are described in Kruschke (2018) <doi:10.1177/2515245918771304>, Kruschke (2013) <doi:10.1037/a0029146>, and Kruschke (2021) <doi:10.1038/s41562-021-01177-7>.

Maintained by Josh Sumner. Last updated 5 days ago.

1.3 match 4 stars 6.99 score 39 scripts

venelin

PCMBaseCpp:Fast Likelihood Calculation for Phylogenetic Comparative Models

Provides a C++ backend for multivariate phylogenetic comparative models implemented in the R-package 'PCMBase'. Can be used in combination with 'PCMBase' to enable fast and parallel likelihood calculation. Implements the pruning likelihood calculation algorithm described in Mitov et al. (2018) <arXiv:1809.09014>. Uses the 'SPLITT' C++ library for parallel tree traversal described in Mitov and Stadler (2018) <doi:10.1111/2041-210X.13136>.

Maintained by Venelin Mitov. Last updated 5 years ago.

openblas cpp

2.2 match 4.19 score 31 scripts

insightsengineering

autoslider.core:Slide Automation for Tables, Listings and Figures

The normal process of creating clinical study slides is that a statistician manually types in the numbers from outputs and a separate statistician double-checks the typed-in numbers. This process is time consuming, resource intensive, and error prone. Automatic slide generation is a solution to address these issues. It reduces the amount of work and the required time when creating slides, and reduces the risk of errors from manually typing or copying numbers from the output to slides. It also helps users to avoid unnecessary stress when creating large amounts of slide decks in a short time window.

Maintained by Joe Zhu. Last updated 2 months ago.

1.5 match 3 stars 5.91 score 3 scripts

tidyverse

dtplyr:Data Table Back-End for 'dplyr'

Provides a data.table backend for 'dplyr'. The goal of 'dtplyr' is to allow you to write 'dplyr' code that is automatically translated to the equivalent, but usually much faster, data.table code.

Maintained by Hadley Wickham. Last updated 2 months ago.

datatable dplyr

0.5 match 671 stars 16.27 score 2.5k scripts 147 dependents

rapporter

pander:An R 'Pandoc' Writer

Contains some functions catching all messages, 'stdout' and other useful information while evaluating R code and other helpers to return user specified text elements (like: header, paragraph, table, image, lists etc.) in 'pandoc' markdown or several types of R objects similarly automatically transformed to markdown format. Also capable of exporting/converting (the resulting) complex 'pandoc' documents to e.g. HTML, 'PDF', 'docx' or 'odt'. This latter reporting feature is supported in brew syntax or with a custom reference class with a smarty caching 'backend'.

Maintained by Gergely Daróczi. Last updated 16 days ago.

literate-programming markdown pandoc pandoc-markdown reproducible-research rmarkdown cpp

0.5 match 297 stars 16.60 score 7.6k scripts 108 dependents

ropengov

openthl:R Tools for THL Open Data API

R package for THL open data API.

Maintained by Tuomo Nieminen. Last updated 4 years ago.

2.2 match 3 stars 3.78 score

revolutionanalytics

doParallel:Foreach Parallel Adaptor for the 'parallel' Package

Provides a parallel backend for the %dopar% function using the parallel package.

Maintained by Folashade Daniel. Last updated 3 years ago.

0.6 match 5 stars 14.56 score 50k scripts 1.4k dependents

bioc

hiAnnotator:Functions for annotating GRanges objects

hiAnnotator contains a set of functions which allow users to annotate a GRanges object with a custom set of annotations. The basic philosophy of this package is to take two GRanges objects (query & subject) with a common set of seqnames (i.e. chromosomes) and return associated annotation per seqnames and rows from the query matching seqnames and rows from the subject (i.e. genes or cpg islands). The package comes with three types of annotation functions which calculate if a position from query is: within a feature, near a feature, or count features in defined window sizes. Moreover, each function is equipped with a parallel backend to utilize the foreach package. In addition, the package is equipped with wrapper functions, which find appropriate columns needed to make a GRanges object from a common data frame.

Maintained by Nirav V Malani. Last updated 5 months ago.

software annotation

1.8 match 4.65 score 15 scripts 1 dependents

bioc

SCArray.sat:Large-scale single-cell RNA-seq data analysis using GDS files and Seurat

Extends the Seurat classes and functions to support Genomic Data Structure (GDS) files as a DelayedArray backend for data representation. It relies on the implementation of GDS-based DelayedMatrix in the SCArray package to represent single cell RNA-seq data. The common optimized algorithms leveraging GDS-based and single cell-specific DelayedMatrix (SC_GDSMatrix) are implemented in the SCArray package. SCArray.sat introduces a new SCArrayAssay class (derived from the Seurat Assay), which wraps raw counts, normalized expressions and scaled data matrix based on GDS-specific DelayedMatrix. It is designed to integrate seamlessly with the Seurat package to provide common data analysis in the SeuratObject-based workflow. Compared with Seurat, SCArray.sat significantly reduces the memory usage without downsampling and can be applied to very large datasets.

Maintained by Xiuwen Zheng. Last updated 5 months ago.

datarepresentation dataimport singlecell rnaseq

2.4 match 1 stars 3.40 score 3 scripts

hypertidy

scdb:Database Backend for Common Form Data

Creates a database backend from `silicate` data.

Maintained by Michael D. Sumner. Last updated 6 years ago.

3.6 match 3 stars 2.22 score 11 scripts

julien-hec

BKTR:Bayesian Kernelized Tensor Regression

Facilitates scalable spatiotemporally varying coefficient modelling with Bayesian kernelized tensor regression. The important features of this package are: (a) Enabling local temporal and spatial modeling of the relationship between the response variable and covariates. (b) Implementing the model described by Lei et al. (2023) <doi:10.48550/arXiv.2109.00046>. (c) Using a Bayesian Markov Chain Monte Carlo (MCMC) algorithm to sample from the posterior distribution of the model parameters. (d) Employing a tensor decomposition to reduce the number of estimated parameters. (e) Accelerating tensor operations and enabling graphics processing unit (GPU) acceleration with the 'torch' package.

Maintained by Julien Lanthier. Last updated 7 months ago.

1.8 match 2 stars 4.53 score 17 scripts

tony2015116

mintyr:Streamlined Data Processing Tools for Genomic Selection

A toolkit for genomic selection in animal breeding with emphasis on multi-breed and multi-trait nested grouping operations. Streamlines iterative analysis workflows when working with the 'ASReml-R' package. Includes utility functions for phenotypic data processing commonly used by animal breeders.

Maintained by Guo Meng. Last updated 3 months ago.

asreml-r genomic-selection

1.7 match 4.48 score 2 scripts

bioc

SCArray:Large-scale single-cell omics data manipulation with GDS files

Provides large-scale single-cell omics data manipulation using Genomic Data Structure (GDS) files. It combines dense and sparse matrices stored in GDS files and the Bioconductor infrastructure framework (SingleCellExperiment and DelayedArray) to provide out-of-memory data storage and large-scale manipulation using the R programming language.

Maintained by Xiuwen Zheng. Last updated 4 months ago.

infrastructure datarepresentation dataimport singlecell rnaseq cpp

1.9 match 1 stars 4.02 score 9 scripts 1 dependents

easystats

datawizard:Easy Data Wrangling and Statistical Transformations

A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.

Maintained by Etienne Bacher. Last updated 10 days ago.

data dplyr hacktoberfest janitor manipulation reshape tidyr wrangling

0.5 match 222 stars 14.71 score 436 scripts 119 dependents

onnx

onnx:R Interface to 'ONNX'

R Interface to 'ONNX' - Open Neural Network Exchange <https://onnx.ai/>. 'ONNX' provides an open source format for machine learning models. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types.

Maintained by Yuan Tang. Last updated 2 years ago.

deep-learning deep-neural-networks onnx

1.3 match 44 stars 5.90 score 18 scripts

psolymos

pbapply:Adding Progress Bar to '*apply' Functions

A lightweight package that adds a progress bar to vectorized R functions ('*apply'). The implementation can easily be added to functions where showing the progress is useful (e.g. bootstrap). The type and style of the progress bar (with percentages or remaining time) can be set through options. Supports several parallel processing backends including future.

Maintained by Peter Solymos. Last updated 6 months ago.

progress-bar

0.5 match 157 stars 14.22 score 3.7k scripts 647 dependents

davzim

dataverifyr:A Lightweight, Flexible, and Fast Data Validation Package that Can Handle All Sizes of Data

Allows you to define rules which can be used to verify a given dataset. The package acts as a thin wrapper around more powerful data packages such as 'dplyr', 'data.table', 'arrow', and 'DBI' ('SQL'), which do the heavy lifting.

Maintained by David Zimmermann-Kollenda. Last updated 1 year ago.

verification

1.8 match 27 stars 4.13 score 7 scripts

bioc

HDF5Array:HDF5 datasets as array-like objects in R

The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.

Maintained by Hervé Pagès. Last updated 26 days ago.

infrastructure datarepresentation dataimport sequencing rnaseq coverage annotation genomeannotation singlecell immunooncology bioconductor-package core-package u24ca289073

0.5 match 12 stars 13.19 score 844 scripts 123 dependents

ggrothendieck

sqldf:Manipulate R Data Frames Using SQL

The sqldf() function is typically passed a single argument which is an SQL select statement where the table names are ordinary R data frame names. sqldf() transparently sets up a database, imports the data frames into that database, performs the SQL select or other statement and returns the result using a heuristic to determine which class to assign to each column of the returned data frame. The sqldf() or read.csv.sql() functions can also be used to read filtered files into R even if the original files are larger than R itself can handle. 'RSQLite', 'RH2', 'RMySQL' and 'RPostgreSQL' backends are supported.

Maintained by G. Grothendieck. Last updated 3 years ago.

0.5 match 250 stars 13.04 score 8.1k scripts 52 dependents

skeydan

torchaudio:R Interface to 'pytorch''s 'torchaudio'

Provides access to datasets, models and processing facilities for deep learning in audio.

Maintained by Sigrid Keydana. Last updated 2 years ago.

1.9 match 3.46 score 58 scripts

trevorld

xmpdf:Edit 'XMP' Metadata and 'PDF' Bookmarks and Documentation Info

Edit 'XMP' metadata <https://en.wikipedia.org/wiki/Extensible_Metadata_Platform> in a variety of media file formats as well as edit bookmarks (aka outline aka table of contents) and documentation info entries in 'pdf' files. Can detect and use a variety of command-line tools to perform these operations such as 'exiftool' <https://exiftool.org/>, 'ghostscript' <https://www.ghostscript.com/>, and/or 'pdftk' <https://gitlab.com/pdftk-java/pdftk>.

Maintained by Trevor L Davis. Last updated 12 months ago.

1.3 match 5 stars 5.18 score 1 scripts 1 dependents

vankesteren

tensorsem:Estimate structural equation models using computation graphs

Use lavaan code to create structural equation models, use torch to estimate them. This package provides the interface between lavaan and torch.

Maintained by Erik-Jan van Kesteren. Last updated 2 years ago.

computation-graph lavaan sem torch

1.8 match 52 stars 3.41 score 8 scripts

zajichek

cheese:Tools for Working with Data During Statistical Analysis

Contains tools for working with data during statistical analysis, promoting flexible, intuitive, and reproducible workflows. There are functions designated for specific statistical tasks such as building a custom univariate descriptive table, computing pairwise association statistics, etc. These are built on a collection of data manipulation tools designed for general use that are motivated by the functional programming concept.

Maintained by Alex Zajichek. Last updated 2 years ago.

data-manipulation statistical-analysis

1.5 match 4.00 score 2 scripts

indrajeetpatil

statsExpressions:Tidy Dataframes and Expressions with Statistical Details

Utilities for producing dataframes with rich details for the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian t-test, one-way ANOVA, correlation analyses, contingency table analyses, and meta-analyses. The functions are pipe-friendly and provide a consistent syntax to work with tidy data. These dataframes additionally contain expressions with statistical details, and can be used in graphing packages. This package also forms the statistical processing backend for 'ggstatsplot'. References: Patil (2021) <doi:10.21105/joss.03236>.

Maintained by Indrajeet Patil. Last updated 20 days ago.

bayesian-inference bayesian-statistics contingency-table correlation effectsize meta-analysis parametric robust robust-statistics statistical-details statistical-tests tidy

0.5 match 312 stars 10.97 score 146 scripts 2 dependents

jkurle

robust2sls:Outlier Robust Two-Stage Least Squares Inference and Testing

An implementation of easy tools for outlier robust inference in two-stage least squares (2SLS) models. The user specifies a reference distribution against which observations are classified as outliers or not. After removing the outliers, adjusted standard errors are automatically provided. Furthermore, several statistical tests for the false outlier detection rate can be calculated. The outlier removing algorithm can be iterated a fixed number of times or until the procedure converges. The algorithms and robust inference are described in more detail in Jiao (2019) <https://drive.google.com/file/d/1qPxDJnLlzLqdk94X9wwVASptf1MPpI2w/view>.

Maintained by Jonas Kurle. Last updated 2 years ago.

1.3 match 1 stars 4.43 score 18 scripts

boehringer-ingelheim

flowml:A Backend for a 'nextflow' Pipeline that Performs Machine-Learning-Based Modeling of Biomedical Data

Provides functionality to perform machine-learning-based modeling in a computation pipeline. Its functions contain the basic steps of machine-learning-based knowledge discovery workflows, including model training and optimization, model evaluation, and model testing. To perform these tasks, the package builds heavily on existing machine-learning packages, such as 'caret' <https://github.com/topepo/caret/> and associated packages. The package can train multiple models, optimize model hyperparameters by performing a grid search or a random search, and evaluate model performance by different metrics. Models can be validated either on a test data set, or in the case of a small sample size by k-fold cross validation or repeated bootstrapping. It also allows for 0-Hypotheses generation by performing permutation experiments. Additionally, it offers methods of model interpretation and item categorization to identify the most informative features from a high dimensional data space. The functions of this package can easily be integrated into computation pipelines (e.g. 'nextflow' <https://www.nextflow.io/>) and thereby improve scalability, standardization, and reproducibility in the context of machine-learning.

Maintained by Sebastian Malkusch. Last updated 1 year ago.

2.8 match 1 stars 2.00 score

hope-data-science

tidyfst:Tidy Verbs for Fast Data Manipulation

A toolkit of tidy data manipulation verbs with 'data.table' as the backend. Combining the merits of syntax elegance from 'dplyr' and computing performance from 'data.table', 'tidyfst' intends to provide users with state-of-the-art data manipulation tools with the least pain. This package is an extension of 'data.table'. While enjoying a tidy syntax, it also wraps combinations of efficient functions to facilitate frequently-used data operations.

Maintained by Tian-Yuan Huang. Last updated 6 months ago.

0.5 match 98 stars 10.09 score 118 scripts 4 dependents

bioc

bluster:Clustering Algorithms for Bioconductor

Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology software geneexpression transcriptomics singlecell clustering cpp

0.5 match 9.43 score 636 scripts 51 dependents

leef-uzh

LEEF:Data Package Containing Only Data and Data Information

Setup package for the LEEF pipeline which loads / installs all necessary packages and functions to run the pipeline.

Maintained by Rainer M. Krug. Last updated 3 years ago.

data-analysis data-processing leef

1.5 match 2.95 score

beccadaniel

doSNOW:Foreach Parallel Adaptor for the 'snow' Package

Provides a parallel backend for the %dopar% function using the snow package of Tierney, Rossini, Li, and Sevcikova.

Maintained by Folashade Daniel. Last updated 3 years ago.

0.5 match 1 stars 7.88 score 2.6k scripts 98 dependents

uofuepibio

epiworldR:Fast Agent-Based Epi Models

A flexible framework for Agent-Based Models (ABM), the 'epiworldR' package provides methods for prototyping disease outbreaks and transmission models using a 'C++' backend, making it very fast. It supports multiple epidemiological models, including the Susceptible-Infected-Susceptible (SIS), Susceptible-Infected-Removed (SIR), Susceptible-Exposed-Infected-Removed (SEIR), and others, involving arbitrary mitigation policies and multiple-disease models. Users can specify infectiousness/susceptibility rates as a function of agents' features, providing great complexity for the model dynamics. Furthermore, 'epiworldR' is ideal for simulation studies featuring large populations.

Maintained by Andrew Pulsipher. Last updated 11 days ago.

abm agent-based-modeling covid-19 epidemics epidemiology r-programming rpack rpkg seir seir-model simulation sir sir-model cpp openmp

0.5 match 9 stars 8.33 score 58 scripts 1 dependents

beccadaniel

doMC:Foreach Parallel Adaptor for 'parallel'

Provides a parallel backend for the %dopar% function using the multicore functionality of the parallel package.

Maintained by Folashade Daniel. Last updated 3 years ago.

0.6 match 7.39 score 10k scripts 2 dependents

ropensci

emld:Ecological Metadata as Linked Data

This is a utility for transforming Ecological Metadata Language ('EML') files into 'JSON-LD' and back into 'EML.' Doing so creates a list-based representation of 'EML' in R, so that 'EML' data can easily be manipulated using standard 'R' tools. This makes this package an effective backend for other 'R'-based tools working with 'EML.' By abstracting away the complexity of 'XML' Schema, developers can build around native 'R' list objects and not have to worry about satisfying many of the additional constraints set by the schema (such as element ordering, which is handled automatically). Additionally, the 'JSON-LD' representation enables the use of developer-friendly 'JSON' parsing and serialization that may facilitate the use of 'EML' in contexts outside of 'R,' as well as the informatics-friendly serializations such as 'RDF' and 'SPARQL' queries.

Maintained by Carl Boettiger. Last updated 4 years ago.

0.5 match 13 stars 7.63 score 69 scripts 8 dependents

wojcieko

bhmbasket:Bayesian Hierarchical Models for Basket Trials

Provides functions for the evaluation of basket trial designs with binary endpoints. Operating characteristics of a basket trial design are assessed by simulating trial data according to scenarios, analyzing the data with Bayesian hierarchical models (BHMs), and assessing decision probabilities on stratum and trial-level based on Go / No-go decision making. The package is built for high flexibility regarding decision rules, number of interim analyses, number of strata, and recruitment. The BHMs proposed by Berry et al. (2013) <doi:10.1177/1740774513497539> and Neuenschwander et al. (2016) <doi:10.1002/pst.1730>, as well as a model that combines both approaches are implemented. Functions are provided to implement Bayesian decision rules as for example proposed by Fisch et al. (2015) <doi:10.1177/2168479014533970>. In addition, posterior point estimates (mean/median) and credible intervals for response rates and some model parameters can be calculated. For simulated trial data, bias and mean squared errors of posterior point estimates for response rates can be provided.

Maintained by Stephan Wojciekowski. Last updated 3 years ago.

jags cpp

1.3 match 1 stars 2.78 score 5 scripts 1 dependents

cran

halk:Methods to Create Hierarchical Age Length Keys for Age Assignment

Provides methods for implementing hierarchical age length keys to estimate fish ages from lengths using data borrowing. Users can create hierarchical age length keys and use them to assign ages given length.

Maintained by Paul Frater. Last updated 1 year ago.

1.8 match 2.00 score 4 scripts

vimc

vaultr:Vault Client for Secrets and Sensitive Data

Provides an interface to a 'HashiCorp' vault server over its http API (typically these are self-hosted; see <https://www.vaultproject.io>). This allows for secure storage and retrieval of secrets over a network, such as tokens, passwords and certificates. Authentication with vault is supported through several backends including user name/password and authentication via 'GitHub'.

Maintained by Rich FitzJohn. Last updated 1 year ago.

0.5 match 24 stars 6.78 score 2 scripts 1 dependents

statismike

shiny.reglog:Optional Login and Registration Module System for ShinyApps

The RegLog system provides a set of shiny modules to handle the registration procedure for your users, alongside login, credential editing, and password reset functionality. It provides support for popular SQL databases and optionally a googlesheet-based database for easy setup. For email sending it provides support for 'emayili' and 'gmailr' backends. The architecture makes customizing usability pretty straightforward. The authentication system created with shiny.reglog is designed to be optional: users don't need to be logged in to access your application, but when logged in the user data can be used to read from and write to relational databases.

Maintained by Michal Kosinski. Last updated 3 years ago.

googlesheet register-ui shiny-applications sqlite

0.5 match 14 stars 6.45 score 20 scripts

cran4linux

bspm:Bridge to System Package Manager

Enables binary package installations on Linux distributions. Provides functions to manage packages via the distribution's package manager. Also provides transparent integration with R's install.packages() and a fallback mechanism. When installed as a system package, interacts with the system's package manager without requiring administrative privileges via an integrated D-Bus service; otherwise, uses sudo. Currently, the following backends are supported: DNF, APT, ALPM.

Maintained by Iñaki Ucar. Last updated 5 months ago.

automation linux packages

0.5 match 82 stars 6.19 score 2 scripts

ropensci

chopin:Computation of Spatial Data by Hierarchical and Objective Partitioning of Inputs for Parallel Processing

Geospatial data computation is parallelized by grid, hierarchy, or raster files. Based on future and mirai parallel backends, terra and sf functions as well as convenience functions in the package can be distributed over multiple threads. The simplest way of parallelizing generic geospatial computation is to start from `par_pad_*` functions to `par_grid`, `par_hierarchy`, or `par_multirasters` functions. Virtually any functions accepting classes in terra or sf packages can be used in the three parallelization functions. A common raster-vector overlay operation is provided as a function `extract_at`, which uses exactextractr, with options for kernel weights for summarizing raster values at vector geometries. Other convenience functions for vector-vector operations including simple areal interpolation (`summarize_aw`) and summation of exponentially decaying weights (`summarize_sedc`) are also provided.

Maintained by Insang Song. Last updated 15 days ago.

0.5 match 16 stars 6.11 score 23 scripts

rapporter

rapport:A Report Templating System

Facilitating the creation of reproducible statistical report templates. Once created, rapport templates can be exported to various external formats (HTML, LaTeX, PDF, ODT etc.) with pandoc as the converter backend.

Maintained by Gergely Daróczi. Last updated 4 years ago.

0.5 match 49 stars 5.81 score 66 scripts

cjerzak

fastrerandomize:Hardware-Accelerated Rerandomization for Improved Balance

Provides hardware-accelerated tools for performing rerandomization and randomization testing in experimental research. Using a 'JAX' backend, the package enables exact rerandomization inference even for large experiments with hundreds of billions of possible randomizations. Key functionalities include generating pools of acceptable rerandomizations based on covariate balance, conducting exact randomization tests, and performing pre-analysis evaluations to determine optimal rerandomization acceptance thresholds. The package supports various hardware acceleration frameworks including 'CPU', 'CUDA', and 'METAL', making it versatile across accelerated computing environments. This allows researchers to efficiently implement stringent rerandomization designs and conduct valid inference even with large sample sizes. The package is partly based on Jerzak and Goldstein (2023) <doi:10.48550/arXiv.2310.00861>.

Maintained by Connor Jerzak. Last updated 1 month ago.

balance experimental-design hardware-acceleration

0.5 match 8 stars 5.64 score 1 scripts

alexgenin

chouca:A Stochastic Cellular Automaton Engine

An engine for stochastic cellular automata. It provides a high-level interface to declare a model, which can then be simulated by various backends (Genin et al. (2023) <doi:10.1101/2023.11.08.566206>).

Maintained by Alexandre Genin. Last updated 7 months ago.

cpp

0.5 match 10 stars 5.48 score 3 scripts

steveweston

doMPI:Foreach Parallel Adaptor for the Rmpi Package

Provides a parallel backend for the %dopar% function using the Rmpi package.

Maintained by Steve Weston. Last updated 8 years ago.

0.6 match 3 stars 4.96 score 354 scripts 1 dependents

crunch-io

crplyr:A 'dplyr' Interface for Crunch

In order to facilitate analysis of datasets hosted on the Crunch data platform <https://crunch.io/>, the 'crplyr' package implements 'dplyr' methods on top of the Crunch backend. The usual methods 'select', 'filter', 'group_by', 'summarize', and 'collect' are implemented in such a way as to perform as much computation on the server and pull as little data locally as possible.

Maintained by Greg Freedman Ellis. Last updated 2 years ago.

0.5 match 6 stars 5.41 score 17 scripts

nikolaus77

rocker:Database Interface Class

'R6' class interface for handling relational database connections using 'DBI' package as backend. The class allows handling of connections to e.g. PostgreSQL, MariaDB and SQLite. The purpose is having an intuitive object allowing straightforward handling of SQL databases.

Maintained by Nikolaus Pawlowski. Last updated 3 years ago.

database dbi mariadb mysql postgres postgresql r6 sql sqlite

0.5 match 5 stars 5.24 score 7 scripts

kylebaron

mrgsim.parallel:Simulate with 'mrgsolve' in Parallel

Simulation from an 'mrgsolve' <https://cran.r-project.org/package=mrgsolve> model using a parallel backend. Input data sets are split (chunked) and simulated in parallel using mclapply() or future_lapply() <https://cran.r-project.org/package=future.apply>.

Maintained by Kyle Baron. Last updated 3 months ago.

future mrgsolve parallelization

0.5 match 5 stars 5.11 score 17 scripts

glotaran

TIMP:Fitting Separable Nonlinear Models in Spectroscopy and Microscopy

A problem solving environment (PSE) for fitting separable nonlinear models to measurements arising in physics and chemistry experiments, as described by Mullen & van Stokkum (2007) <doi:10.18637/jss.v018.i03> for its use in fitting time resolved spectroscopy data, and as described by Laptenok et al. (2007) <doi:10.18637/jss.v018.i08> for its use in fitting Fluorescence Lifetime Imaging Microscopy (FLIM) data, in the study of Förster Resonance Energy Transfer (FRET). `TIMP` also serves as the computation backend for the `GloTarAn` software, a graphical user interface for the package, as described in Snellenburg et al. (2012) <doi:10.18637/jss.v049.i03>.

Maintained by Joris Snellenburg. Last updated 2 years ago.

parameter-estimation

0.5 match 3 stars 4.80 score 14 scripts 1 dependents

dyfanjones

heck:Highly Performant String Case Converter

Provides case conversion between common cases like CamelCase and snake_case. Uses the 'rust crate heck' <https://github.com/withoutboats/heck> as the backend for highly performant case conversion in 'R'.

Maintained by Dyfan Jones. Last updated 6 months ago.

rust cargo

0.5 match 7 stars 4.54 score 3 scripts

patzaw

ClickHouseHTTP:A Simple HTTP Database Interface to 'ClickHouse'

'ClickHouse' (<https://clickhouse.com/>) is an open-source, high performance columnar OLAP (online analytical processing of queries) database management system for real-time analytics using SQL. This 'DBI' backend relies on the 'ClickHouse' HTTP interface and supports the HTTPS protocol.

Maintained by Patrice Godard. Last updated 6 months ago.

0.5 match 16 stars 4.46 score 12 scripts 1 dependents

r-dbi

dblog:Logging for DBI

Provides logging of DBI methods for arbitrary backends.

Maintained by Kirill Müller. Last updated 3 months ago.

database dbi logging

0.6 match 9 stars 3.83 score 7 scripts

thinkr-open

bank:Extra Cache

More cache backends.

Maintained by Colin Fay. Last updated 2 years ago.

cache golemverse shiny

0.8 match 12 stars 2.78 score 6 scripts

dmolitor

bolasso:Model Consistent Lasso Estimation Through the Bootstrap

Implements the bolasso algorithm for consistent variable selection and estimation accuracy. Includes support for many parallel backends via the future package. For details see: Bach (2008), 'Bolasso: model consistent Lasso estimation through the bootstrap', <doi:10.48550/arXiv.0804.1302>.

Maintained by Daniel Molitor. Last updated 3 months ago.

bolasso bootstrap lasso variable-selection

0.5 match 4 stars 4.00 score 7 scripts

rudolfjagdhuber

ExhaustiveSearch:A Fast and Scalable Exhaustive Feature Selection Framework

The goal of this package is to provide an easy to use, fast and scalable exhaustive search framework. Exhaustive feature selections typically require a very large number of models to be fitted and evaluated. Execution speed and memory management are crucial factors here. This package provides solutions for both. Execution speed is optimized by using a multi-threaded C++ backend, and memory issues are solved by only storing the best results during execution and thus keeping memory usage constant.

Maintained by Rudolf Jagdhuber. Last updated 4 years ago.

aic exhaustive-search feature-selection linear-regression logistic-regression machine-learning model-selection mse openblas cpp

0.5 match 4 stars 3.60 score 5 scripts

mrc-ide

queuer:Queue Tasks

Queue tasks to a number of backends.

Maintained by Rich FitzJohn. Last updated 2 years ago.

cluster infrastructure

0.6 match 4 stars 2.78 score 4 scripts

diegoefe

shinydbauth:Simple Authentification for 'shiny' Applications

Provides a simple authentication mechanism for single 'shiny' applications. Authentication and password change functionality are performed by calling user-provided functions that typically access some database backend. Source code of main applications is protected until authentication is successful.

Maintained by Diego Florio. Last updated 8 months ago.

0.5 match 3 stars 3.18 score 1 scripts

jszitas

categoryEncodings:Category Variable Encodings

Simple, fast, and automatic encodings for category data using a data.table backend. Most of the methods are an implementation of Johannemann, Hadad, Athey, Wager (2019) <arXiv:1908.09874>, particularly their 'means', "sPCA", "low-rank" and "multinomial logit".

Maintained by Juraj Szitas. Last updated 3 years ago.

categorical-variables feature-encoding feature-engineering

0.5 match 3 stars 3.18 score 2 scripts

leandroroser

chunkR:Read Tables in Chunks

Read tables chunk by chunk using a C++ backend and a simple R interface.

Maintained by Leandro Roser. Last updated 7 years ago.

cpp

0.6 match 2.70 score 3 scripts

henrikbengtsson

doFuture.tests.extra:Extra Test Sets for the 'doFuture' Package

Runs examples of packages that use 'foreach' and '%dopar%' for parallelization, where 'doFuture' is used as the 'foreach' adapter making it possible to use any future backend for parallelization. The package tests use these tools to test 'doFuture' with 'foreach'-based examples from packages 'BiocParallel', 'caret', 'doParallel', 'glmnet', 'NMF', 'plyr', and 'TSP'. These tests are run with many known future backends.

Maintained by Henrik Bengtsson. Last updated 1 year ago.

futureverse parallel testsuite

0.8 match 1.70 score

theussl

DSL:Distributed Storage and List

An abstract DList class helps store large list-type objects in a distributed manner. Corresponding high-level functions and methods for handling distributed storage (DStorage) and lists allow such DLists to be processed efficiently on distributed systems. In doing so it uses a well-defined storage backend implemented based on the DStorage class.

Maintained by Stefan Theussl. Last updated 5 years ago.

0.5 match 2.48 score 9 scripts 1 dependents

pigian

snap:Simple Neural Application

A simple wrapper to easily design vanilla deep neural networks using a 'Tensorflow'/'Keras' backend for regression, classification and multi-label tasks, with some tweaks and tricks (skip shortcuts, embedding, feature selection and anomaly detection).

Maintained by Giancarlo Vercellino. Last updated 4 years ago.

0.5 match 2.00 score

dyfanjones

inih:Fast INI File Parser for R

Provides a simple 'INI' file parser that returns a structured list. Uses the 'C' library 'inih' <https://github.com/benhoyt/inih> as the backend for fast performance.

Maintained by Dyfan Jones. Last updated 3 months ago.

cpp ini

0.5 match 1 stars 1.70 score

tskendal

vowels:Vowel Manipulation, Normalization, and Plotting

Procedures for the manipulation, normalization, and plotting of phonetic and sociophonetic vowel formant data. vowels is the backend for the NORM website.

Maintained by Tyler Kendall. Last updated 7 years ago.

0.5 match 1.48 score 30 scripts