R-universe search: needs:rprojroot

rstudio

reticulate:Interface to 'Python'

Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.

Maintained by Tomasz Kalinowski. Last updated 6 days ago.

cpp

1.7k stars 21.02 score 18k scripts 434 dependents

r-lib

testthat:Unit Testing for R

Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.

Maintained by Hadley Wickham. Last updated 1 month ago.

unit-testing cpp

900 stars 20.99 score 74k scripts 471 dependents

r-lib

here:A Simpler Way to Find Your Files

Constructs paths to your project's files. Declare the relative path of a file within your project with 'i_am()'. Use the 'here()' function as a drop-in replacement for 'file.path()'; it will always locate the files relative to your project root.

Maintained by Kirill Müller. Last updated 15 days ago.

project

417 stars 19.62 score 96k scripts 607 dependents

r-lib

devtools:Tools to Make Developing R Packages Easier

Collection of package development tools.

Maintained by Jennifer Bryan. Last updated 6 months ago.

package-creation

2.4k stars 19.55 score 51k scripts 150 dependents

r-lib

roxygen2:In-Line Documentation for R

Generate your Rd documentation, 'NAMESPACE' file, and collation field using specially formatted comments. Writing documentation in-line with code makes it easier to keep your documentation up-to-date as your requirements change. 'roxygen2' is inspired by the 'Doxygen' system for C++.

Maintained by Hadley Wickham. Last updated 8 months ago.

devtools documentation cpp

606 stars 18.51 score 2.3k scripts 219 dependents

r-lib

usethis:Automate Package and Project Setup

Automate package and project setup tasks that are otherwise performed manually. This includes setting up unit testing, test coverage, continuous integration, Git, 'GitHub', licenses, 'Rcpp', 'RStudio' projects, and more.

Maintained by Jennifer Bryan. Last updated 26 days ago.

github setup

869 stars 17.54 score 5.6k scripts 336 dependents

satijalab

Seurat:Tools for Single Cell Genomics

A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.

Maintained by Paul Hoffman. Last updated 1 year ago.

human-cell-atlas single-cell-genomics single-cell-rna-seq cpp

2.4k stars 16.86 score 50k scripts 73 dependents

r-lib

styler:Non-Invasive Pretty Printing of R Code

Pretty-prints R code without changing the user's formatting intent.

Maintained by Lorenz Walthert. Last updated 2 months ago.

pretty-print

754 stars 16.15 score 940 scripts 62 dependents

rstudio

tensorflow:R Interface to 'TensorFlow'

Interface to 'TensorFlow' <https://www.tensorflow.org/>, an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more 'CPUs' or 'GPUs' in a desktop, server, or mobile device with a single 'API'. 'TensorFlow' was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

Maintained by Tomasz Kalinowski. Last updated 5 days ago.

1.3k stars 15.47 score 3.2k scripts 75 dependents

philchalmers

mirt:Multidimensional Item Response Theory

Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.

Maintained by Phil Chalmers. Last updated 4 days ago.

irt mirt openblas cpp openmp

212 stars 14.93 score 2.5k scripts 40 dependents

rstudio

learnr:Interactive Tutorials for R

Create interactive tutorials using R Markdown. Use a combination of narrative, figures, videos, exercises, and quizzes to create self-paced tutorials for learning about R and R packages.

Maintained by Garrick Aden-Buie. Last updated 7 months ago.

interactive python rmarkdown shiny sql teaching tutorial

713 stars 14.79 score 6.5k scripts 27 dependents

thinkr-open

golem:A Framework for Robust Shiny Applications

An opinionated framework for building a production-ready 'Shiny' application. This package contains a series of tools for building a robust 'Shiny' application from start to finish.

Maintained by Colin Fay. Last updated 7 months ago.

golemverse hacktoberfest shiny shiny-apps shiny-r shinyapps

921 stars 14.21 score 167 scripts 63 dependents

r-lib

pkgload:Simulate Package Installation and Attach

Simulates the process of installing a package and then attaching it. This is a key part of the 'devtools' package as it allows you to rapidly iterate while developing a package.

Maintained by Lionel Henry. Last updated 6 days ago.

59 stars 14.21 score 112 scripts 590 dependents

r-spatial

rgee:R Bindings for Calling the 'Earth Engine' API

Earth Engine <https://earthengine.google.com/> client library for R. All of the 'Earth Engine' API classes, modules, and functions are made available. Additional functions implemented include importing (exporting) of Earth Engine spatial objects, extraction of time series, interactive map display, assets management interface, and metadata display. See <https://r-spatial.github.io/rgee/> for further details.

Maintained by Cesar Aybar. Last updated 4 days ago.

earth-engine earthengine google-earth-engine googleearthengine spatial-analysis spatial-data

717 stars 13.77 score 1.9k scripts 3 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 11 days ago.

845 stars 13.63 score 264 scripts 2 dependents

philchalmers

SimDesign:Structure for Organizing Monte Carlo Simulation Designs

Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues: it automatically re-simulates non-convergent results, prevents inadvertent overwriting of simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.

Maintained by Phil Chalmers. Last updated 3 days ago.

monte-carlo-simulation simulation simulation-framework

62 stars 13.41 score 253 scripts 47 dependents

oscarkjell

text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning

Link R with Transformers from Hugging Face to transform text variables to word embeddings. The word embeddings can then be used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions. For more information see <https://www.r-text.org>.

Maintained by Oscar Kjell. Last updated 9 days ago.

deep-learning machine-learning nlp transformers openjdk

145 stars 13.21 score 436 scripts 1 dependent

juba

questionr:Functions to Make Surveys Processing Easier

Set of functions to make the processing and analysis of surveys easier: interactive shiny apps and addins for data recoding, contingency tables, dataset metadata handling, and several convenience functions.

Maintained by Julien Barnier. Last updated 10 days ago.

83 stars 12.93 score 1.1k scripts 19 dependents

tkonopka

umap:Uniform Manifold Approximation and Projection

Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).

Maintained by Tomasz Konopka. Last updated 11 months ago.

dimensionality-reduction umap cpp

132 stars 12.82 score 3.6k scripts 45 dependents

insightsengineering

teal:Exploratory Web Apps for Analyzing Clinical Trials Data

A 'shiny' based interactive exploration framework for analyzing clinical trials data. 'teal' currently provides a dynamic filtering facility and different data viewers. 'teal' 'shiny' applications are built using standard 'shiny' modules.

Maintained by Dawid Kaledkowski. Last updated 1 month ago.

clinical-trials nest shiny webapp

206 stars 12.65 score 176 scripts 5 dependents

greta-dev

greta:Simple and Scalable Statistical Modelling in R

Write statistical models in R and fit them by MCMC and optimisation on CPUs and GPUs, using Google 'TensorFlow'. greta lets you write your own model as in BUGS, JAGS and Stan, except that you write models right in R; it scales well to massive datasets, and it’s easy to extend and build on. See the website for more information, including tutorials, examples, package documentation, and the greta forum.

Maintained by Nicholas Tierney. Last updated 20 days ago.

566 stars 12.53 score 396 scripts 6 dependents

r-lib

rcmdcheck:Run 'R CMD check' from 'R' and Capture Results

Run 'R CMD check' from 'R' and capture the results of the individual checks. Supports running checks in the background, timeouts, pretty printing and comparing check results.

Maintained by Gábor Csárdi. Last updated 6 months ago.

116 stars 12.34 score 102 scripts 158 dependents

openpharma

mmrm:Mixed Models for Repeated Measures

Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.

Maintained by Daniel Sabanes Bove. Last updated 24 days ago.

cpp

138 stars 12.15 score 113 scripts 4 dependents

rstudio

shinytest2:Testing for Shiny Applications

Automated unit testing of Shiny applications through a headless 'Chromium' browser.

Maintained by Barret Schloerke. Last updated 5 days ago.

cpp

108 stars 12.13 score 704 scripts 1 dependent

rstudio

tfruns:Training Run Tools for 'TensorFlow'

Create and manage unique directories for each 'TensorFlow' training run. Provides a unique, time stamped directory for each run along with functions to retrieve the directory of the latest run or latest several runs.

Maintained by Tomasz Kalinowski. Last updated 12 months ago.

34 stars 11.80 score 325 scripts 77 dependents

kevinushey

sourcetools:Tools for Reading, Tokenizing and Parsing R Code

Tools for Reading, Tokenizing and Parsing R Code.

Maintained by Kevin Ushey. Last updated 2 years ago.

cpp

78 stars 11.77 score 32 scripts 1.8k dependents

rstudio

sortable:Drag-and-Drop in 'shiny' Apps with 'SortableJS'

Enables drag-and-drop behaviour in Shiny apps, by exposing the functionality of the 'SortableJS' <https://sortablejs.github.io/Sortable/> JavaScript library as an 'htmlwidget'. You can use this in Shiny apps and widgets, 'learnr' tutorials as well as R Markdown. In addition, provides a custom 'learnr' question type - 'question_rank()' - that allows ranking questions with drag-and-drop.

Maintained by Andrie de Vries. Last updated 7 months ago.

htmlwidget

135 stars 11.62 score 368 scripts 13 dependents

r-lib

mockery:Mocking Library for R

The two main functionalities of this package are creating mock objects (functions) and selectively intercepting calls to a given function that originate in some other function. It can be used with any testing framework available for R. Mock objects can be injected with either this package's own stub() function or a similar with_mock() facility present in the 'testthat' package.

Maintained by Hadley Wickham. Last updated 1 year ago.

100 stars 11.57 score 504 scripts 5 dependents

tylermorganwall

rayshader:Create Maps and Visualize Data in 2D and 3D

Uses a combination of raytracing and multiple hill shading methods to produce 2D and 3D data visualizations and maps. Includes water detection and layering functions, programmable color palette generation, several built-in textures for hill shading, 2D and 3D plotting options, a built-in path tracer, 'Wavefront' OBJ file export, and the ability to save 3D visualizations to a 3D printable format.

Maintained by Tyler Morgan-Wall. Last updated 2 months ago.

cpp

2.1k stars 11.55 score 1.5k scripts 5 dependents

workflowr

workflowr:A Framework for Reproducible and Collaborative Data Science

Provides a workflow for your analysis projects by combining literate programming ('knitr' and 'rmarkdown') and version control ('Git', via 'git2r') to generate a website containing time-stamped, versioned, and documented results.

Maintained by John Blischak. Last updated 4 months ago.

git project-management rmarkdown website workflow

848 stars 11.53 score 566 scripts

r-hub

rhub:Tools for R Package Developers

R-hub v2 uses GitHub Actions to run 'R CMD check' and similar package checks. The 'rhub' package helps you set up R-hub v2 for your R package, and start running checks.

Maintained by Gábor Csárdi. Last updated 24 days ago.

359 stars 11.33 score 191 scripts 1 dependent

bioc

zellkonverter:Conversion Between scRNA-seq Objects

Provides methods to convert between Python AnnData objects and SingleCellExperiment objects. These are primarily intended for use by downstream Bioconductor packages that wrap Python methods for single-cell data analysis. It also includes functions to read and write H5AD files used for saving AnnData objects to disk.

Maintained by Luke Zappia. Last updated 22 days ago.

singlecell dataimport datarepresentation bioconductor conversion scrna-seq

159 stars 11.25 score 660 scripts 4 dependents

ropengov

eurostat:Tools for Eurostat Open Data

Tools to download data from the Eurostat database <https://ec.europa.eu/eurostat> together with search and manipulation utilities.

Maintained by Leo Lahti. Last updated 1 month ago.

ropengov eurostat eurostat-data

242 stars 11.07 score 892 scripts 4 dependents

pik-piam

madrat:May All Data be Reproducible and Transparent (MADRaT) *

Provides a framework which should improve reproducibility and transparency in data processing. It provides functionality such as automatic metadata creation and management, rudimentary quality management, data caching, workflow management and data aggregation. * The title is a wish, not a promise. By no means do we expect this package to deliver everything that is needed to achieve full reproducibility and transparency, but we believe that it supports efforts in this direction.

Maintained by Jan Philipp Dietrich. Last updated 11 days ago.

15 stars 11.03 score 83 scripts 38 dependents

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

10.93 score 10k scripts 55 dependents

bioc

infercnv:Infer Copy Number Variation from Single-Cell RNA-Seq Data

Using single-cell RNA-Seq expression to visualize CNV in cells.

Maintained by Christophe Georgescu. Last updated 5 months ago.

software copynumbervariation variantdetection structuralvariation genomicvariation genetics transcriptomics statisticalmethod bayesian hiddenmarkovmodel singlecell jags cpp

601 stars 10.92 score 674 scripts

tylermorganwall

rayrender:Build and Raytrace 3D Scenes

Render scenes using pathtracing. Build 3D scenes out of spheres, cubes, planes, disks, triangles, cones, curves, line segments, cylinders, ellipsoids, and 3D models in the 'Wavefront' OBJ file format or the PLY Polygon File Format. Supports several material types, textures, multicore rendering, and tone-mapping. Based on the "Ray Tracing in One Weekend" book series. Peter Shirley (2018) <https://raytracing.github.io>.

Maintained by Tyler Morgan-Wall. Last updated 4 days ago.

libx11 cpp

631 stars 10.87 score 188 scripts 8 dependents

marce10

warbleR:Streamline Bioacoustic Analysis

Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools to explore and quantify acoustic signal structure. The package allows users to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.

Maintained by Marcelo Araya-Salas. Last updated 2 months ago.

animal-acoustic-signals audio-processing bioacoustics spectrogram streamline-analysis cpp

56 stars 10.86 score 270 scripts 4 dependents

friendly

vcdExtra:'vcd' Extensions and Additions

Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.

Maintained by Michael Friendly. Last updated 7 days ago.

categorical-data-visualization generalized-linear-models mosaic-plots

24 stars 10.85 score 472 scripts 3 dependents

r-lib

vdiffr:Visual Regression Testing and Graphical Diffing

An extension to the 'testthat' package that makes it easy to add graphical unit tests. It provides a Shiny application to manage the test cases.

Maintained by Lionel Henry. Last updated 5 months ago.

ggplot2 graphics testthat libpng cpp

191 stars 10.84 score 254 scripts 5 dependents

moodymudskipper

flow:View and Browse Code Using Flow Diagrams

Visualize as flow diagrams the logic of functions, expressions or scripts in a static way or when running a call, visualize the dependencies between functions or between modules in a shiny app, and more.

Maintained by Antoine Fabri. Last updated 4 months ago.

405 stars 10.84 score 61 scripts

rstudio

pointblank:Data Validation and Organization of Metadata for Local and Remote Tables

Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.

Maintained by Richard Iannone. Last updated 5 days ago.

data-assertions data-checker data-dictionaries data-frames data-inference data-management data-profiler data-quality data-validation data-verification database-tables easy-to-understand reporting-tool schema-validation testing-tools yaml-configuration

942 stars 10.73 score 284 scripts

quanteda

spacyr:Wrapper to the 'spaCy' 'NLP' Library

An R wrapper to the 'Python' 'spaCy' 'NLP' library, from <https://spacy.io>.

Maintained by Kenneth Benoit. Last updated 2 months ago.

extract-entities nlp spacy speech-tagging

253 stars 10.68 score 408 scripts 6 dependents

caseyyoungflesh

MCMCvis:Tools to Visualize, Manipulate, and Summarize MCMC Output

Performs key functions for MCMC analysis using minimal code - visualizes, manipulates, and summarizes MCMC output. Functions support simple and straightforward subsetting of model parameters within the calls, and produce presentable and 'publication-ready' output. MCMC output may be derived from Bayesian model output fit with Stan, NIMBLE, JAGS, and other software.

Maintained by Casey Youngflesh. Last updated 4 months ago.

38 stars 10.52 score 1.8k scripts 5 dependents

thinkr-open

attachment:Deal with Dependencies

Manage dependencies during package development. This can retrieve all dependencies that are used in ".R" files in the "R/" directory, in ".Rmd" files in the "vignettes/" directory and in 'roxygen2' documentation of functions. There is a function to update the "DESCRIPTION" file of your package with 'CRAN' packages or any other remote package. All functions to retrieve dependencies of ".R" scripts and ".Rmd" or ".qmd" files can be used independently of package development.

Maintained by Vincent Guyader. Last updated 17 days ago.

hacktoberfest

110 stars 10.44 score 48 scripts 5 dependents

ropensci

goodpractice:Advice on R Package Building

Give advice about good practices when building R packages. Advice includes functions and syntax to avoid, package structure, code complexity, code formatting, etc.

Maintained by Mark Padgham. Last updated 4 months ago.

467 stars 10.32 score 79 scripts 2 dependents

rstudio

blastula:Easily Send HTML Email Messages

Compose and send out responsive HTML email messages that render perfectly across a range of email clients and device sizes. Helper functions let the user insert embedded images, web link buttons, and 'ggplot2' plot objects into the message body. Messages can be sent through an 'SMTP' server, through the 'Posit Connect' service, or through the 'Mailgun' API service <https://www.mailgun.com/>.

Maintained by Richard Iannone. Last updated 9 months ago.

easy-to-use email html markdown responsive-email smtp

552 stars 10.27 score 348 scripts 5 dependents

facebookexperimental

Robyn:Semi-Automated Marketing Mix Modeling (MMM) from Meta Marketing Science

Semi-Automated Marketing Mix Modeling (MMM) that aims to reduce human bias by means of ridge regression and evolutionary algorithms, enables actionable decision making by providing budget allocation and diminishing returns curves, and allows ground-truth calibration to account for causation.

Maintained by Gufeng Zhou. Last updated 13 days ago.

adstocking budget-allocation cost-response-curve econometrics evolutionary-algorithm gradient-based-optimisation hyperparameter-optimization marketing-mix-modeling marketing-mix-modelling marketing-science mmm ridge-regression

1.3k stars 10.27 score 95 scripts

insightsengineering

teal.modules.clinical:'teal' Modules for Standard Clinical Outputs

Provides user-friendly tools for creating and customizing clinical trial reports. By leveraging the 'teal' framework, this package provides 'teal' modules to easily create an interactive panel that allows for seamless adjustments to data presentation, thereby streamlining the creation of detailed and accurate reports.

Maintained by Dawid Kaledkowski. Last updated 1 month ago.

clinical-trials modules nest outputs shiny

35 stars 10.21 score 149 scripts

bioc

singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data

The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.

Maintained by Joshua David Campbell. Last updated 1 month ago.

singlecell geneexpression differentialexpression alignment clustering immunooncology batcheffect normalization qualitycontrol dataimport gui

182 stars 10.17 score 252 scripts

8-bit-sheep

googleAnalyticsR:Google Analytics API into R

Interact with the Google Analytics APIs <https://developers.google.com/analytics/>, including the Core Reporting API (v3 and v4), Management API, User Activity API, GA4's Data API and Admin API, and Multi-Channel Funnel API.

Maintained by Erik Grönroos. Last updated 7 months ago.

analytics api google googleanalyticsr googleauthr

262 stars 10.11 score 680 scripts 1 dependent

constantamateur

SoupX:Single Cell mRNA Soup eXterminator

Quantify, profile and remove ambient mRNA contamination (the "soup") from droplet based single cell RNA-seq experiments. Implements the method described in Young et al. (2018) <doi:10.1101/303727>.

Maintained by Matthew Daniel Young. Last updated 2 years ago.

266 stars 10.08 score 594 scripts 1 dependent

ropensci

vcr:Record 'HTTP' Calls to Disk

Record test suite 'HTTP' requests and replay them during future runs. A port of the Ruby gem of the same name (<https://github.com/vcr/vcr/>). Works by hooking into the 'webmockr' R package for matching 'HTTP' requests by various rules ('HTTP' method, 'URL', query parameters, headers, body, etc.), and then caching real 'HTTP' responses on disk in 'cassettes'. Subsequent 'HTTP' requests matching any previous requests in the same 'cassette' use a cached 'HTTP' response.

Maintained by Scott Chamberlain. Last updated 27 days ago.

http https api web-services curl mock mocking http-mocking testing testing-tools tdd unit-testing vcr

77 stars 10.06 score 165 scripts

bioc

MOFA2:Multi-Omics Factor Analysis v2

The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions for inspecting the molecular features underlying each factor, visualisation, imputation, etc. are available.

Maintained by Ricard Argelaguet. Last updated 5 months ago.

dimensionreduction bayesian visualization factor-analysis mofa multi-omics

326 stars 10.03 score 502 scripts

pecanproject

PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Istem Fer. Last updated 8 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.97 score 20 scripts 2 dependents

reditorsupport

languageserver:Language Server Protocol

An implementation of the Language Server Protocol for R. The Language Server Protocol is used by an editor client to integrate features like auto completion. See <https://microsoft.github.io/language-server-protocol/> for details.

Maintained by Randy Lai. Last updated 1 year ago.

language-server-protocol

607 stars 9.93 score 207 scripts 1 dependent

dataoneorg

dataone:R Interface to the DataONE REST API

Provides read and write access to data and metadata from the DataONE network <https://www.dataone.org> of data repositories. Each DataONE repository implements a consistent repository application programming interface. Users call methods in R to access these remote repository functions, such as methods to query the metadata catalog, get access to metadata for particular data packages, and read the data objects from the data repository. Users can also insert and update data objects on repositories that support these methods.

Maintained by Matthew B. Jones. Last updated 3 years ago.

36 stars 9.93 score 472 scripts 3 dependents

r-lib

ymlthis:Write 'YAML' for 'R Markdown', 'bookdown', 'blogdown', and More

Write 'YAML' front matter for R Markdown and related documents. Work with 'YAML' objects more naturally and write the resulting 'YAML' to your clipboard or to 'YAML' files related to your project.

Maintained by Malcolm Barrett. Last updated 3 years ago.

165 stars 9.92 score 196 scripts 14 dependents

n8thangreen

BCEA:Bayesian Cost Effectiveness Analysis

Produces an economic evaluation of a sample of suitable variables of cost and effectiveness / utility for two or more interventions, e.g. from a Bayesian model in the form of MCMC simulations. This package computes the most cost-effective alternative and produces graphical summaries and probabilistic sensitivity analysis, see Baio et al (2017) <doi:10.1007/978-3-319-55718-2>.

Maintained by Gianluca Baio. Last updated 2 months ago.

bayesian cost-effectiveness

3 stars 9.90 score 243 scripts 3 dependents

rstudio

distill:'R Markdown' Format for Scientific and Technical Writing

Scientific and technical article format for the web. 'Distill' articles feature attractive, reader-friendly typography, flexible layout options for visualizations, and full support for footnotes and citations.

Maintained by Christophe Dervieux. Last updated 1 years ago.

423 stars 9.85 score 402 scripts 6 dependents

statdivlab

corncob:Count Regression for Correlated Observations with the Beta-Binomial

Statistical modeling for correlated count data using the beta-binomial distribution, described in Martin et al. (2020) <doi:10.1214/19-AOAS1283>. It allows for both mean and overdispersion covariates.

Maintained by Amy D Willis. Last updated 13 days ago.

106 stars 9.82 score 248 scripts 1 dependents

cjvanlissa

worcs:Workflow for Open Reproducible Code in Science

Create reproducible and transparent research projects in 'R'. This package is based on the Workflow for Open Reproducible Code in Science (WORCS), a step-by-step procedure based on best practices for Open Science. It includes an 'RStudio' project template, several convenience functions, and all dependencies required to make your project reproducible and transparent. WORCS is explained in the tutorial paper by Van Lissa, Brandmaier, Brinkman, Lamprecht, Struiksma, & Vreede (2021). <doi:10.3233/DS-210031>.

Maintained by Caspar J. Van Lissa. Last updated 2 days ago.

83 stars 9.77 score 59 scripts 1 dependents

insightsengineering

teal.modules.general:General Modules for 'teal' Applications

Prebuilt 'shiny' modules containing tools for viewing data, visualizing data, understanding missing and outlier values within your data, and performing simple data analysis. This extends the 'teal' framework, which supports reproducible research and analysis.

Maintained by Dawid Kaledkowski. Last updated 1 months ago.

general-purpose modules nest shiny

13 stars 9.74 score 71 scripts

lorenzwalthert

precommit:Pre-Commit Hooks

Useful git hooks for R building on top of the multi-language framework 'pre-commit' for hook management. This package provides git hooks for common tasks like formatting files with 'styler' or spell checking as well as wrapper functions to access the 'pre-commit' executable.

Maintained by Lorenz Walthert. Last updated 2 days ago.

git hooks pre-commit vcs workflow

255 stars 9.73 score 10 scripts

pecanproject

PEcAnRTM:PEcAn Functions Used for Radiative Transfer Modeling

Functions for performing forward runs and inversions of radiative transfer models (RTMs). Inversions can be performed using maximum likelihood, or more complex hierarchical Bayesian methods. Underlying numerical analyses are optimized for speed using Fortran code.

Maintained by Alexey Shiklomanov. Last updated 8 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants fortran jags cpp

216 stars 9.72 score 132 scripts

ropensci

rdflib:Tools to Manipulate and Query Semantic Data

The Resource Description Framework, or 'RDF' is a widely used data representation model that forms the cornerstone of the Semantic Web. 'RDF' represents data as a graph rather than the familiar data table or rectangle of relational databases. The 'rdflib' package provides a friendly and concise user interface for performing common tasks on 'RDF' data, such as reading, writing and converting between the various serializations of 'RDF' data, including 'rdfxml', 'turtle', 'nquads', 'ntriples', and 'json-ld'; creating new 'RDF' graphs, and performing graph queries using 'SPARQL'. This package wraps the low level 'redland' R package which provides direct bindings to the 'redland' C library. Additionally, the package supports the newer and more developer friendly 'JSON-LD' format through the 'jsonld' package. The package interface takes inspiration from the Python 'rdflib' library.

Maintained by Carl Boettiger. Last updated 8 months ago.

peer-reviewed

57 stars 9.59 score 123 scripts 7 dependents

jamovi

jmv:The 'jamovi' Analyses

A suite of common statistical methods such as descriptives, t-tests, ANOVAs, regression, correlation matrices, proportion tests, contingency tables, and factor analysis. This package is also useable from the 'jamovi' statistical spreadsheet (see <https://www.jamovi.org> for more information).

Maintained by Jonathon Love. Last updated 29 days ago.

59 stars 9.58 score 440 scripts

ndphillips

FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees

Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.

Maintained by Hansjoerg Neth. Last updated 5 months ago.

136 stars 9.53 score 144 scripts

stemangiola

tidyseurat:Brings Seurat to the Tidyverse

It creates an invisible layer that allows you to view the 'Seurat' object as a tibble and interact seamlessly with the tidyverse.

Maintained by Stefano Mangiola. Last updated 8 months ago.

assaydomain infrastructure rnaseq differentialexpression geneexpression normalization clustering qualitycontrol sequencing transcription transcriptomics dplyr ggplot2 pca purrr sct seurat single-cell single-cell-rna-seq tibble tidyr tidyverse transcripts tsne umap

159 stars 9.48 score 398 scripts 1 dependents

philchalmers

mirtCAT:Computerized Adaptive Testing with Multidimensional Item Response Theory

Provides tools to generate HTML interfaces for adaptive and non-adaptive tests using the shiny package (Chalmers (2016) <doi:10.18637/jss.v071.i05>). Suitable for applying unidimensional and multidimensional computerized adaptive tests (CAT) using item response theory methodology and for creating simple questionnaire forms to collect response data directly in R. Additionally, optimal test designs (e.g., "shadow testing") are supported for tests that contain a large number of item selection constraints. Finally, the package contains tools useful for performing Monte Carlo simulations for studying test item banks.

Maintained by Phil Chalmers. Last updated 5 months ago.

cat irt openblas cpp

95 stars 9.47 score 62 scripts 3 dependents

dynverse

anndata:'anndata' for R

A 'reticulate' wrapper for the Python package 'anndata'. Provides a scalable way of keeping track of data and learned annotations. Used to read from and write to the h5ad file format.

Maintained by Robrecht Cannoodt. Last updated 26 days ago.

44 stars 9.47 score 772 scripts 3 dependents

nealrichardson

httptest:A Test Environment for HTTP Requests

Testing and documenting code that communicates with remote servers can be painful. Dealing with authentication, server state, and other complications can make testing seem too costly to bother with. But it doesn't need to be that hard. This package enables one to test all of the logic on the R sides of the API in your package without requiring access to the remote service. Importantly, it provides three contexts that mock the network connection in different ways, as well as testing functions to assert that HTTP requests were---or were not---made. It also allows one to safely record real API responses to use as test fixtures. The ability to save responses and load them offline also enables one to write vignettes and other dynamic documents that can be distributed without access to a live server.

Maintained by Neal Richardson. Last updated 1 years ago.

http mock test-framework

81 stars 9.46 score 276 scripts 1 dependents

thinkr-open

fusen:Build a Package from Rmarkdown Files

Use Rmarkdown First method to build your package. Start your package with documentation, functions, examples and tests in the same unique file. Everything can be set from the Rmarkdown template file provided in your project, then inflated as a package. Inflating the template copies the relevant chunks and sections in the appropriate files required for package development.

Maintained by Vincent Guyader. Last updated 2 months ago.

hacktoberfest rmd-first

163 stars 9.45 score 35 scripts

extendr

rextendr:Call Rust Code from R using the 'extendr' Crate

Provides functions to compile and load Rust code from R, similar to how 'Rcpp' or 'cpp11' allow easy interfacing with C++ code. Also provides helper functions to create R packages that use Rust code. Under the hood, the Rust crate 'extendr' is used to do all the heavy lifting.

Maintained by Ilia Kosenkov. Last updated 4 days ago.

207 stars 9.45 score 61 scripts

eagerai

fastai:Interface to 'fastai'

The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.

Maintained by Turgut Abdullayev. Last updated 12 months ago.

audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision

118 stars 9.40 score 76 scripts

ropensci

DataPackageR:Construct Reproducible Analytic Data Sets as R Packages

A framework to help construct R data packages in a reproducible manner. Potentially time consuming processing of raw data sets into analysis ready data sets is done in a reproducible manner and decoupled from the usual 'R CMD build' process so that data sets can be processed into R objects in the data package and the data package can then be shared, built, and installed by others without the need to repeat computationally costly data processing. The package maintains data provenance by turning the data processing scripts into package vignettes, as well as enforcing documentation and version checking of included data objects. Data packages can be version controlled on 'GitHub', and used to share data for manuscripts, collaboration and reproducible research.

Maintained by Dave Slager. Last updated 7 months ago.

peer-reviewed reproducibility

156 stars 9.38 score 72 scripts

nealrichardson

httptest2:Test Helpers for 'httr2'

Testing and documenting code that communicates with remote servers can be painful. This package helps with writing tests for packages that use 'httr2'. It enables testing all of the logic on the R sides of the API without requiring access to the remote service, and it also allows recording real API responses to use as test fixtures. The ability to save responses and load them offline also enables writing vignettes and other dynamic documents that can be distributed without access to a live server.

Maintained by Neal Richardson. Last updated 9 months ago.

http mock testing

33 stars 9.37 score 95 scripts 1 dependents

briencj

asremlPlus:Augments 'ASReml-R' in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences

Assists in automating the selection of terms to include in mixed models when 'asreml' is used to fit the models. Procedures are available for choosing models that conform to the hierarchy or marginality principle, for fitting and choosing between two-dimensional spatial models using correlation, natural cubic smoothing spline and P-spline models. A history of the fitting of a sequence of models is kept in a data frame. Also used to compute functions and contrasts of, to investigate differences between and to plot predictions obtained using any model fitting function. The content falls into the following natural groupings: (i) Data, (ii) Model modification functions, (iii) Model selection and description functions, (iv) Model diagnostics and simulation functions, (v) Prediction production and presentation functions, (vi) Response transformation functions, (vii) Object manipulation functions, and (viii) Miscellaneous functions (for further details see 'asremlPlus-package' in help). The 'asreml' package provides a computationally efficient algorithm for fitting a wide range of linear mixed models using Residual Maximum Likelihood. It is a commercial package and a license for it can be purchased from 'VSNi' <https://vsni.co.uk/> as 'asreml-R', who will supply a zip file for local installation/updating (see <https://asreml.kb.vsni.co.uk/>). It is not needed for functions that are methods for 'alldiffs' and 'data.frame' objects. The package 'asremlPlus' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 1 months ago.

asreml mixed-models

19 stars 9.37 score 200 scripts

cbielow

PTXQC:Quality Report Generation for MaxQuant and mzTab Results

Generates Proteomics (PTX) quality control (QC) reports for shotgun LC-MS data analyzed with the MaxQuant software suite (from .txt files) or mzTab files (ideally from OpenMS 'QualityControl' tool). Reports are customizable (target thresholds, subsetting) and available in HTML or PDF format. Published in J. Proteome Res., Proteomics Quality Control: Quality Control Software for MaxQuant Results (2015) <doi:10.1021/acs.jproteome.5b00780>.

Maintained by Chris Bielow. Last updated 1 years ago.

drag-and-drop hacktoberfest heatmap match-between-runs maxquant metric mztab openms proteomics quality-control quality-metrics report

42 stars 9.35 score 105 scripts 1 dependents

rstudio

tfdatasets:Interface to 'TensorFlow' Datasets

Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/guide> for additional details.

Maintained by Tomasz Kalinowski. Last updated 19 days ago.

34 stars 9.32 score 656 scripts 3 dependents

insightsengineering

teal.slice:Filter Module for 'teal' Applications

Data filtering module for 'teal' applications. Allows for interactive filtering of data stored in 'data.frame' and 'MultiAssayExperiment' objects. Also displays filtered and unfiltered observation counts.

Maintained by Dawid Kaledkowski. Last updated 2 months ago.

modules nest slice

11 stars 9.27 score 3 scripts 6 dependents

whipson

maestro:Orchestration of Data Pipelines

Framework for creating and orchestrating data pipelines. Organize, orchestrate, and monitor multiple pipelines in a single project. Use tags to decorate functions with scheduling parameters and configuration.

Maintained by Will Hipson. Last updated 6 days ago.

119 stars 9.20 score 150 scripts

insightsengineering

teal.widgets:'shiny' Widgets for 'teal' Applications

Collection of 'shiny' widgets to support 'teal' applications. Enables the manipulation of application layout and plot or table settings.

Maintained by Dawid Kaledkowski. Last updated 2 months ago.

nest shiny widgets

5 stars 9.14 score 34 scripts 8 dependents

bodkan

slendr:A Simulation Framework for Spatiotemporal Population Genetics

A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can be therefore constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.

Maintained by Martin Petr. Last updated 3 days ago.

popgen population-genetics simulations spatial-statistics

56 stars 9.13 score 88 scripts

bioc

basilisk:Freezing Python Dependencies Inside Bioconductor Packages

Installs a self-contained conda instance that is managed by the R/Bioconductor installation machinery. This aims to provide a consistent Python environment that can be used reliably by Bioconductor packages. Functions are also provided to enable smooth interoperability of multiple Python environments in a single R session.

Maintained by Aaron Lun. Last updated 11 days ago.

infrastructure

9.12 score 75 scripts 39 dependents

stscl

gdverse:Analysis of Spatial Stratified Heterogeneity

Detecting spatial associations based on the concept of spatial stratified heterogeneity while also considering spatial dependencies, spatial interpretability, complex spatial interactions, and robust spatial stratification. In addition, it supports the spatial stratified heterogeneity family described in Lv et al. (2025) <doi:10.1111/tgis.70032>.

Maintained by Wenbo Lv. Last updated 2 days ago.

geographical-detector geoinformatics geospatial-analysis spatial-statistics spatial-stratified-heterogeneity cpp

33 stars 9.10 score 41 scripts 2 dependents

didiermurillof

FielDHub:A Shiny App for Design of Experiments in Life Sciences

A shiny design of experiments (DOE) app that aids in the creation of traditional, un-replicated, augmented and partially-replicated designs applied to agriculture, plant breeding, forestry, animal and biological sciences.

Maintained by Didier Murillo. Last updated 8 months ago.

agricultural breeding design doe experimental plantbreeding shiny

47 stars 9.09 score 70 scripts 1 dependents

bioc

BatchQC:Batch Effects Quality Control Software

Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.

Maintained by Jessica Anderson. Last updated 13 days ago.

batcheffect graphandnetwork microarray normalization principalcomponent sequencing software visualization qualitycontrol rnaseq preprocessing differentialexpression immunooncology

7 stars 9.06 score 54 scripts

pecanproject

PEcAn.all:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 8 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.02 score 266 scripts

rstudio

shinytest:Test Shiny Apps

Please see the shinytest to shinytest2 migration guide at <https://rstudio.github.io/shinytest2/articles/z-migration.html>.

Maintained by Winston Chang. Last updated 10 months ago.

225 stars 9.02 score 352 scripts

bioc

scPipe:Pipeline for single cell multi-omic data pre-processing

A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.

Maintained by Shian Su. Last updated 3 months ago.

immunooncology software sequencing rnaseq geneexpression singlecell visualization sequencematching preprocessing qualitycontrol genomeannotation dataimport curl bzip2 xz-utils zlib cpp

68 stars 9.02 score 84 scripts

r-spatial

link2GI:Linking Geographic Information Systems, Remote Sensing and Other Command Line Tools

Functions and tools for using open GIS and remote sensing command-line interfaces in a reproducible environment.

Maintained by Chris Reudenbach. Last updated 4 months ago.

26 stars 8.99 score 78 scripts 1 dependents

appsilon

rhino:A Framework for Enterprise Shiny Applications

A framework that supports creating and extending enterprise Shiny applications using best practices.

Maintained by Kamil Żyła. Last updated 3 days ago.

rhinoverse shiny

305 stars 8.99 score 145 scripts

pharmar

riskmetric:Risk Metrics to Evaluating R Packages

Facilities for assessing R packages against a number of metrics to help quantify their robustness.

Maintained by Eli Miller. Last updated 7 days ago.

166 stars 8.98 score 43 scripts

cjbarrie

academictwitteR:Access the Twitter Academic Research Product Track V2 API Endpoint

Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side.

Maintained by Christopher Barrie. Last updated 2 years ago.

twitter twitter-api

275 stars 8.94 score 177 scripts

guangchuangyu

badger:Badge for R Package

Query information and generate badges for use in README and GitHub Pages.

Maintained by Guangchuang Yu. Last updated 9 months ago.

badge

197 stars 8.92 score 225 scripts 5 dependents

azure

azuremlsdk:Interface to the 'Azure Machine Learning' 'SDK'

Interface to the 'Azure Machine Learning' Software Development Kit ('SDK'). Data scientists can use the 'SDK' to train, deploy, automate, and manage machine learning models on the 'Azure Machine Learning' service. To learn more about 'Azure Machine Learning' visit the website: <https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml>.

Maintained by Diondra Peck. Last updated 3 years ago.

amlcompute azure azure-machine-learning azureml dsi machine-learning rstudio sdk-r

105 stars 8.91 score 221 scripts

ajrgodfrey

BrailleR:Improved Access for Blind Users

Blind users do not have access to the graphical output from R without printing the content of graphics windows to an embosser of some kind. This is not as immediate as is required for efficient access to statistical output. The functions here are created so that blind people can make even better use of R. This includes the text descriptions of graphs, convenience functions to replace the functionality offered in many GUI front ends, and experimental functionality for optimising graphical content to prepare it for embossing as tactile images.

Maintained by A. Jonathan R. Godfrey. Last updated 12 months ago.

123 stars 8.90 score 143 scripts

tomkellygenetics

leiden:R Implementation of Leiden Clustering Algorithm

Implements the 'Python leidenalg' module to be called in R. Enables clustering using the Leiden algorithm for partitioning a graph into communities. See the 'Python' repository for more details: <https://github.com/vtraag/leidenalg> Traag et al (2018) From Louvain to Leiden: guaranteeing well-connected communities. <arXiv:1810.08473>.

Maintained by S. Thomas Kelly. Last updated 10 months ago.

38 stars 8.89 score 180 scripts 3 dependents

pik-piam

remind2:The REMIND R package (2nd generation)

Contains the REMIND-specific routines for data and model output manipulation.

Maintained by Renato Rodrigues. Last updated 3 days ago.

8.87 score 161 scripts 5 dependents

dmi3kno

polite:Be Nice on the Web

Be responsible when scraping data from websites by following polite principles: introduce yourself, ask for permission, take slowly and never ask twice.

Maintained by Dmytro Perepolkin. Last updated 2 years ago.

crawler memoise rate-limiter robotstxt rvest scraper webscraping

327 stars 8.86 score 596 scripts 5 dependents

pecanproject

PEcAn.workflow:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides workhorse functions that can be used to run the major steps of a PEcAn analysis.

Maintained by David LeBauer. Last updated 8 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.85 score 15 scripts 4 dependents

rjournal

rjtools:Preparing, Checking, and Submitting Articles to the 'R Journal'

Create an 'R Journal' 'Rmarkdown' template article, that will generate html and pdf versions of your paper. Check that the paper folder has all the required components needed for submission. Examples of 'R Journal' publications can be found at <https://journal.r-project.org>.

Maintained by Di Cook. Last updated 2 months ago.

33 stars 8.81 score 37 scripts 1 dependents

ropengov

regions:Processing Regional Statistics

Validating sub-national statistical typologies, re-coding across standard typologies of sub-national statistics, and making valid aggregate level imputation, re-aggregation, re-weighting and projection down to lower hierarchical levels to create meaningful data panels and time series.

Maintained by Daniel Antal. Last updated 3 years ago.

observatory regions ropengov statistics

12 stars 8.81 score 67 scripts 5 dependents

dvats

mcmcse:Monte Carlo Standard Errors for MCMC

Provides tools for computing Monte Carlo standard errors (MCSE) in Markov chain Monte Carlo (MCMC) settings. MCSE computation for expectation and quantile estimators is supported as well as multivariate estimations. The package also provides functions for computing effective sample size and for plotting Monte Carlo estimates versus sample size.

Maintained by Dootika Vats. Last updated 2 months ago.

effective-sample-size mcmc output-a openblas cpp

12 stars 8.77 score 314 scripts 17 dependents

pecanproject

PEcAn.data.remote:PEcAn Functions Used for Extracting Remote Sensing Data

PEcAn module for processing remote data. Python module requirements: requests, json, re, ast, pandas, sys. If any of these modules are missing, install using pip install <module name>.

Maintained by Bailey Morrison. Last updated 8 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 8.77 score 6 scripts 5 dependents

insightsengineering

rbmi:Reference Based Multiple Imputation

Implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). In particular, this package supports deterministic conditional mean imputation and jackknifing as described in Wolbers et al. (2022) <doi:10.1002/pst.2234>, Bayesian multiple imputation as described in Carpenter et al. (2013) <doi:10.1080/10543406.2013.834911>, and bootstrapped maximum likelihood imputation as described in von Hippel and Bartlett (2021) <doi:10.1214/20-STS793>.

Maintained by Isaac Gravestock. Last updated 1 months ago.

18 stars 8.76 score 33 scripts 1 dependents

cynkra

fledge:Smoother Change Tracking and Versioning for R Packages

Streamlines the process of updating changelogs (NEWS.md) and versioning R packages developed in git repositories.

Maintained by Kirill Müller. Last updated 3 months ago.

changelog git package-creation

188 stars 8.73 score 10 scripts

bioc

memes:motif matching, comparison, and de novo discovery using the MEME Suite

A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.

Maintained by Spencer Nystrom. Last updated 5 months ago.

dataimport functionalgenomics generegulation motifannotation motifdiscovery sequencematching software

50 stars 8.69 score 117 scripts 1 dependents

rstudio

tfprobability:Interface to 'TensorFlow Probability'

Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

54 stars 8.63 score 221 scripts 3 dependents

t-kalinowski

tfautograph:Autograph R for 'Tensorflow'

Translate R control flow expressions into 'Tensorflow' graphs.

Maintained by Tomasz Kalinowski. Last updated 2 years ago.

autograph tensorflow

18 stars 8.62 score 145 scripts 75 dependents

ropensci

datapack:A Flexible Container to Transport and Manipulate Data and Associated Resources

Provides a flexible container to transport and manipulate complex sets of data. These data may consist of multiple data files and associated meta data and ancillary files. Individual data objects have associated system level meta data, and data files are linked together using the OAI-ORE standard resource map which describes the relationships between the files. The OAI-ORE standard is described at <https://www.openarchives.org/ore/>. Data packages can be serialized and transported as structured files that have been created following the BagIt specification. The BagIt specification is described at <https://tools.ietf.org/html/draft-kunze-bagit-08>.

Maintained by Matthew B. Jones. Last updated 3 years ago.

43 stars 8.55 score 195 scripts 4 dependents

ppbds

tutorial.helpers:Helper Functions for Creating Tutorials

Helper functions for creating, editing, and testing tutorials created with the 'learnr' package. Provides a simple method for allowing students to download their answers to tutorial questions. For examples of its use, see the 'r4ds.tutorials' package.

Maintained by David Kane. Last updated 14 days ago.

5 stars 8.50 score 152 scripts 1 dependents

bioc

BgeeDB:Annotation and gene expression data retrieval from Bgee database. TopAnat, an anatomical entities Enrichment Analysis tool for UBERON ontology

A package for the annotation and gene expression data download from Bgee database, and TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.

Maintained by Julien Wollbrett. Last updated 5 months ago.

software dataimport sequencing geneexpression microarray go genesetenrichment bioinformatics enrichment-analysis rna-seq scrna-seq single-cell

15 stars 8.46 score 19 scripts 1 dependents

samuel-marsh

scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing

Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to 1) provide customized visualizations that are easier to use and produce more aesthetic and functional visuals, and 2) improve the speed and reproducibility of common tasks and pieces of code in scRNA-seq analysis with a single function or a group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.

Maintained by Samuel Marsh. Last updated 3 months ago.

customization ggplot2 scrna-seq seurat single-cell single-cell-genomics single-cell-rna-seq visualization

246 stars 8.45 score 1.1k scripts

bioc

lefser:R implementation of the LEfSE method for microbiome biomarker discovery

lefser is the R implementation of the popular microbiome biomarker discovery tool, LEfSe. It uses the Kruskal-Wallis test, Wilcoxon rank-sum test, and Linear Discriminant Analysis to find biomarkers from two-level classes (and optional sub-classes).

Maintained by Sehyun Oh. Last updated 1 months ago.

software sequencing differentialexpression microbiome statisticalmethod classification bioconductor-package r01ca230551

56 stars 8.44 score 56 scripts

rstudio

tfestimators:Interface to 'TensorFlow' Estimators

Interface to 'TensorFlow' Estimators <https://www.tensorflow.org/guide/estimator>, a high-level API that provides implementations of many different model types including linear models and deep neural networks.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

57 stars 8.42 score 170 scripts

bioc

projectR:Functions for the projection of weights from PCA, CoGAPS, NMF, correlation, and clustering

Functions for the projection of data into the spaces defined by PCA, CoGAPS, NMF, correlation, and clustering.

Maintained by Genevieve Stein-OBrien. Last updated 13 days ago.

functionalprediction generegulation biologicalquestion software

62 stars 8.42 score 70 scripts

insightsengineering

teal.transform:Functions for Extracting and Merging Data in the 'teal' Framework

A standardized user interface for column selection that facilitates dataset merging in the 'teal' framework.

Maintained by Dawid Kaledkowski. Last updated 2 months ago.

merge modules nest transform

3 stars 8.39 score 9 scripts 4 dependents

carmonalab

scGate:Marker-Based Cell Type Purification for Single-Cell Sequencing Data

A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. 'scGate' automatizes marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. Briefly, 'scGate' takes as input: i) a gene expression matrix stored in a 'Seurat' object and ii) a “gating model” (GM), consisting of a set of marker genes that define the cell population of interest. The GM can be as simple as a single marker gene, or a combination of positive and negative markers. More complex GMs can be constructed in a hierarchical fashion, akin to gating strategies employed in flow cytometry. 'scGate' evaluates the strength of signature marker expression in each cell using the rank-based method 'UCell', and then performs k-nearest neighbor (kNN) smoothing by calculating the mean 'UCell' score across neighboring cells. kNN-smoothing aims at compensating for the large degree of sparsity in scRNA-seq data. Finally, a universal threshold over kNN-smoothed signature scores is applied in binary decision trees generated from the user-provided gating model, to annotate cells as either “pure” or “impure”, with respect to the cell population of interest. See the related publication Andreatta et al. (2022) <doi:10.1093/bioinformatics/btac141>.

Maintained by Massimo Andreatta. Last updated 2 months ago.

filtering marker-genes scgate signatures single-cell

106 stars 8.38 score 163 scripts

mucollective

multiverse:Create 'multiverse analysis' in R

Implement 'multiverse' style analyses (Steegen S., Tuerlinckx F., Gelman A., Vanpaemel W., 2016) <doi:10.1177/1745691616658637> to show the robustness of statistical inference. 'Multiverse analysis' is a philosophy of statistical reporting where paper authors report the outcomes of many different statistical analyses in order to show how fragile or robust their findings are. The 'multiverse' package (Sarma A., Kale A., Moon M., Taback N., Chevalier F., Hullman J., Kay M., 2021) <doi:10.31219/osf.io/yfbwm> allows users to concisely and flexibly implement 'multiverse-style' analyses, which involve declaring alternate ways of performing an analysis step, in R and R Notebooks.

Maintained by Abhraneel Sarma. Last updated 4 months ago.

62 stars 8.37 score 42 scripts

cefet-rj-dal

harbinger:A Unified Time Series Event Detection Framework

By analyzing time series, it is possible to observe significant changes in the behavior of observations that frequently characterize events. Events present themselves as anomalies, change points, or motifs. In the literature, there are several methods for detecting events. However, searching for a suitable time series method is a complex task, especially considering that the nature of events is often unknown. This work presents Harbinger, a framework for integrating and analyzing event detection methods. Harbinger contains several state-of-the-art methods described in Salles et al. (2020) <doi:10.5753/sbbd.2020.13626>.

Maintained by Eduardo Ogasawara. Last updated 4 months ago.

18 stars 8.32 score 216 scripts

taylor-arnold

cleanNLP:A Tidy Data Model for Natural Language Processing

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may make use of the 'udpipe' back end with no external dependencies, or a Python back end with 'spaCy' <https://spacy.io>. Exposed annotation tasks include tokenization, part of speech tagging, named entity recognition, and dependency parsing.

Maintained by Taylor B. Arnold. Last updated 11 months ago.

corenlp natural-language-processing spacy

215 stars 8.29 score 229 scripts

bioc

crisprDesign:Comprehensive design of CRISPR gRNAs for nucleases and base editors

Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.

Maintained by Jean-Philippe Fortin. Last updated 26 days ago.

crispr functionalgenomics genetarget bioconductor bioconductor-package crispr-cas9 crispr-design crispr-target genomics-analysis grna grna-sequence grna-sequences sgrna sgrna-design

22 stars 8.28 score 80 scripts 3 dependents

eblondel

zen4R:Interface to 'Zenodo' REST API

Provides an Interface to 'Zenodo' (<https://zenodo.org>) REST API, including management of depositions, attribution of DOIs by 'Zenodo' and upload and download of files.

Maintained by Emmanuel Blondel. Last updated 1 months ago.

api datacite depositions deposits doi fair zenodo

45 stars 8.25 score 76 scripts 1 dependents

ramikrispin

coronavirus:The 2019 Novel Coronavirus COVID-19 (2019-nCoV) Dataset

Provides a daily summary of the Coronavirus (COVID-19) cases by state/province. Data source: Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) Coronavirus <https://systems.jhu.edu/research/public-health/ncov/>.

Maintained by Rami Krispin. Last updated 2 years ago.

covid-19 covid19 covid19-data dataset

499 stars 8.25 score 716 scripts

mi2-warsaw

FSelectorRcpp:'Rcpp' Implementation of 'FSelector' Entropy-Based Feature Selection Algorithms with a Sparse Matrix Support

'Rcpp' (free of 'Java'/'Weka') implementation of 'FSelector' entropy-based feature selection algorithms based on an MDL discretization (Fayyad U. M., Irani K. B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference on Uncertainty in Artificial Intelligence (IJCAI93), pages 1022-1029, Chambery, France, 1993.) <https://www.ijcai.org/Proceedings/93-2/Papers/022.pdf> with sparse matrix support.

Maintained by Zygmunt Zawadzki. Last updated 6 months ago.

entropy feature-selection rcpp sparse-matrix cpp

35 stars 8.22 score 78 scripts 1 dependents

r-dbi

DBItest:Testing DBI Backends

A helper that tests DBI back ends for conformity to the interface.

Maintained by Kirill Müller. Last updated 14 days ago.

database testing

24 stars 8.21 score 11 scripts

mrc-ide

malariasimulation:An individual based model for malaria

Specifies the latest and greatest malaria model.

Maintained by Giovanni Charles. Last updated 1 months ago.

cpp

17 stars 8.19 score 146 scripts

safetygraphics

safetyGraphics:Interactive Graphics for Monitoring Clinical Trial Safety

A framework for evaluation of clinical trial safety. Users can interactively explore their data using the included 'Shiny' application.

Maintained by Jeremy Wildfire. Last updated 2 years ago.

99 stars 8.19 score 111 scripts

brockk

escalation:A Modular Approach to Dose-Finding Clinical Trials

Methods for working with dose-finding clinical trials. We provide implementations of many dose-finding clinical trial designs, including the continual reassessment method (CRM) by O'Quigley et al. (1990) <doi:10.2307/2531628>, the toxicity probability interval (TPI) design by Ji et al. (2007) <doi:10.1177/1740774507079442>, the modified TPI (mTPI) design by Ji et al. (2010) <doi:10.1177/1740774510382799>, the Bayesian optimal interval design (BOIN) by Liu & Yuan (2015) <doi:10.1111/rssc.12089>, EffTox by Thall & Cook (2004) <doi:10.1111/j.0006-341X.2004.00218.x>; the design of Wages & Tait (2015) <doi:10.1080/10543406.2014.920873>, and the 3+3 described by Korn et al. (1994) <doi:10.1002/sim.4780131802>. All designs are implemented with a common interface. We also offer optional additional classes to tailor the behaviour of all designs, including avoiding skipping doses, stopping after n patients have been treated at the recommended dose, stopping when a toxicity condition is met, or demanding that n patients are treated before stopping is allowed. By daisy-chaining together these classes using the pipe operator from 'magrittr', it is simple to tailor the behaviour of a dose-finding design so it behaves how the trialist wants. Having provided a flexible interface for specifying designs, we then provide functions to run simulations and calculate dose-paths for future cohorts of patients.

Maintained by Kristian Brock. Last updated 3 days ago.

15 stars 8.16 score 67 scripts

pecanproject

PEcAnAssimSequential:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Mike Dietze. Last updated 8 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.14 score 35 scripts

alinetalhouk

diceR:Diverse Cluster Ensemble in R

Performs cluster analysis using an ensemble clustering framework, Chiu & Talhouk (2018) <doi:10.1186/s12859-017-1996-y>. Results from a diverse set of algorithms are pooled together using methods such as majority voting, K-Modes, LinkCluE, and CSPA. There are options to compare cluster assignments across algorithms using internal and external indices, visualizations such as heatmaps, and significance testing for the existence of clusters.

Maintained by Derek Chiu. Last updated 2 months ago.

cpp

37 stars 8.13 score 60 scripts 3 dependents

nceas

metajam:Easily Download Data and Metadata from 'DataONE'

A set of tools to foster the development of reproducible analytical workflow by simplifying the download of data and metadata from 'DataONE' (<https://www.dataone.org>) and easily importing this information into R.

Maintained by Julien Brun. Last updated 7 months ago.

data data-analysis metadata repositories

16 stars 8.13 score 75 scripts

r-hyperspec

hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)

Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.

Maintained by Claudia Beleites. Last updated 10 months ago.

data-wrangling hyperspectral imaging infrared nmr raman spectroscopy uv-vis xrf

16 stars 8.10 score 233 scripts 2 dependents

ramiromagno

gwasrapidd:'REST' 'API' Client for the 'NHGRI'-'EBI' 'GWAS' Catalog

'GWAS' R 'API' Data Download. This package provides easy access to the 'NHGRI'-'EBI' 'GWAS' Catalog data by accessing the 'REST' 'API' <https://www.ebi.ac.uk/gwas/rest/docs/api/>.

Maintained by Ramiro Magno. Last updated 1 years ago.

thirdpartyclient biomedicalinformatics genomewideassociation snp association-studies gwas-catalog human rest-client trait trait-ontology

95 stars 8.10 score 49 scripts 1 dependents

moodymudskipper

boomer:Debugging Tools to Inspect the Intermediate Steps of a Call

Provides debugging tools that let you inspect the intermediate results of a call. The output looks as if we explode a call into its parts, hence the package name.

Maintained by Antoine Fabri. Last updated 4 days ago.

138 stars 8.09 score 21 scripts

gfellerlab

SuperCell:Simplification of scRNA-seq data by merging together similar cells

Aggregates large single-cell data into a metacell dataset by merging together the gene expression of very similar cells.

Maintained by The package maintainer. Last updated 9 months ago.

software coarse-graining scrna-seq-analysis scrna-seq-data

72 stars 8.08 score 93 scripts

mrc-ide

dust:Iterate Multiple Realisations of Stochastic Models

An Engine for simulation of stochastic models. Includes support for running stochastic models in parallel, either with shared or varying parameters. Simulations are run efficiently in compiled code and can be run with a fraction of simulated states returned to R, allowing control over memory usage. Support is provided for building bootstrap particle filter for performing Sequential Monte Carlo (e.g., Gordon et al. 1993 <doi:10.1049/ip-f-2.1993.0015>). The core of the simulation engine is the 'xoshiro256**' algorithm (Blackman and Vigna <arXiv:1805.01407>), and the package is further described in FitzJohn et al 2021 <doi:10.12688/wellcomeopenres.16466.2>.

Maintained by Rich FitzJohn. Last updated 11 days ago.

cpp openmp

18 stars 8.07 score 60 scripts 3 dependents

bioc

velociraptor:Toolkit for Single-Cell Velocity

This package provides Bioconductor-friendly wrappers for RNA velocity calculations in single-cell RNA-seq data. We use the basilisk package to manage Conda environments, and the zellkonverter package to convert data structures between SingleCellExperiment (R) and AnnData (Python). The information produced by the velocity methods is stored in the various components of the SingleCellExperiment class.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

singlecell geneexpression sequencing coverage rna-velocity

55 stars 8.06 score 52 scripts

bioc

FLAMES:FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data

Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. FLAMES provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.

Maintained by Changqing Wang. Last updated 9 hours ago.

rnaseq singlecell transcriptomics dataimport differentialsplicing alternativesplicing geneexpression longread zlib curl bzip2 xz-utils cpp

33 stars 8.04 score 12 scripts

mazamascience

MazamaSpatialUtils:Spatial Data Download and Utility Functions

A suite of conversion functions to create internally standardized spatial polygons data frames. Utility functions use these data sets to return values such as country, state, time zone, watershed, etc. associated with a set of longitude/latitude pairs. (They also make cool maps.)

Maintained by Jonathan Callahan. Last updated 5 months ago.

5 stars 8.01 score 282 scripts 2 dependents

bioc

netZooR:Unified methods for the inference and analysis of gene regulatory networks

netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), and MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.

Maintained by Tara Eicher. Last updated 13 days ago.

networkinference network generegulation geneexpression transcription microarray graphandnetwork gene-regulatory-network transcription-factors

105 stars 7.98 score

scasanova

f1dataR:Access Formula 1 Data

Obtain Formula 1 data via the 'Jolpica API' <https://jolpi.ca> and the unofficial API <https://www.formula1.com/en/timing/f1-live> via the 'fastf1' 'Python' library <https://docs.fastf1.dev/>.

Maintained by Santiago Casanova. Last updated 4 days ago.

f1 formula1 sports-data

60 stars 7.97 score 26 scripts

dasonk

docstring:Provides Docstring Capabilities to R Functions

Provides the ability to display something analogous to Python's docstrings within R. By allowing users to document their functions as comments at the beginning of the function, without requiring them to put the function into a package, we allow more users to easily provide documentation for their functions. The documentation can be viewed just like any other help files for functions provided by packages.

Maintained by Dason Kurkiewicz. Last updated 3 years ago.

devtools docstring documentation documentation-tool roxygen-style

58 stars 7.96 score 305 scripts 1 dependents

rstudio

shinymeta:Export Domain Logic from Shiny using Meta-Programming

Provides tools for capturing logic in a Shiny app and exposing it as code that can be run outside of Shiny (e.g., from an R console). It also provides tools for bundling both the code and results to the end user.

Maintained by Carson Sievert. Last updated 11 months ago.

224 stars 7.94 score 62 scripts 7 dependents

bioc

scDD:Mixture modeling of single-cell RNA-seq data to identify genes with differential distributions

This package implements a method to analyze single-cell RNA-seq data utilizing flexible Dirichlet Process mixture models. Genes with differential distributions of expression are classified into several interesting patterns of differences between two conditions. The package also includes functions for simulating data with these patterns from negative binomial distributions.

Maintained by Keegan Korthauer. Last updated 5 months ago.

immunooncology bayesian clustering rnaseq singlecell multiplecomparison visualization differentialexpression

33 stars 7.92 score 50 scripts

ropenspain

spanishoddata:Get Spanish Origin-Destination Data

Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the 'duckdb' database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.

Maintained by Egor Kotov. Last updated 10 days ago.

cdr data data-package mobile-telephone-data mobility origin-destination

35 stars 7.92 score 14 scripts

ocbe-uio

BayesMallows:Bayesian Preference Learning with the Mallows Rank Model

An implementation of the Bayesian version of the Mallows rank model (Vitelli et al., Journal of Machine Learning Research, 2018 <https://jmlr.org/papers/v18/15-481.html>; Crispino et al., Annals of Applied Statistics, 2019 <doi:10.1214/18-AOAS1203>; Sorensen et al., R Journal, 2020 <doi:10.32614/RJ-2020-026>; Stein, PhD Thesis, 2023 <https://eprints.lancs.ac.uk/id/eprint/195759>). Both Metropolis-Hastings and sequential Monte Carlo algorithms for estimating the models are available. Cayley, footrule, Hamming, Kendall, Spearman, and Ulam distances are supported in the models. The rank data to be analyzed can be in the form of complete rankings, top-k rankings, partially missing rankings, as well as consistent and inconsistent pairwise preferences. Several functions for plotting and studying the posterior distributions of parameters are provided. The package also provides functions for estimating the partition function (normalizing constant) of the Mallows rank model, both with the importance sampling algorithm of Vitelli et al. and asymptotic approximation with the IPFP algorithm (Mukherjee, Annals of Statistics, 2016 <doi:10.1214/15-AOS1389>).

Maintained by Oystein Sorensen. Last updated 2 months ago.

mallows-model openblas cpp openmp

21 stars 7.91 score 36 scripts 1 dependents

pik-piam

magpie4:MAgPIE outputs R package for MAgPIE version 4.x

Common output routines for extracting results from the MAgPIE framework (versions 4.x).

Maintained by Benjamin Leon Bodirsky. Last updated 3 hours ago.

2 stars 7.90 score 254 scripts 9 dependents

bioc

biocthis:Automate package and project setup for Bioconductor packages

This package expands the usethis package with the goal of helping automate the process of creating R packages for Bioconductor or making them Bioconductor-friendly.

Maintained by Leonardo Collado-Torres. Last updated 10 days ago.

software reportwriting actions bioconductor biocthis github styler usethis

51 stars 7.90 score 4 scripts 1 dependents

patriciamar

ShinyItemAnalysis:Test and Item Analysis via Shiny

Package including functions and interactive shiny application for the psychometric analysis of educational tests, psychological assessments, health-related and other types of multi-item measurements, or ratings from multiple raters.

Maintained by Patricia Martinkova. Last updated 12 days ago.

assessment differential-item-functioning item-analysis item-response-theory psychometrics shiny

45 stars 7.88 score 105 scripts 3 dependents

bioc

EBSeq:An R package for gene and isoform differential expression analysis of RNA-seq data

Differential Expression analysis at both gene and isoform level using RNA-seq data

Maintained by Xiuyu Ma. Last updated 10 days ago.

immunooncology statisticalmethod differentialexpression multiplecomparison rnaseq sequencing cpp

7.86 score 162 scripts 6 dependents

bioc

COTAN:COexpression Tables ANalysis

Statistical and computational method to analyze the co-expression of gene pairs at single cell level. It provides the foundation for single-cell gene interactome analysis. The basic idea is studying the zero UMI counts' distribution instead of focusing on positive counts; this is done with a generalized contingency tables framework. COTAN can effectively assess the correlated or anti-correlated expression of gene pairs. It provides a numerical index related to the correlation and an approximate p-value for the associated independence test. COTAN can also evaluate whether single genes are differentially expressed, scoring them with a newly defined global differentiation index. Moreover, this approach provides ways to plot and cluster genes according to their co-expression pattern with other genes, effectively helping the study of gene interactions and becoming a new tool to identify cell-identity marker genes.

Maintained by Galfrè Silvia Giulia. Last updated 1 months ago.

systemsbiology transcriptomics geneexpression singlecell

16 stars 7.85 score 96 scripts

bioc

biodb:biodb, a library and a development framework for connecting to chemical and biological databases

The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.

Maintained by Pierrick Roger. Last updated 5 months ago.

software infrastructure dataimport kegg biology cheminformatics chemistry databases cpp

11 stars 7.85 score 24 scripts 6 dependents

emilopezcano

SixSigma:Six Sigma Tools for Quality Control and Improvement

Functions and utilities to perform Statistical Analyses in the Six Sigma way. Through the DMAIC cycle (Define, Measure, Analyze, Improve, Control), you can manage several Quality Management studies: Gage R&R, Capability Analysis, Control Charts, Loss Function Analysis, etc. Data frames used in the books "Six Sigma with R" [ISBN 978-1-4614-3652-2] and "Quality Control with R" [ISBN 978-3-319-24046-6], are also included in the package.

Maintained by Emilio L. Cano. Last updated 2 years ago.

quality-control quality-improvement six-sigma spc

15 stars 7.82 score 169 scripts 1 dependents

matloff

dsld:Data Science Looks at Discrimination

Statistical and graphical tools for detecting and measuring discrimination and bias, be it racial, gender, age or other. Detection and remediation of bias in machine learning algorithms. 'Python' interfaces available.

Maintained by Norm Matloff. Last updated 2 months ago.

12 stars 7.81 score 35 scripts

epiverse-trace

simulist:Simulate Disease Outbreak Line List and Contacts Data

Tools to simulate realistic raw case data for an epidemic in the form of line lists and contacts using a branching process. Simulated outbreaks are parameterised with epidemiological parameters and can have age-structured populations, age-stratified hospitalisation and death risk and time-varying case fatality risk.

Maintained by Joshua W. Lambert. Last updated 5 days ago.

epidemiology epiverse linelist outbreaks

8 stars 7.79 score 27 scripts

bioc

PhyloProfile:PhyloProfile

PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.

Maintained by Vinh Tran. Last updated 10 days ago.

software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny

33 stars 7.79 score 10 scripts

mazamascience

MazamaCoreUtils:Utility Functions for Production R Code

A suite of utility functions providing functionality commonly needed for production level projects such as logging, error handling, cache management and date-time parsing. Functions for date-time parsing and formatting require that time zones be specified explicitly, avoiding a common source of error when working with environmental time series.

Maintained by Jonathan Callahan. Last updated 4 months ago.

4 stars 7.76 score 119 scripts 5 dependents

ropensci

redland:RDF Library Bindings in R

Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at <https://www.w3.org/TR/rdf-primer/>. This package supports RDF by implementing an R interface to the Redland RDF C library, described at <https://librdf.org/docs/api/index.html>. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.

Maintained by Matthew B. Jones. Last updated 1 years ago.

redland

16 stars 7.76 score 98 scripts 13 dependents

bioc

LACE:Longitudinal Analysis of Cancer Evolution (LACE)

LACE is an algorithmic framework that processes single-cell somatic mutation profiles from cancer samples collected at different time points and in distinct experimental settings, to produce longitudinal models of cancer evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a weighted likelihood function computed on multiple time points.

Maintained by Davide Maspero. Last updated 16 days ago.

biomedicalinformatics singlecell somaticmutation

15 stars 7.75 score 3 scripts

cmu-delphi

epidatr:Client for Delphi's 'Epidata' API

The Delphi 'Epidata' API provides real-time access to epidemiological surveillance data for influenza, 'COVID-19', and other diseases for the USA at various geographical resolutions, both from official government sources such as the Centers for Disease Control and Prevention (CDC), and from Google Trends and private partners such as Facebook and Change Healthcare. It is built and maintained by the Carnegie Mellon University Delphi research group. To cite this API: David C. Farrow, Logan C. Brooks, Aaron Rumack, Ryan J. Tibshirani, Roni Rosenfeld (2015). Delphi 'Epidata' API. <https://github.com/cmu-delphi/delphi-epidata>.

Maintained by David Weber. Last updated 11 days ago.

5 stars 7.71 score 114 scripts

ropensci

rdataretriever:R Interface to the Data Retriever

Provides an R interface to the Data Retriever <https://retriever.readthedocs.io/en/latest/> via the Data Retriever's command line interface. The Data Retriever automates the tasks of finding, downloading, and cleaning public datasets, and then stores them in a local database.

Maintained by Henry Senyondo. Last updated 8 months ago.

data data-science database datasets science

46 stars 7.70 score 36 scripts

carpentries

sandpaper:Create and Curate Carpentries Lessons

We provide tools to build a Carpentries-themed lesson repository into an accessible standalone static website. These include local tools and those designed to be used in a continuous integration context so that all the lesson author needs to focus on is writing the content of the actual lesson.

Maintained by Robert Davey. Last updated 2 months ago.

carpentries carpentries-infrastructure carpentries-workbench lesson-template lessons markdown static-site-generator

44 stars 7.68 score 8 scripts

theoreticalecology

sjSDM:Scalable Joint Species Distribution Modeling

A scalable and fast method for estimating joint Species Distribution Models (jSDMs) for big community data, including eDNA data. The package estimates a full (i.e. non-latent) jSDM with different response distributions (including the traditional multivariate probit model). The package can perform variation partitioning (VP) / ANOVA on the fitted models to separate the contribution of environmental, spatial, and biotic associations. In addition, the total R-squared can be further partitioned per species and site to reveal the internal metacommunity structure, see Leibold et al., <doi:10.1111/oik.08618>. The internal structure can then be regressed against environmental and spatial distinctiveness, richness, and traits to analyze metacommunity assembly processes. The package includes support for accounting for spatial autocorrelation and the option to fit responses using deep neural networks instead of a standard linear predictor. As described in Pichler & Hartig (2021) <doi:10.1111/2041-210X.13687>, scalability is achieved by using a Monte Carlo approximation of the joint likelihood implemented via 'PyTorch' and 'reticulate', which can be run on CPUs or GPUs.

Maintained by Maximilian Pichler. Last updated 1 months ago.

deep-learning gpu-acceleration machine-learning species-distribution-modelling species-interactions

69 stars 7.64 score 70 scripts

sciviews

SciViews:'SciViews' - Data Processing and Visualization with the 'SciViews::R' Dialect

The 'SciViews::R' dialect provides a set of functions that streamline data input, processing, analysis and visualization, especially, but not exclusively, for beginners or occasional users. It mixes base R and tidyverse, plus another set of CRAN packages for an easy and coherent use of R.

Maintained by Philippe Grosjean. Last updated 7 months ago.

sciviews

8 stars 7.62 score 116 scripts 1 dependents

ropengov

retroharmonize:Ex Post Survey Data Harmonization

Assists in reproducible retrospective (ex-post) harmonization of data, particularly individual-level survey data, by providing tools for organizing metadata, standardizing the coding of variables, variable names and value labels (including missing values), and documenting the data transformations, with the help of comprehensive S3 classes.

Maintained by Daniel Antal. Last updated 2 months ago.

ropengov

10 stars 7.62 score 59 scripts

uligges

klaR:Classification and Visualization

Miscellaneous functions for classification and visualization, e.g. regularized discriminant analysis, sknn() kernel-density naive Bayes, an interface to 'svmlight' and stepclass() wrapper variable selection for supervised classification, partimat() visualization of classification rules and shardsplot() of cluster results as well as kmodes() clustering for categorical data, corclust() variable clustering, variable extraction from different variable clustering models and weight of evidence preprocessing.

Maintained by Uwe Ligges. Last updated 1 years ago.

5 stars 7.61 score 1.4k scripts 13 dependents

nandp1

nser:Bhavcopy and Live Market Data from National Stock Exchange (NSE) & Bombay Stock Exchange (BSE) India

Download Current & Historical Bhavcopy. Get Live Market data from NSE India of Equities and Derivatives (F&O) segment. Data source <https://www.nseindia.com/>.

Maintained by Nandan Patil. Last updated 5 months ago.

bhav bhavcopy bhavcopy-downloader financial-data market-data national-stock-exchange nse nse-stock-data option-pricing optionchain rselenium stock-prices

8 stars 7.61 score 76 scripts

bioc

rrvgo:Reduce + Visualize GO

Reduce and visualize lists of Gene Ontology terms by identifying redundancy based on semantic similarity.

Maintained by Sergi Sayols. Last updated 5 months ago.

annotation clustering go network pathways software

26 stars 7.60 score 190 scripts

neurogenomics

rworkflows:Test, Document, Containerise, and Deploy R Packages

Reproducibility is essential to the progress of research, yet achieving it remains elusive even in computational fields. Continuous Integration (CI) platforms offer a powerful way to launch automated workflows to check and document code, but often require considerable time, effort, and technical expertise to set up. We therefore developed the rworkflows suite to make robust CI workflows easy and freely accessible to all R package developers. rworkflows consists of 1) a CRAN/Bioconductor-compatible R package template, 2) an R package to quickly implement a standardised workflow, and 3) a centrally maintained GitHub Action.

Maintained by Brian Schilder. Last updated 2 months ago.

software workflowmanagement bioconductor containers continuous-integration docker dockerhub github-actions reproducibility workflows

79 stars 7.60 score 6 scripts

bioc

scDesign3:A unified framework of realistic in silico data generation and statistical model inference for single-cell and spatial omics

We present a statistical simulator, scDesign3, to generate realistic single-cell and spatial omics data, including various cell states, experimental designs, and feature modalities, by learning interpretable parameters from real data. Using a unified probabilistic model for single-cell and spatial omics data, scDesign3 infers biologically meaningful parameters; assesses the goodness-of-fit of inferred cell clusters, trajectories, and spatial locations; and generates in silico negative and positive controls for benchmarking computational tools.

Maintained by Dongyuan Song. Last updated 29 days ago.

software singlecell sequencing geneexpression spatial

89 stars 7.59 score 25 scripts

bioc

ggsc:Visualizing Single Cell and Spatial Transcriptomics

Useful functions to visualize single cell and spatial data. It supports visualizing 'Seurat', 'SingleCellExperiment' and 'SpatialExperiment' objects through grammar of graphics syntax implemented in 'ggplot2'.

Maintained by Guangchuang Yu. Last updated 5 months ago.

dimensionreduction geneexpression singlecell software spatial transcriptomics visualization openblas cpp openmp

47 stars 7.59 score 18 scripts

gesistsa

oolong:Create Validation Tests for Automated Content Analysis

Intended to create standard human-in-the-loop validity tests for typical automated content analysis such as topic modeling and dictionary-based methods. This package offers a standard workflow with functions to prepare, administer and evaluate a human-in-the-loop validity test. This package provides functions for validating topic models using word intrusion, topic intrusion (Chang et al. 2009, <https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models>) and word set intrusion (Ying et al. 2021) <doi:10.1017/pan.2021.33> tests. This package also provides functions for generating gold-standard data which are useful for validating dictionary-based methods. The default settings of all generated tests match those suggested in Chang et al. (2009) and Song et al. (2020) <doi:10.1080/10584609.2020.1723752>.

Maintained by Chung-hong Chan. Last updated 1 months ago.

textanalysis topicmodeling validation

55 stars 7.58 score 23 scripts

ropensci

tic:Tasks Integrating Continuously: CI-Agnostic Workflow Definitions

Provides a way to describe common build and deployment workflows for R-based projects: packages, websites (e.g. blogdown, pkgdown), or data processing (e.g. research compendia). The recipe is described independent of the continuous integration tool used for processing the workflow (e.g. 'GitHub Actions' or 'Circle CI'). This package has been peer-reviewed by rOpenSci (v0.3.0.9004).

Maintained by Eli Miller. Last updated 2 months ago.

appveyor continuous-integration deployment githubactions travis-ci

155 stars 7.57 score 16 scripts

biogenies

tidysq:Tidy Processing and Analysis of Biological Sequences

A tidy approach to analysis of biological sequences. All processing and data-storage functions are heavily optimized to allow the fastest and most efficient data storage.

Maintained by Dominik Rafacz. Last updated 3 months ago.

bioconductor bioinformatics biological-sequences fasta s3 sequences tibble tidy tidyverse vctrs cpp

40 stars 7.56 score 38 scripts

wincowgerdev

OpenSpecy:Analyze, Process, Identify, and Share Raman and (FT)IR Spectra

Raman and (FT)IR spectral analysis tool for plastic particles and other environmental samples (Cowger et al. 2021, <doi:10.1021/acs.analchem.1c00123>). With read_any(), Open Specy provides a single function for reading individual, batch, or map spectral data files like .asp, .csv, .jdx, .spc, .spa, .0, and .zip. process_spec() simplifies processing spectra, including smoothing, baseline correction, range restriction and flattening, intensity conversions, wavenumber alignment, and min-max normalization. Spectra can be identified in batch using an onboard reference library (Cowger et al. 2020, <doi:10.1177/0003702820929064>) using match_spec(). A Shiny app is available via run_app() or online at <https://openanalysis.org/openspecy/>.

Maintained by Win Cowger. Last updated 1 months ago.

29 stars 7.55 score 22 scripts

ohdsi

CohortSymmetry:Sequence Symmetry Analysis Using the Observational Medical Outcomes Partnership Common Data Model

Calculating crude sequence ratio, adjusted sequence ratio and confidence intervals using data mapped to the Observational Medical Outcomes Partnership Common Data Model.

Maintained by Xihang Chen. Last updated 7 days ago.

1 stars 7.52 score 73 scripts

bioc

SimBu:Simulate Bulk RNA-seq Datasets from Single-Cell Datasets

SimBu can be used to simulate bulk RNA-seq datasets with known cell type fractions. You can either use your own single-cell study for the simulation or the sfaira database. Different pre-defined simulation scenarios exist, as do options to run custom simulations. Additionally, expression values can be adapted by adding an mRNA bias, which produces more biologically relevant simulations.

Maintained by Alexander Dietrich. Last updated 3 days ago.

software rnaseq singlecell

15 stars 7.50 score 29 scripts 1 dependents

myeomans

politeness:Detecting Politeness Features in Text

Detecting markers of politeness in English natural language. This package allows researchers to easily visualize and quantify politeness between groups of documents. This package combines prior research on the linguistic markers of politeness. We thank the Spencer Foundation, the Hewlett Foundation, and Harvard's Institute for Quantitative Social Science for support.

Maintained by Mike Yeomans. Last updated 2 months ago.

25 stars 7.49 score 41 scripts 1 dependents

hendersontrent

theft:Tools for Handling Extraction of Features from Time Series

Consolidates and calculates different sets of time-series features from multiple 'R' and 'Python' packages including 'Rcatch22' Henderson, T. (2021) <doi:10.5281/zenodo.5546815>, 'feasts' O'Hara-Wild, M., Hyndman, R., and Wang, E. (2021) <https://CRAN.R-project.org/package=feasts>, 'tsfeatures' Hyndman, R., Kang, Y., Montero-Manso, P., Talagala, T., Wang, E., Yang, Y., and O'Hara-Wild, M. (2020) <https://CRAN.R-project.org/package=tsfeatures>, 'tsfresh' Christ, M., Braun, N., Neuffer, J., and Kempa-Liehr A.W. (2018) <doi:10.1016/j.neucom.2018.03.067>, 'TSFEL' Barandas, M., et al. (2020) <doi:10.1016/j.softx.2020.100456>, and 'Kats' Facebook Infrastructure Data Science (2021) <https://facebookresearch.github.io/Kats/>.

Maintained by Trent Henderson. Last updated 2 months ago.

data-visualisation data-visualization dimensionality-reduction machine-learning time-series

40 stars 7.48 score 50 scripts 1 dependents

sem-in-r

seminr:Building and Estimating Structural Equation Models

A powerful, easy-to-use syntax for specifying and estimating complex Structural Equation Models. Models can be estimated using Partial Least Squares Path Modeling, Covariance-Based Structural Equation Modeling, or covariance-based Confirmatory Factor Analysis. Methods described in Ray, Danks, and Valdez (2021).

Maintained by Nicholas Patrick Danks. Last updated 3 years ago.

common-factors composites construct pls-models

62 stars 7.46 score 284 scripts

rstudio

tfhub:Interface to 'TensorFlow' Hub

'TensorFlow' Hub is a library for the publication, discovery, and consumption of reusable parts of machine learning models. A module is a self-contained piece of a 'TensorFlow' graph, along with its weights and assets, that can be reused across different tasks in a process known as transfer learning. Transfer learning can train a model with a smaller dataset, improve generalization, and speed up training.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

29 stars 7.46 score 73 scripts 1 dependents

bioc

crisprScore:On-Target and Off-Target Scoring Algorithms for CRISPR gRNAs

Provides R wrappers of several on-target and off-target scoring methods for CRISPR guide RNAs (gRNAs). The following nucleases are supported: SpCas9, AsCas12a, enAsCas12a, and RfxCas13d (CasRx). The available on-target cutting efficiency scoring methods are RuleSet1, Azimuth, DeepHF, DeepCpf1, enPAM+GB, and CRISPRscan. Both the CFD and MIT scoring methods are available for off-target specificity prediction. The package also provides a Lindel-derived score to predict the probability of a gRNA to produce indels inducing a frameshift for the Cas9 nuclease. Note that DeepHF, DeepCpf1 and enPAM+GB are not available on Windows machines.

Maintained by Jean-Philippe Fortin. Last updated 5 months ago.

crispr functionalgenomics functionalprediction bioconductor bioconductor-package crispr-cas9 crispr-design crispr-target genomics grna grna-sequence grna-sequences scoring-algorithm sgrna sgrna-design

16 stars 7.44 score 19 scripts 4 dependents

bioc

MOSim:Multi-Omics Simulation (MOSim)

MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.

Maintained by Sonia Tarazona. Last updated 5 months ago.

software timecourse experimentaldesign rnaseq cpp

9 stars 7.42 score 11 scripts

koheiw

seededlda:Seeded Sequential LDA for Topic Modeling

Seeded Sequential LDA can classify sentences of texts into pre-defined topics with a small number of seed words (Watanabe & Baturo, 2023) <doi:10.1177/08944393231178605>. Implements Seeded LDA (Lu et al., 2010) <doi:10.1109/ICDMW.2011.125> and Sequential LDA (Du et al., 2012) <doi:10.1007/s10115-011-0425-1> with the distributed LDA algorithm (Newman, et al., 2009) for parallel computing.

Maintained by Kohei Watanabe. Last updated 2 months ago.

semi-supervised-learning text-classification onetbb cpp

75 stars 7.38 score 177 scripts 1 dependents

bioc

cogena:co-expressed gene-set enrichment analysis

cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller-scale but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. Particularly, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses of co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.

Maintained by Zhilong Jia. Last updated 5 months ago.

clustering genesetenrichment geneexpression visualization pathways kegg go microarray sequencing systemsbiology datarepresentation dataimport bioconductor bioinformatics

12 stars 7.36 score 32 scripts

hedgehogqa

hedgehog:Property-Based Testing

Hedgehog will eat all your bugs. 'Hedgehog' is a property-based testing package in the spirit of 'QuickCheck'. With 'Hedgehog', one can test properties of their programs against randomly generated input, providing far superior test coverage compared to unit testing. One of the key benefits of 'Hedgehog' is integrated shrinking of counterexamples, which allows one to quickly find the cause of bugs, given salient examples when incorrect behaviour occurs.

Maintained by Huw Campbell. Last updated 4 years ago.

56 stars 7.33 score 63 scripts 1 dependents

modeloriented

shapper:Wrapper of Python Library 'shap'

Provides SHAP explanations of machine learning models. In applied machine learning, there is a strong belief that we need to strike a balance between interpretability and accuracy. However, in the field of Interpretable Machine Learning, there are more and more new ideas for explaining black-box models. One of the best-known methods for local explanations is SHapley Additive exPlanations (SHAP), introduced by Lundberg, S., et al., (2016) <arXiv:1705.07874>. The SHAP method is used to calculate influences of variables on the particular observation. This method is based on Shapley values, a technique used in game theory. The R package 'shapper' is a port of the Python library 'shap'.

Maintained by Szymon Maksymiuk. Last updated 2 years ago.

58 stars 7.31 score 59 scripts

nerler

JointAI:Joint Analysis and Imputation of Incomplete Data

Joint analysis and imputation of incomplete data in the Bayesian framework, using (generalized) linear (mixed) models and extensions thereof, survival models, or joint models for longitudinal and survival data, as described in Erler, Rizopoulos and Lesaffre (2021) <doi:10.18637/jss.v100.i20>. Incomplete covariates, if present, are automatically imputed. The package performs some preprocessing of the data and creates a 'JAGS' model, which will then automatically be passed to 'JAGS' <https://mcmc-jags.sourceforge.io/> with the help of the package 'rjags'.

Maintained by Nicole S. Erler. Last updated 12 months ago.

bayesian generalized-linear-models glm glmm imputation imputations jags joint-analysis linear-mixed-models linear-regression-models mcmc-sample mcmc-sampling missing-data missing-values survival cpp

28 stars 7.30 score 59 scripts 1 dependents

hoxo-m

githubinstall:A Helpful Way to Install R Packages Hosted on GitHub

Provides a helpful way to install packages hosted on GitHub.

Maintained by Koji Makiyama. Last updated 7 years ago.

r-language

49 stars 7.29 score 177 scripts

yihui

Rd2roxygen:Convert Rd to 'Roxygen' Documentation

Functions to convert Rd to 'roxygen' documentation. It can parse an Rd file to a list, create the 'roxygen' documentation and update the original R script (e.g. the one containing the definition of the function) accordingly. This package also provides utilities that can help developers build packages using 'roxygen' more easily. The 'formatR' package can be used to reformat the R code in the examples sections so that the code will be more readable.

Maintained by Yihui Xie. Last updated 12 months ago.

roxygen-documentation

32 stars 7.26 score 82 scripts 1 dependents

soerenpannier

emdi:Estimating and Mapping Disaggregated Indicators

Functions that support estimating, assessing and mapping regional disaggregated indicators. So far, estimation methods comprise direct estimation, the model-based unit-level approach Empirical Best Prediction (see "Small area estimation of poverty indicators" by Molina and Rao (2010) <doi:10.1002/cjs.10051>), the area-level model (see "Estimates of income for small places: An application of James-Stein procedures to Census Data" by Fay and Herriot (1979) <doi:10.1080/01621459.1979.10482505>) and various extensions of it (adjusted variance estimation methods, log and arcsin transformation, spatial, robust and measurement error models), as well as their precision estimates. The assessment of the used model is supported by a summary and diagnostic plots. For a suitable presentation of estimates, map plots can be easily created. Furthermore, results can easily be exported to Excel. For a detailed description of the package and the methods used see "The R Package emdi for Estimating and Mapping Regionally Disaggregated Indicators" by Kreutzmann et al. (2019) <doi:10.18637/jss.v091.i07> and the second package vignette "A Framework for Producing Small Area Estimates Based on Area-Level Models in R".

Maintained by Soeren Pannier. Last updated 1 years ago.

15 stars 7.26 score 45 scripts 1 dependents

wwiecek

baggr:Bayesian Aggregate Treatment Effects

Running and comparing meta-analyses of data with hierarchical Bayesian models in Stan, including convenience functions for formatting data, plotting and pooling measures specific to meta-analysis. This implements many models from Meager (2019) <doi:10.1257/app.20170299>.

Maintained by Witold Wiecek. Last updated 7 days ago.

bayesian-statistics meta-analysis quantile-regression stan treatment-effects cpp

49 stars 7.24 score 88 scripts

inbo

checklist:A Thorough and Strict Set of Checks for R Packages and Source Code

An opinionated set of rules for R packages and R source code projects.

Maintained by Thierry Onkelinx. Last updated 1 months ago.

checklist continuous-integration continuous-testing quality-assurance

19 stars 7.24 score 21 scripts 2 dependents

insightsengineering

tern.mmrm:Tables and Graphs for Mixed Models for Repeated Measures (MMRM)

Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see for example Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E>. This package provides an interface for fitting MMRM within the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and tabulate results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023). It builds on 'mmrm' <https://cran.r-project.org/package=mmrm> by Sabanés Bové et al. (2023) for the actual MMRM computations.

Maintained by Joe Zhu. Last updated 6 months ago.

graphs listings statistical-engineering tables

6 stars 7.23 score 8 scripts 1 dependents

pik-piam

lucode2:Code Manipulation and Analysis Tools

A collection of tools for manipulating and analyzing code.

Maintained by Jan Philipp Dietrich. Last updated 10 days ago.

7.22 score 364 scripts 8 dependents

usaid-oha-si

glamr:SI Utilities Package

Provides a series of base functions useful to the GH OHA SI team. This includes project setup, pulling from DATIM, and key functions for working with the MSD.

Maintained by Aaron Chafetz. Last updated 6 months ago.

2 stars 7.20 score 1.3k scripts 1 dependents

bioc

CRISPRseek:Design of guide RNAs in CRISPR genome-editing systems

The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with the detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.

Maintained by Lihua Julie Zhu. Last updated 21 days ago.

immunooncology generegulation sequencematching crispr

7.18 score 51 scripts 2 dependents