Showing 200 of 1408 results
rstudio
reticulate:Interface to 'Python'
Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.
Maintained by Tomasz Kalinowski. Last updated 6 days ago.
1.7k stars 21.02 score 18k scripts 434 dependents
r-lib
testthat:Unit Testing for R
Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.
Maintained by Hadley Wickham. Last updated 1 month ago.
900 stars 20.99 score 74k scripts 471 dependents
r-lib
here:A Simpler Way to Find Your Files
Constructs paths to your project's files. Declare the relative path of a file within your project with 'i_am()'. Use the 'here()' function as a drop-in replacement for 'file.path()'; it will always locate the files relative to your project root.
Maintained by Kirill Müller. Last updated 15 days ago.
417 stars 19.62 score 96k scripts 607 dependents
r-lib
devtools:Tools to Make Developing R Packages Easier
Collection of package development tools.
Maintained by Jennifer Bryan. Last updated 6 months ago.
2.4k stars 19.55 score 51k scripts 150 dependents
r-lib
roxygen2:In-Line Documentation for R
Generate your Rd documentation, 'NAMESPACE' file, and collation field using specially formatted comments. Writing documentation in-line with code makes it easier to keep your documentation up-to-date as your requirements change. 'roxygen2' is inspired by the 'Doxygen' system for C++.
Maintained by Hadley Wickham. Last updated 8 months ago.
606 stars 18.51 score 2.3k scripts 219 dependents
r-lib
usethis:Automate Package and Project Setup
Automate package and project setup tasks that are otherwise performed manually. This includes setting up unit testing, test coverage, continuous integration, Git, 'GitHub', licenses, 'Rcpp', 'RStudio' projects, and more.
Maintained by Jennifer Bryan. Last updated 26 days ago.
869 stars 17.54 score 5.6k scripts 336 dependents
satijalab
Seurat:Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 year ago.
human-cell-atlas, single-cell-genomics, single-cell-rna-seq, cpp
2.4k stars 16.86 score 50k scripts 73 dependents
r-lib
styler:Non-Invasive Pretty Printing of R Code
Pretty-prints R code without changing the user's formatting intent.
Maintained by Lorenz Walthert. Last updated 2 months ago.
754 stars 16.15 score 940 scripts 62 dependents
rstudio
tensorflow:R Interface to 'TensorFlow'
Interface to 'TensorFlow' <https://www.tensorflow.org/>, an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more 'CPUs' or 'GPUs' in a desktop, server, or mobile device with a single 'API'. 'TensorFlow' was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
Maintained by Tomasz Kalinowski. Last updated 5 days ago.
1.3k stars 15.47 score 3.2k scripts 75 dependents
philchalmers
mirt:Multidimensional Item Response Theory
Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.
Maintained by Phil Chalmers. Last updated 4 days ago.
212 stars 14.93 score 2.5k scripts 40 dependents
rstudio
learnr:Interactive Tutorials for R
Create interactive tutorials using R Markdown. Use a combination of narrative, figures, videos, exercises, and quizzes to create self-paced tutorials for learning about R and R packages.
Maintained by Garrick Aden-Buie. Last updated 7 months ago.
interactive, python, rmarkdown, shiny, sql, teaching, tutorial
713 stars 14.79 score 6.5k scripts 27 dependents
thinkr-open
golem:A Framework for Robust Shiny Applications
An opinionated framework for building a production-ready 'Shiny' application. This package contains a series of tools for building a robust 'Shiny' application from start to finish.
Maintained by Colin Fay. Last updated 7 months ago.
golemverse, hacktoberfest, shiny, shiny-apps, shiny-r, shinyapps
921 stars 14.21 score 167 scripts 63 dependents
r-lib
pkgload:Simulate Package Installation and Attach
Simulates the process of installing a package and then attaching it. This is a key part of the 'devtools' package as it allows you to rapidly iterate while developing a package.
Maintained by Lionel Henry. Last updated 6 days ago.
59 stars 14.21 score 112 scripts 590 dependents
r-spatial
rgee:R Bindings for Calling the 'Earth Engine' API
Earth Engine <https://earthengine.google.com/> client library for R. All of the 'Earth Engine' API classes, modules, and functions are made available. Additional functions implemented include importing (exporting) of Earth Engine spatial objects, extraction of time series, interactive map display, assets management interface, and metadata display. See <https://r-spatial.github.io/rgee/> for further details.
Maintained by Cesar Aybar. Last updated 4 days ago.
earth-engine, earthengine, google-earth-engine, googleearthengine, spatial-analysis, spatial-data
717 stars 13.77 score 1.9k scripts 3 dependents
rstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 11 days ago.
845 stars 13.63 score 264 scripts 2 dependents
philchalmers
SimDesign:Structure for Organizing Monte Carlo Simulation Designs
Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.
Maintained by Phil Chalmers. Last updated 3 days ago.
monte-carlo-simulation, simulation, simulation-framework
62 stars 13.41 score 253 scripts 47 dependents
oscarkjell
text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning
Link R with Transformers from Hugging Face to transform text variables to word embeddings, where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions, etc. For more information see <https://www.r-text.org>.
Maintained by Oscar Kjell. Last updated 9 days ago.
deep-learning, machine-learning, nlp, transformers, openjdk
145 stars 13.21 score 436 scripts 1 dependents
juba
questionr:Functions to Make Surveys Processing Easier
Set of functions to make the processing and analysis of surveys easier: interactive shiny apps and addins for data recoding, contingency tables, dataset metadata handling, and several convenience functions.
Maintained by Julien Barnier. Last updated 10 days ago.
83 stars 12.93 score 1.1k scripts 19 dependents
tkonopka
umap:Uniform Manifold Approximation and Projection
Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).
Maintained by Tomasz Konopka. Last updated 11 months ago.
dimensionality-reduction, umap, cpp
132 stars 12.82 score 3.6k scripts 45 dependents
insightsengineering
teal:Exploratory Web Apps for Analyzing Clinical Trials Data
A 'shiny' based interactive exploration framework for analyzing clinical trials data. 'teal' currently provides a dynamic filtering facility and different data viewers. 'teal' 'shiny' applications are built using standard 'shiny' modules.
Maintained by Dawid Kaledkowski. Last updated 1 month ago.
clinical-trials, nest, shiny, webapp
206 stars 12.65 score 176 scripts 5 dependents
greta-dev
greta:Simple and Scalable Statistical Modelling in R
Write statistical models in R and fit them by MCMC and optimisation on CPUs and GPUs, using Google 'TensorFlow'. greta lets you write your own model like in BUGS, JAGS and Stan, except that you write models right in R, it scales well to massive datasets, and it’s easy to extend and build on. See the website for more information, including tutorials, examples, package documentation, and the greta forum.
Maintained by Nicholas Tierney. Last updated 20 days ago.
566 stars 12.53 score 396 scripts 6 dependents
r-lib
rcmdcheck:Run 'R CMD check' from 'R' and Capture Results
Run 'R CMD check' from 'R' and capture the results of the individual checks. Supports running checks in the background, timeouts, pretty printing and comparing check results.
Maintained by Gábor Csárdi. Last updated 6 months ago.
116 stars 12.34 score 102 scripts 158 dependents
openpharma
mmrm:Mixed Models for Repeated Measures
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.
Maintained by Daniel Sabanes Bove. Last updated 24 days ago.
138 stars 12.15 score 113 scripts 4 dependents
rstudio
shinytest2:Testing for Shiny Applications
Automated unit testing of Shiny applications through a headless 'Chromium' browser.
Maintained by Barret Schloerke. Last updated 5 days ago.
108 stars 12.13 score 704 scripts 1 dependents
rstudio
tfruns:Training Run Tools for 'TensorFlow'
Create and manage unique directories for each 'TensorFlow' training run. Provides a unique, time stamped directory for each run along with functions to retrieve the directory of the latest run or latest several runs.
Maintained by Tomasz Kalinowski. Last updated 12 months ago.
34 stars 11.80 score 325 scripts 77 dependents
kevinushey
sourcetools:Tools for Reading, Tokenizing and Parsing R Code
Tools for Reading, Tokenizing and Parsing R Code.
Maintained by Kevin Ushey. Last updated 2 years ago.
78 stars 11.77 score 32 scripts 1.8k dependents
rstudio
sortable:Drag-and-Drop in 'shiny' Apps with 'SortableJS'
Enables drag-and-drop behaviour in Shiny apps, by exposing the functionality of the 'SortableJS' <https://sortablejs.github.io/Sortable/> JavaScript library as an 'htmlwidget'. You can use this in Shiny apps and widgets, 'learnr' tutorials as well as R Markdown. In addition, provides a custom 'learnr' question type - 'question_rank()' - that allows ranking questions with drag-and-drop.
Maintained by Andrie de Vries. Last updated 7 months ago.
135 stars 11.62 score 368 scripts 13 dependents
r-lib
mockery:Mocking Library for R
The two main functionalities of this package are creating mock objects (functions) and selectively intercepting calls to a given function that originate in some other function. It can be used with any testing framework available for R. Mock objects can be injected with either this package's own stub() function or a similar with_mock() facility present in the 'testthat' package.
Maintained by Hadley Wickham. Last updated 1 year ago.
100 stars 11.57 score 504 scripts 5 dependents
tylermorganwall
rayshader:Create Maps and Visualize Data in 2D and 3D
Uses a combination of raytracing and multiple hill shading methods to produce 2D and 3D data visualizations and maps. Includes water detection and layering functions, programmable color palette generation, several built-in textures for hill shading, 2D and 3D plotting options, a built-in path tracer, 'Wavefront' OBJ file export, and the ability to save 3D visualizations to a 3D printable format.
Maintained by Tyler Morgan-Wall. Last updated 2 months ago.
2.1k stars 11.55 score 1.5k scripts 5 dependents
workflowr
workflowr:A Framework for Reproducible and Collaborative Data Science
Provides a workflow for your analysis projects by combining literate programming ('knitr' and 'rmarkdown') and version control ('Git', via 'git2r') to generate a website containing time-stamped, versioned, and documented results.
Maintained by John Blischak. Last updated 4 months ago.
git, project-management, rmarkdown, website, workflow
848 stars 11.53 score 566 scripts
r-hub
rhub:Tools for R Package Developers
R-hub v2 uses GitHub Actions to run 'R CMD check' and similar package checks. The 'rhub' package helps you set up R-hub v2 for your R package, and start running checks.
Maintained by Gábor Csárdi. Last updated 24 days ago.
359 stars 11.33 score 191 scripts 1 dependents
bioc
zellkonverter:Conversion Between scRNA-seq Objects
Provides methods to convert between Python AnnData objects and SingleCellExperiment objects. These are primarily intended for use by downstream Bioconductor packages that wrap Python methods for single-cell data analysis. It also includes functions to read and write H5AD files used for saving AnnData objects to disk.
Maintained by Luke Zappia. Last updated 22 days ago.
singlecell, dataimport, datarepresentation, bioconductor, conversion, scrna-seq
159 stars 11.25 score 660 scripts 4 dependents
ropengov
eurostat:Tools for Eurostat Open Data
Tools to download data from the Eurostat database <https://ec.europa.eu/eurostat> together with search and manipulation utilities.
Maintained by Leo Lahti. Last updated 1 month ago.
242 stars 11.07 score 892 scripts 4 dependents
pik-piam
madrat:May All Data be Reproducible and Transparent (MADRaT) *
Provides a framework which should improve reproducibility and transparency in data processing. It provides functionality such as automatic metadata creation and management, rudimentary quality management, data caching, work-flow management and data aggregation. * The title is a wish, not a promise. By no means do we expect this package to deliver everything that is needed to achieve full reproducibility and transparency, but we believe that it supports efforts in this direction.
Maintained by Jan Philipp Dietrich. Last updated 11 days ago.
15 stars 11.03 score 83 scripts 38 dependents
t-kalinowski
keras:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
10.93 score 10k scripts 55 dependents
bioc
infercnv:Infer Copy Number Variation from Single-Cell RNA-Seq Data
Using single-cell RNA-Seq expression to visualize CNV in cells.
Maintained by Christophe Georgescu. Last updated 5 months ago.
software, copynumbervariation, variantdetection, structuralvariation, genomicvariation, genetics, transcriptomics, statisticalmethod, bayesian, hiddenmarkovmodel, singlecell, jags, cpp
601 stars 10.92 score 674 scripts
tylermorganwall
rayrender:Build and Raytrace 3D Scenes
Render scenes using pathtracing. Build 3D scenes out of spheres, cubes, planes, disks, triangles, cones, curves, line segments, cylinders, ellipsoids, and 3D models in the 'Wavefront' OBJ file format or the PLY Polygon File Format. Supports several material types, textures, multicore rendering, and tone-mapping. Based on the "Ray Tracing in One Weekend" book series. Peter Shirley (2018) <https://raytracing.github.io>.
Maintained by Tyler Morgan-Wall. Last updated 4 days ago.
631 stars 10.87 score 188 scripts 8 dependents
marce10
warbleR:Streamline Bioacoustic Analysis
Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools to explore and quantify acoustic signal structure. The package allows users to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
animal-acoustic-signals, audio-processing, bioacoustics, spectrogram, streamline-analysis, cpp
56 stars 10.86 score 270 scripts 4 dependents
friendly
vcdExtra:'vcd' Extensions and Additions
Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.
Maintained by Michael Friendly. Last updated 7 days ago.
categorical-data-visualization, generalized-linear-models, mosaic-plots
24 stars 10.85 score 472 scripts 3 dependents
r-lib
vdiffr:Visual Regression Testing and Graphical Diffing
An extension to the 'testthat' package that makes it easy to add graphical unit tests. It provides a Shiny application to manage the test cases.
Maintained by Lionel Henry. Last updated 5 months ago.
ggplot2, graphics, testthat, libpng, cpp
191 stars 10.84 score 254 scripts 5 dependents
moodymudskipper
flow:View and Browse Code Using Flow Diagrams
Visualize as flow diagrams the logic of functions, expressions or scripts in a static way or when running a call, visualize the dependencies between functions or between modules in a shiny app, and more.
Maintained by Antoine Fabri. Last updated 4 months ago.
405 stars 10.84 score 61 scripts
rstudio
pointblank:Data Validation and Organization of Metadata for Local and Remote Tables
Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
Maintained by Richard Iannone. Last updated 5 days ago.
data-assertions, data-checker, data-dictionaries, data-frames, data-inference, data-management, data-profiler, data-quality, data-validation, data-verification, database-tables, easy-to-understand, reporting-tool, schema-validation, testing-tools, yaml-configuration
942 stars 10.73 score 284 scripts
quanteda
spacyr:Wrapper to the 'spaCy' 'NLP' Library
An R wrapper to the 'Python' 'spaCy' 'NLP' library, from <https://spacy.io>.
Maintained by Kenneth Benoit. Last updated 2 months ago.
extract-entities, nlp, spacy, speech-tagging
253 stars 10.68 score 408 scripts 6 dependents
caseyyoungflesh
MCMCvis:Tools to Visualize, Manipulate, and Summarize MCMC Output
Performs key functions for MCMC analysis using minimal code - visualizes, manipulates, and summarizes MCMC output. Functions support simple and straightforward subsetting of model parameters within the calls, and produce presentable and 'publication-ready' output. MCMC output may be derived from Bayesian model output fit with Stan, NIMBLE, JAGS, and other software.
Maintained by Casey Youngflesh. Last updated 4 months ago.
38 stars 10.52 score 1.8k scripts 5 dependents
thinkr-open
attachment:Deal with Dependencies
Manage dependencies during package development. This can retrieve all dependencies that are used in ".R" files in the "R/" directory, in ".Rmd" files in the "vignettes/" directory and in 'roxygen2' documentation of functions. There is a function to update the "DESCRIPTION" file of your package with 'CRAN' packages or any other remote package. All functions to retrieve dependencies of ".R" scripts and ".Rmd" or ".qmd" files can be used independently of package development.
Maintained by Vincent Guyader. Last updated 17 days ago.
110 stars 10.44 score 48 scripts 5 dependents
ropensci
goodpractice:Advice on R Package Building
Give advice about good practices when building R packages. Advice includes functions and syntax to avoid, package structure, code complexity, code formatting, etc.
Maintained by Mark Padgham. Last updated 4 months ago.
467 stars 10.32 score 79 scripts 2 dependents
rstudio
blastula:Easily Send HTML Email Messages
Compose and send out responsive HTML email messages that render perfectly across a range of email clients and device sizes. Helper functions let the user insert embedded images, web link buttons, and 'ggplot2' plot objects into the message body. Messages can be sent through an 'SMTP' server, through the 'Posit Connect' service, or through the 'Mailgun' API service <https://www.mailgun.com/>.
Maintained by Richard Iannone. Last updated 9 months ago.
easy-to-use, email, html, markdown, responsive-email, smtp
552 stars 10.27 score 348 scripts 5 dependents
facebookexperimental
Robyn:Semi-Automated Marketing Mix Modeling (MMM) from Meta Marketing Science
Semi-Automated Marketing Mix Modeling (MMM) that aims to reduce human bias by means of ridge regression and evolutionary algorithms, enables actionable decision making by providing budget allocation and diminishing-returns curves, and allows ground-truth calibration to account for causation.
Maintained by Gufeng Zhou. Last updated 13 days ago.
adstocking, budget-allocation, cost-response-curve, econometrics, evolutionary-algorithm, gradient-based-optimisation, hyperparameter-optimization, marketing-mix-modeling, marketing-mix-modelling, marketing-science, mmm, ridge-regression
1.3k stars 10.27 score 95 scripts
insightsengineering
teal.modules.clinical:'teal' Modules for Standard Clinical Outputs
Provides user-friendly tools for creating and customizing clinical trial reports. By leveraging the 'teal' framework, this package provides 'teal' modules to easily create an interactive panel that allows for seamless adjustments to data presentation, thereby streamlining the creation of detailed and accurate reports.
Maintained by Dawid Kaledkowski. Last updated 1 month ago.
clinical-trials, modules, nest, outputs, shiny
35 stars 10.21 score 149 scripts
bioc
singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data
The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.
Maintained by Joshua David Campbell. Last updated 1 month ago.
singlecell, geneexpression, differentialexpression, alignment, clustering, immunooncology, batcheffect, normalization, qualitycontrol, dataimport, gui
182 stars 10.17 score 252 scripts
8-bit-sheep
googleAnalyticsR:Google Analytics API into R
Interact with the Google Analytics APIs <https://developers.google.com/analytics/>, including the Core Reporting API (v3 and v4), Management API, User Activity API, GA4's Data API and Admin API, and Multi-Channel Funnel API.
Maintained by Erik Grönroos. Last updated 7 months ago.
analytics, api, google, googleanalyticsr, googleauthr
262 stars 10.11 score 680 scripts 1 dependents
constantamateur
SoupX:Single Cell mRNA Soup eXterminator
Quantify, profile and remove ambient mRNA contamination (the "soup") from droplet based single cell RNA-seq experiments. Implements the method described in Young et al. (2018) <doi:10.1101/303727>.
Maintained by Matthew Daniel Young. Last updated 2 years ago.
266 stars 10.08 score 594 scripts 1 dependents
ropensci
vcr:Record 'HTTP' Calls to Disk
Records test suite 'HTTP' requests and replays them during future runs. A port of the Ruby gem of the same name (<https://github.com/vcr/vcr/>). Works by hooking into the 'webmockr' R package for matching 'HTTP' requests by various rules ('HTTP' method, 'URL', query parameters, headers, body, etc.), and then caching real 'HTTP' responses on disk in 'cassettes'. Subsequent 'HTTP' requests matching any previous requests in the same 'cassette' use a cached 'HTTP' response.
Maintained by Scott Chamberlain. Last updated 27 days ago.
http, https, api, web-services, curl, mock, mocking, http-mocking, testing, testing-tools, tdd, unit-testing, vcr
77 stars 10.06 score 165 scripts
bioc
MOFA2:Multi-Omics Factor Analysis v2
The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, visualisation, imputation, etc. are available.
Maintained by Ricard Argelaguet. Last updated 5 months ago.
dimensionreduction, bayesian, visualization, factor-analysis, mofa, multi-omics
326 stars 10.03 score 502 scripts
pecanproject
PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Istem Fer. Last updated 8 hours ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants, jags, cpp
216 stars 9.97 score 20 scripts 2 dependents
reditorsupport
languageserver:Language Server Protocol
An implementation of the Language Server Protocol for R. The Language Server protocol is used by an editor client to integrate features like auto completion. See <https://microsoft.github.io/language-server-protocol/> for details.
Maintained by Randy Lai. Last updated 1 year ago.
607 stars 9.93 score 207 scripts 1 dependents
dataoneorg
dataone:R Interface to the DataONE REST API
Provides read and write access to data and metadata from the DataONE network <https://www.dataone.org> of data repositories. Each DataONE repository implements a consistent repository application programming interface. Users call methods in R to access these remote repository functions, such as methods to query the metadata catalog, get access to metadata for particular data packages, and read the data objects from the data repository. Users can also insert and update data objects on repositories that support these methods.
Maintained by Matthew B. Jones. Last updated 3 years ago.
36 stars 9.93 score 472 scripts 3 dependents
r-lib
ymlthis:Write 'YAML' for 'R Markdown', 'bookdown', 'blogdown', and More
Write 'YAML' front matter for R Markdown and related documents. Work with 'YAML' objects more naturally and write the resulting 'YAML' to your clipboard or to 'YAML' files related to your project.
Maintained by Malcolm Barrett. Last updated 3 years ago.
165 stars 9.92 score 196 scripts 14 dependents
n8thangreen
BCEA:Bayesian Cost Effectiveness Analysis
Produces an economic evaluation of a sample of suitable variables of cost and effectiveness / utility for two or more interventions, e.g. from a Bayesian model in the form of MCMC simulations. This package computes the most cost-effective alternative and produces graphical summaries and probabilistic sensitivity analysis, see Baio et al (2017) <doi:10.1007/978-3-319-55718-2>.
Maintained by Gianluca Baio. Last updated 2 months ago.
3 stars 9.90 score 243 scripts 3 dependents
rstudio
distill:'R Markdown' Format for Scientific and Technical Writing
Scientific and technical article format for the web. 'Distill' articles feature attractive, reader-friendly typography, flexible layout options for visualizations, and full support for footnotes and citations.
Maintained by Christophe Dervieux. Last updated 1 year ago.
423 stars 9.85 score 402 scripts 6 dependents
statdivlab
corncob:Count Regression for Correlated Observations with the Beta-Binomial
Statistical modeling for correlated count data using the beta-binomial distribution, described in Martin et al. (2020) <doi:10.1214/19-AOAS1283>. It allows for both mean and overdispersion covariates.
Maintained by Amy D Willis. Last updated 13 days ago.
106 stars 9.82 score 248 scripts 1 dependent
cjvanlissa
worcs:Workflow for Open Reproducible Code in Science
Create reproducible and transparent research projects in 'R'. This package is based on the Workflow for Open Reproducible Code in Science (WORCS), a step-by-step procedure based on best practices for Open Science. It includes an 'RStudio' project template, several convenience functions, and all dependencies required to make your project reproducible and transparent. WORCS is explained in the tutorial paper by Van Lissa, Brandmaier, Brinkman, Lamprecht, Struiksma, & Vreede (2021). <doi:10.3233/DS-210031>.
Maintained by Caspar J. Van Lissa. Last updated 2 days ago.
83 stars 9.77 score 59 scripts 1 dependent
insightsengineering
teal.modules.general:General Modules for 'teal' Applications
Prebuilt 'shiny' modules containing tools for viewing data, visualizing data, understanding missing and outlier values within your data and performing simple data analysis. This extends the 'teal' framework that supports reproducible research and analysis.
Maintained by Dawid Kaledkowski. Last updated 1 month ago.
general-purpose modules nest shiny
13 stars 9.74 score 71 scripts
lorenzwalthert
precommit:Pre-Commit Hooks
Useful git hooks for R building on top of the multi-language framework 'pre-commit' for hook management. This package provides git hooks for common tasks like formatting files with 'styler' or spell checking as well as wrapper functions to access the 'pre-commit' executable.
Maintained by Lorenz Walthert. Last updated 2 days ago.
255 stars 9.73 score 10 scripts
pecanproject
PEcAnRTM:PEcAn Functions Used for Radiative Transfer Modeling
Functions for performing forward runs and inversions of radiative transfer models (RTMs). Inversions can be performed using maximum likelihood, or more complex hierarchical Bayesian methods. Underlying numerical analyses are optimized for speed using Fortran code.
Maintained by Alexey Shiklomanov. Last updated 8 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants fortran jags cpp
216 stars 9.72 score 132 scripts
ropensci
rdflib:Tools to Manipulate and Query Semantic Data
The Resource Description Framework, or 'RDF', is a widely used data representation model that forms the cornerstone of the Semantic Web. 'RDF' represents data as a graph rather than the familiar data table or rectangle of relational databases. The 'rdflib' package provides a friendly and concise user interface for performing common tasks on 'RDF' data, such as reading, writing and converting between the various serializations of 'RDF' data, including 'rdfxml', 'turtle', 'nquads', 'ntriples', and 'json-ld'; creating new 'RDF' graphs, and performing graph queries using 'SPARQL'. This package wraps the low-level 'redland' R package which provides direct bindings to the 'redland' C library. Additionally, the package supports the newer and more developer-friendly 'JSON-LD' format through the 'jsonld' package. The package interface takes inspiration from the Python 'rdflib' library.
Maintained by Carl Boettiger. Last updated 8 months ago.
57 stars 9.59 score 123 scripts 7 dependents
jamovi
jmv:The 'jamovi' Analyses
A suite of common statistical methods such as descriptives, t-tests, ANOVAs, regression, correlation matrices, proportion tests, contingency tables, and factor analysis. This package is also useable from the 'jamovi' statistical spreadsheet (see <https://www.jamovi.org> for more information).
Maintained by Jonathon Love. Last updated 29 days ago.
59 stars 9.58 score 440 scripts
ndphillips
FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees
Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.
Maintained by Hansjoerg Neth. Last updated 5 months ago.
136 stars 9.53 score 144 scripts
stemangiola
tidyseurat:Brings Seurat to the Tidyverse
Creates an invisible layer that allows users to view a 'Seurat' object as a tibble and interact seamlessly with the tidyverse.
Maintained by Stefano Mangiola. Last updated 8 months ago.
assaydomain infrastructure rnaseq differentialexpression geneexpression normalization clustering qualitycontrol sequencing transcription transcriptomics dplyr ggplot2 pca purrr sct seurat single-cell single-cell-rna-seq tibble tidyr tidyverse transcripts tsne umap
159 stars 9.48 score 398 scripts 1 dependent
philchalmers
mirtCAT:Computerized Adaptive Testing with Multidimensional Item Response Theory
Provides tools to generate HTML interfaces for adaptive and non-adaptive tests using the shiny package (Chalmers (2016) <doi:10.18637/jss.v071.i05>). Suitable for applying unidimensional and multidimensional computerized adaptive tests (CAT) using item response theory methodology and for creating simple questionnaire forms to collect response data directly in R. Additionally, optimal test designs (e.g., "shadow testing") are supported for tests that contain a large number of item selection constraints. Finally, the package contains tools useful for performing Monte Carlo simulations for studying test item banks.
Maintained by Phil Chalmers. Last updated 5 months ago.
95 stars 9.47 score 62 scripts 3 dependents
dynverse
anndata:'anndata' for R
A 'reticulate' wrapper for the Python package 'anndata'. Provides a scalable way of keeping track of data and learned annotations. Used to read from and write to the h5ad file format.
Maintained by Robrecht Cannoodt. Last updated 26 days ago.
44 stars 9.47 score 772 scripts 3 dependents
nealrichardson
httptest:A Test Environment for HTTP Requests
Testing and documenting code that communicates with remote servers can be painful. Dealing with authentication, server state, and other complications can make testing seem too costly to bother with. But it doesn't need to be that hard. This package enables one to test all of the logic on the R side of the API in your package without requiring access to the remote service. Importantly, it provides three contexts that mock the network connection in different ways, as well as testing functions to assert that HTTP requests were---or were not---made. It also allows one to safely record real API responses to use as test fixtures. The ability to save responses and load them offline also enables one to write vignettes and other dynamic documents that can be distributed without access to a live server.
Maintained by Neal Richardson. Last updated 1 year ago.
81 stars 9.46 score 276 scripts 1 dependent
thinkr-open
fusen:Build a Package from Rmarkdown Files
Use the 'Rmarkdown First' method to build your package. Start your package with documentation, functions, examples and tests in the same unique file. Everything can be set from the Rmarkdown template file provided in your project, then inflated as a package. Inflating the template copies the relevant chunks and sections into the appropriate files required for package development.
Maintained by Vincent Guyader. Last updated 2 months ago.
163 stars 9.45 score 35 scripts
extendr
rextendr:Call Rust Code from R using the 'extendr' Crate
Provides functions to compile and load Rust code from R, similar to how 'Rcpp' or 'cpp11' allow easy interfacing with C++ code. Also provides helper functions to create R packages that use Rust code. Under the hood, the Rust crate 'extendr' is used to do all the heavy lifting.
Maintained by Ilia Kosenkov. Last updated 4 days ago.
207 stars 9.45 score 61 scripts
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 12 months ago.
audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision
118 stars 9.40 score 76 scripts
ropensci
DataPackageR:Construct Reproducible Analytic Data Sets as R Packages
A framework to help construct R data packages in a reproducible manner. Potentially time consuming processing of raw data sets into analysis ready data sets is done in a reproducible manner and decoupled from the usual 'R CMD build' process so that data sets can be processed into R objects in the data package and the data package can then be shared, built, and installed by others without the need to repeat computationally costly data processing. The package maintains data provenance by turning the data processing scripts into package vignettes, as well as enforcing documentation and version checking of included data objects. Data packages can be version controlled on 'GitHub', and used to share data for manuscripts, collaboration and reproducible research.
Maintained by Dave Slager. Last updated 7 months ago.
156 stars 9.38 score 72 scripts
nealrichardson
httptest2:Test Helpers for 'httr2'
Testing and documenting code that communicates with remote servers can be painful. This package helps with writing tests for packages that use 'httr2'. It enables testing all of the logic on the R side of the API without requiring access to the remote service, and it also allows recording real API responses to use as test fixtures. The ability to save responses and load them offline also enables writing vignettes and other dynamic documents that can be distributed without access to a live server.
Maintained by Neal Richardson. Last updated 9 months ago.
33 stars 9.37 score 95 scripts 1 dependent
cbielow
PTXQC:Quality Report Generation for MaxQuant and mzTab Results
Generates Proteomics (PTX) quality control (QC) reports for shotgun LC-MS data analyzed with the MaxQuant software suite (from .txt files) or mzTab files (ideally from OpenMS 'QualityControl' tool). Reports are customizable (target thresholds, subsetting) and available in HTML or PDF format. Published in J. Proteome Res., Proteomics Quality Control: Quality Control Software for MaxQuant Results (2015) <doi:10.1021/acs.jproteome.5b00780>.
Maintained by Chris Bielow. Last updated 1 year ago.
drag-and-drop hacktoberfest heatmap match-between-runs maxquant metric mztab openms proteomics quality-control quality-metrics report
42 stars 9.35 score 105 scripts 1 dependent
rstudio
tfdatasets:Interface to 'TensorFlow' Datasets
Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/guide> for additional details.
Maintained by Tomasz Kalinowski. Last updated 19 days ago.
34 stars 9.32 score 656 scripts 3 dependents
insightsengineering
teal.slice:Filter Module for 'teal' Applications
Data filtering module for 'teal' applications. Allows for interactive filtering of data stored in 'data.frame' and 'MultiAssayExperiment' objects. Also displays filtered and unfiltered observation counts.
Maintained by Dawid Kaledkowski. Last updated 2 months ago.
11 stars 9.27 score 3 scripts 6 dependents
whipson
maestro:Orchestration of Data Pipelines
Framework for creating and orchestrating data pipelines. Organize, orchestrate, and monitor multiple pipelines in a single project. Use tags to decorate functions with scheduling parameters and configuration.
Maintained by Will Hipson. Last updated 6 days ago.
119 stars 9.20 score 150 scripts
insightsengineering
teal.widgets:'shiny' Widgets for 'teal' Applications
Collection of 'shiny' widgets to support 'teal' applications. Enables the manipulation of application layout and plot or table settings.
Maintained by Dawid Kaledkowski. Last updated 2 months ago.
5 stars 9.14 score 34 scripts 8 dependents
bodkan
slendr:A Simulation Framework for Spatiotemporal Population Genetics
A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can therefore be constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.
Maintained by Martin Petr. Last updated 3 days ago.
popgen population-genetics simulations spatial-statistics
56 stars 9.13 score 88 scripts
bioc
basilisk:Freezing Python Dependencies Inside Bioconductor Packages
Installs a self-contained conda instance that is managed by the R/Bioconductor installation machinery. This aims to provide a consistent Python environment that can be used reliably by Bioconductor packages. Functions are also provided to enable smooth interoperability of multiple Python environments in a single R session.
Maintained by Aaron Lun. Last updated 11 days ago.
9.12 score 75 scripts 39 dependents
stscl
gdverse:Analysis of Spatial Stratified Heterogeneity
Detecting spatial associations based on the concept of spatial stratified heterogeneity while also considering spatial dependencies, spatial interpretability, complex spatial interactions, and robust spatial stratification. In addition, it supports the spatial stratified heterogeneity family described in Lv et al. (2025) <doi:10.1111/tgis.70032>.
Maintained by Wenbo Lv. Last updated 2 days ago.
geographical-detector geoinformatics geospatial-analysis spatial-statistics spatial-stratified-heterogeneity cpp
33 stars 9.10 score 41 scripts 2 dependents
didiermurillof
FielDHub:A Shiny App for Design of Experiments in Life Sciences
A shiny design of experiments (DOE) app that aids in the creation of traditional, un-replicated, augmented and partially-replicated designs applied to agriculture, plant breeding, forestry, animal and biological sciences.
Maintained by Didier Murillo. Last updated 8 months ago.
agricultural breeding design doe experimental plantbreeding shiny
47 stars 9.09 score 70 scripts 1 dependent
bioc
BatchQC:Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Maintained by Jessica Anderson. Last updated 13 days ago.
batcheffect graphandnetwork microarray normalization principalcomponent sequencing software visualization qualitycontrol rnaseq preprocessing differentialexpression immunooncology
7 stars 9.06 score 54 scripts
pecanproject
PEcAn.all:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 8 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp
216 stars 9.02 score 266 scripts
rstudio
shinytest:Test Shiny Apps
Please see the shinytest to shinytest2 migration guide at <https://rstudio.github.io/shinytest2/articles/z-migration.html>.
Maintained by Winston Chang. Last updated 10 months ago.
225 stars 9.02 score 352 scripts
bioc
scPipe:Pipeline for single cell multi-omic data pre-processing
A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.
Maintained by Shian Su. Last updated 3 months ago.
immunooncology software sequencing rnaseq geneexpression singlecell visualization sequencematching preprocessing qualitycontrol genomeannotation dataimport curl bzip2 xz-utils zlib cpp
68 stars 9.02 score 84 scripts
r-spatial
link2GI:Linking Geographic Information Systems, Remote Sensing and Other Command Line Tools
Functions and tools for using open GIS and remote sensing command-line interfaces in a reproducible environment.
Maintained by Chris Reudenbach. Last updated 4 months ago.
26 stars 8.99 score 78 scripts 1 dependent
appsilon
rhino:A Framework for Enterprise Shiny Applications
A framework that supports creating and extending enterprise Shiny applications using best practices.
Maintained by Kamil Żyła. Last updated 3 days ago.
305 stars 8.99 score 145 scripts
pharmar
riskmetric:Risk Metrics for Evaluating R Packages
Facilities for assessing R packages against a number of metrics to help quantify their robustness.
Maintained by Eli Miller. Last updated 7 days ago.
166 stars 8.98 score 43 scripts
cjbarrie
academictwitteR:Access the Twitter Academic Research Product Track V2 API Endpoint
Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side.
Maintained by Christopher Barrie. Last updated 2 years ago.
275 stars 8.94 score 177 scripts
guangchuangyu
badger:Badge for R Package
Query information and generate badges for use in README and GitHub Pages.
Maintained by Guangchuang Yu. Last updated 9 months ago.
197 stars 8.92 score 225 scripts 5 dependents
azure
azuremlsdk:Interface to the 'Azure Machine Learning' 'SDK'
Interface to the 'Azure Machine Learning' Software Development Kit ('SDK'). Data scientists can use the 'SDK' to train, deploy, automate, and manage machine learning models on the 'Azure Machine Learning' service. To learn more about 'Azure Machine Learning' visit the website: <https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml>.
Maintained by Diondra Peck. Last updated 3 years ago.
amlcompute azure azure-machine-learning azureml dsi machine-learning rstudio sdk-r
105 stars 8.91 score 221 scripts
ajrgodfrey
BrailleR:Improved Access for Blind Users
Blind users do not have access to the graphical output from R without printing the content of graphics windows to an embosser of some kind. This is not as immediate as is required for efficient access to statistical output. The functions here are created so that blind people can make even better use of R. This includes the text descriptions of graphs, convenience functions to replace the functionality offered in many GUI front ends, and experimental functionality for optimising graphical content to prepare it for embossing as tactile images.
Maintained by A. Jonathan R. Godfrey. Last updated 12 months ago.
123 stars 8.90 score 143 scripts
tomkellygenetics
leiden:R Implementation of Leiden Clustering Algorithm
Implements the 'Python leidenalg' module to be called in R. Enables clustering using the Leiden algorithm for partitioning a graph into communities. See the 'Python' repository for more details: <https://github.com/vtraag/leidenalg> Traag et al (2018) From Louvain to Leiden: guaranteeing well-connected communities. <arXiv:1810.08473>.
Maintained by S. Thomas Kelly. Last updated 10 months ago.
38 stars 8.89 score 180 scripts 3 dependents
pik-piam
remind2:The REMIND R package (2nd generation)
Contains the REMIND-specific routines for data and model output manipulation.
Maintained by Renato Rodrigues. Last updated 3 days ago.
8.87 score 161 scripts 5 dependents
dmi3kno
polite:Be Nice on the Web
Be responsible when scraping data from websites by following polite principles: introduce yourself, ask for permission, take it slowly and never ask twice.
Maintained by Dmytro Perepolkin. Last updated 2 years ago.
crawler memoise rate-limiter robotstxt rvest scraper webscraping
327 stars 8.86 score 596 scripts 5 dependents
pecanproject
PEcAn.workflow:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides workhorse functions that can be used to run the major steps of a PEcAn analysis.
Maintained by David LeBauer. Last updated 8 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp
216 stars 8.85 score 15 scripts 4 dependents
rjournal
rjtools:Preparing, Checking, and Submitting Articles to the 'R Journal'
Create an 'R Journal' 'Rmarkdown' template article, that will generate html and pdf versions of your paper. Check that the paper folder has all the required components needed for submission. Examples of 'R Journal' publications can be found at <https://journal.r-project.org>.
Maintained by Di Cook. Last updated 2 months ago.
33 stars 8.81 score 37 scripts 1 dependent
ropengov
regions:Processing Regional Statistics
Validating sub-national statistical typologies, re-coding across standard typologies of sub-national statistics, and making valid aggregate level imputation, re-aggregation, re-weighting and projection down to lower hierarchical levels to create meaningful data panels and time series.
Maintained by Daniel Antal. Last updated 3 years ago.
observatory regions ropengov statistics
12 stars 8.81 score 67 scripts 5 dependents
dvats
mcmcse:Monte Carlo Standard Errors for MCMC
Provides tools for computing Monte Carlo standard errors (MCSE) in Markov chain Monte Carlo (MCMC) settings. MCSE computation for expectation and quantile estimators is supported as well as multivariate estimations. The package also provides functions for computing effective sample size and for plotting Monte Carlo estimates versus sample size.
Maintained by Dootika Vats. Last updated 2 months ago.
effective-sample-sizemcmcoutput-aopenblascpp
12 stars 8.77 score 314 scripts 17 dependents
pecanproject
PEcAn.data.remote:PEcAn Functions Used for Extracting Remote Sensing Data
PEcAn module for processing remote data. Python module requirements: requests, json, re, ast, pandas, sys. If any of these modules are missing, install using pip install <module name>.
Maintained by Bailey Morrison. Last updated 8 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants
216 stars 8.77 score 6 scripts 5 dependents
insightsengineering
rbmi:Reference Based Multiple Imputation
Implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). In particular, this package supports deterministic conditional mean imputation and jackknifing as described in Wolbers et al. (2022) <doi:10.1002/pst.2234>, Bayesian multiple imputation as described in Carpenter et al. (2013) <doi:10.1080/10543406.2013.834911>, and bootstrapped maximum likelihood imputation as described in von Hippel and Bartlett (2021) <doi:10.1214/20-STS793>.
Maintained by Isaac Gravestock. Last updated 1 month ago.
18 stars 8.76 score 33 scripts 1 dependent
cynkra
fledge:Smoother Change Tracking and Versioning for R Packages
Streamlines the process of updating changelogs (NEWS.md) and versioning R packages developed in git repositories.
Maintained by Kirill Müller. Last updated 3 months ago.
188 stars 8.73 score 10 scripts
bioc
memes:motif matching, comparison, and de novo discovery using the MEME Suite
A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.
Maintained by Spencer Nystrom. Last updated 5 months ago.
dataimport functionalgenomics generegulation motifannotation motifdiscovery sequencematching software
50 stars 8.69 score 117 scripts 1 dependent
rstudio
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
54 stars 8.63 score 221 scripts 3 dependents
t-kalinowski
tfautograph:Autograph R for 'Tensorflow'
Translate R control flow expressions into 'Tensorflow' graphs.
Maintained by Tomasz Kalinowski. Last updated 2 years ago.
18 stars 8.62 score 145 scripts 75 dependents
ropensci
datapack:A Flexible Container to Transport and Manipulate Data and Associated Resources
Provides a flexible container to transport and manipulate complex sets of data. These data may consist of multiple data files and associated metadata and ancillary files. Individual data objects have associated system-level metadata, and data files are linked together using the OAI-ORE standard resource map which describes the relationships between the files. The OAI-ORE standard is described at <https://www.openarchives.org/ore/>. Data packages can be serialized and transported as structured files that have been created following the BagIt specification. The BagIt specification is described at <https://tools.ietf.org/html/draft-kunze-bagit-08>.
Maintained by Matthew B. Jones. Last updated 3 years ago.
43 stars 8.55 score 195 scripts 4 dependents
ppbds
tutorial.helpers:Helper Functions for Creating Tutorials
Helper functions for creating, editing, and testing tutorials created with the 'learnr' package. Provides a simple method for allowing students to download their answers to tutorial questions. For examples of its use, see the 'r4ds.tutorials' package.
Maintained by David Kane. Last updated 14 days ago.
5 stars 8.50 score 152 scripts 1 dependent
bioc
BgeeDB:Annotation and gene expression data retrieval from Bgee database. TopAnat, an anatomical entities Enrichment Analysis tool for UBERON ontology
A package for annotation and gene expression data download from the Bgee database, and TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.
Maintained by Julien Wollbrett. Last updated 5 months ago.
software dataimport sequencing geneexpression microarray go genesetenrichment bioinformatics enrichment-analysis rna-seq scrna-seq single-cell
15 stars 8.46 score 19 scripts 1 dependent
samuel-marsh
scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing
Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to provide 1) Customized visualizations for aid in ease of use and to create more aesthetic and functional visuals. 2) Improve speed/reproducibility of common tasks/pieces of code in scRNA-seq analysis with a single or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.
Maintained by Samuel Marsh. Last updated 3 months ago.
customization ggplot2 scrna-seq seurat single-cell single-cell-genomics single-cell-rna-seq visualization
246 stars 8.45 score 1.1k scripts
bioc
lefser:R implementation of the LEfSe method for microbiome biomarker discovery
lefser is the R implementation of the popular microbiome biomarker discovery tool, LEfSe. It uses the Kruskal-Wallis test, Wilcoxon rank-sum test, and Linear Discriminant Analysis to find biomarkers from two-level classes (and optional sub-classes).
Maintained by Sehyun Oh. Last updated 1 month ago.
software sequencing differentialexpression microbiome statisticalmethod classification bioconductor-package r01ca230551
56 stars 8.44 score 56 scripts
rstudio
tfestimators:Interface to 'TensorFlow' Estimators
Interface to 'TensorFlow' Estimators <https://www.tensorflow.org/guide/estimator>, a high-level API that provides implementations of many different model types including linear models and deep neural networks.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
57 stars 8.42 score 170 scripts
bioc
projectR:Functions for the projection of weights from PCA, CoGAPS, NMF, correlation, and clustering
Functions for the projection of data into the spaces defined by PCA, CoGAPS, NMF, correlation, and clustering.
Maintained by Genevieve Stein-OBrien. Last updated 13 days ago.
functionalprediction generegulation biologicalquestion software
62 stars 8.42 score 70 scripts
insightsengineering
teal.transform:Functions for Extracting and Merging Data in the 'teal' Framework
A standardized user interface for column selection that facilitates dataset merging in the 'teal' framework.
Maintained by Dawid Kaledkowski. Last updated 2 months ago.
3 stars 8.39 score 9 scripts 4 dependents
carmonalab
scGate:Marker-Based Cell Type Purification for Single-Cell Sequencing Data
A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. 'scGate' automatizes marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. Briefly, 'scGate' takes as input: i) a gene expression matrix stored in a 'Seurat' object and ii) a “gating model” (GM), consisting of a set of marker genes that define the cell population of interest. The GM can be as simple as a single marker gene, or a combination of positive and negative markers. More complex GMs can be constructed in a hierarchical fashion, akin to gating strategies employed in flow cytometry. 'scGate' evaluates the strength of signature marker expression in each cell using the rank-based method 'UCell', and then performs k-nearest neighbor (kNN) smoothing by calculating the mean 'UCell' score across neighboring cells. kNN-smoothing aims at compensating for the large degree of sparsity in scRNA-seq data. Finally, a universal threshold over kNN-smoothed signature scores is applied in binary decision trees generated from the user-provided gating model, to annotate cells as either “pure” or “impure”, with respect to the cell population of interest. See the related publication Andreatta et al. (2022) <doi:10.1093/bioinformatics/btac141>.
Maintained by Massimo Andreatta. Last updated 2 months ago.
filteringmarker-genesscgatesignaturessingle-cell
106 stars 8.38 score 163 scriptsmucollective
multiverse:Create 'multiverse analysis' in R
Implement 'multiverse' style analyses (Steegen S., Tuerlinckx F, Gelman A., Vanpaemal, W., 2016) <doi:10.1177/1745691616658637> to show the robustness of statistical inference. 'Multiverse analysis' is a philosophy of statistical reporting where paper authors report the outcomes of many different statistical analyses in order to show how fragile or robust their findings are. The 'multiverse' package (Sarma A., Kale A., Moon M., Taback N., Chevalier F., Hullman J., Kay M., 2021) <doi:10.31219/osf.io/yfbwm> allows users to concisely and flexibly implement 'multiverse-style' analysis, which involves declaring alternate ways of performing an analysis step, in R and R Notebooks.
Maintained by Abhraneel Sarma. Last updated 4 months ago.
62 stars 8.37 score 42 scriptscefet-rj-dal
harbinger:A Unified Time Series Event Detection Framework
By analyzing time series, it is possible to observe significant changes in the behavior of observations that frequently characterize events. Events present themselves as anomalies, change points, or motifs. In the literature, there are several methods for detecting events. However, searching for a suitable time series method is a complex task, especially considering that the nature of events is often unknown. This work presents Harbinger, a framework for integrating and analyzing event detection methods. Harbinger contains several state-of-the-art methods described in Salles et al. (2020) <doi:10.5753/sbbd.2020.13626>.
Maintained by Eduardo Ogasawara. Last updated 4 months ago.
18 stars 8.32 score 216 scriptstaylor-arnold
cleanNLP:A Tidy Data Model for Natural Language Processing
Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may make use of the 'udpipe' back end with no external dependencies, or a Python back end with 'spaCy' <https://spacy.io>. Exposed annotation tasks include tokenization, part of speech tagging, named entity recognition, and dependency parsing.
Maintained by Taylor B. Arnold. Last updated 11 months ago.
corenlpnatural-language-processingspacy
215 stars 8.29 score 229 scriptsbioc
crisprDesign:Comprehensive design of CRISPR gRNAs for nucleases and base editors
Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.
Maintained by Jean-Philippe Fortin. Last updated 26 days ago.
crisprfunctionalgenomicsgenetargetbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomics-analysisgrnagrna-sequencegrna-sequencessgrnasgrna-design
22 stars 8.28 score 80 scripts 3 dependentseblondel
zen4R:Interface to 'Zenodo' REST API
Provides an Interface to 'Zenodo' (<https://zenodo.org>) REST API, including management of depositions, attribution of DOIs by 'Zenodo' and upload and download of files.
Maintained by Emmanuel Blondel. Last updated 1 months ago.
apidatacitedepositionsdepositsdoifairzenodo
45 stars 8.25 score 76 scripts 1 dependentsramikrispin
coronavirus:The 2019 Novel Coronavirus COVID-19 (2019-nCoV) Dataset
Provides a daily summary of the Coronavirus (COVID-19) cases by state/province. Data source: Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) Coronavirus <https://systems.jhu.edu/research/public-health/ncov/>.
Maintained by Rami Krispin. Last updated 2 years ago.
covid-19covid19covid19-datadataset
499 stars 8.25 score 716 scriptsmi2-warsaw
FSelectorRcpp:'Rcpp' Implementation of 'FSelector' Entropy-Based Feature Selection Algorithms with a Sparse Matrix Support
'Rcpp' (free of 'Java'/'Weka') implementation of 'FSelector' entropy-based feature selection algorithms based on an MDL discretization (Fayyad U. M., Irani K. B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference on Uncertainty in Artificial Intelligence (IJCAI93), pages 1022-1029, Chambery, France, 1993.) <https://www.ijcai.org/Proceedings/93-2/Papers/022.pdf> with sparse matrix support.
Maintained by Zygmunt Zawadzki. Last updated 6 months ago.
entropyfeature-selectionrcppsparse-matrixcpp
35 stars 8.22 score 78 scripts 1 dependentsr-dbi
DBItest:Testing DBI Backends
A helper that tests DBI back ends for conformity to the interface.
Maintained by Kirill Müller. Last updated 14 days ago.
24 stars 8.21 score 11 scriptsmrc-ide
malariasimulation:An individual based model for malaria
Specifies the latest and greatest malaria model.
Maintained by Giovanni Charles. Last updated 1 months ago.
17 stars 8.19 score 146 scriptssafetygraphics
safetyGraphics:Interactive Graphics for Monitoring Clinical Trial Safety
A framework for evaluation of clinical trial safety. Users can interactively explore their data using the included 'Shiny' application.
Maintained by Jeremy Wildfire. Last updated 2 years ago.
99 stars 8.19 score 111 scriptspecanproject
PEcAnAssimSequential:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Mike Dietze. Last updated 8 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.14 score 35 scriptsalinetalhouk
diceR:Diverse Cluster Ensemble in R
Performs cluster analysis using an ensemble clustering framework, Chiu & Talhouk (2018) <doi:10.1186/s12859-017-1996-y>. Results from a diverse set of algorithms are pooled together using methods such as majority voting, K-Modes, LinkCluE, and CSPA. There are options to compare cluster assignments across algorithms using internal and external indices, visualizations such as heatmaps, and significance testing for the existence of clusters.
Maintained by Derek Chiu. Last updated 2 months ago.
37 stars 8.13 score 60 scripts 3 dependentsnceas
metajam:Easily Download Data and Metadata from 'DataONE'
A set of tools to foster the development of reproducible analytical workflow by simplifying the download of data and metadata from 'DataONE' (<https://www.dataone.org>) and easily importing this information into R.
Maintained by Julien Brun. Last updated 7 months ago.
datadata-analysismetadatarepositories
16 stars 8.13 score 75 scriptsr-hyperspec
hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)
Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.
Maintained by Claudia Beleites. Last updated 10 months ago.
data-wranglinghyperspectralimaginginfrarednmrramanspectroscopyuv-visxrf
16 stars 8.10 score 233 scripts 2 dependentsramiromagno
gwasrapidd:'REST' 'API' Client for the 'NHGRI'-'EBI' 'GWAS' Catalog
'GWAS' R 'API' Data Download. This package provides easy access to the 'NHGRI'-'EBI' 'GWAS' Catalog data by accessing the 'REST' 'API' <https://www.ebi.ac.uk/gwas/rest/docs/api/>.
Maintained by Ramiro Magno. Last updated 1 years ago.
thirdpartyclientbiomedicalinformaticsgenomewideassociationsnpassociation-studiesgwas-cataloghumanrest-clienttraittrait-ontology
95 stars 8.10 score 49 scripts 1 dependentsmoodymudskipper
boomer:Debugging Tools to Inspect the Intermediate Steps of a Call
Provides debugging tools that let you inspect the intermediate results of a call. The output looks as if we explode a call into its parts, hence the package name.
Maintained by Antoine Fabri. Last updated 4 days ago.
138 stars 8.09 score 21 scriptsgfellerlab
SuperCell:Simplification of scRNA-seq data by merging together similar cells
Aggregates large single-cell data into metacell dataset by merging together gene expression of very similar cells.
Maintained by The package maintainer. Last updated 9 months ago.
softwarecoarse-grainingscrna-seq-analysisscrna-seq-data
72 stars 8.08 score 93 scriptsmrc-ide
dust:Iterate Multiple Realisations of Stochastic Models
An engine for simulation of stochastic models. Includes support for running stochastic models in parallel, either with shared or varying parameters. Simulations are run efficiently in compiled code and can be run with a fraction of simulated states returned to R, allowing control over memory usage. Support is provided for building a bootstrap particle filter for performing Sequential Monte Carlo (e.g., Gordon et al. 1993 <doi:10.1049/ip-f-2.1993.0015>). The core of the simulation engine is the 'xoshiro256**' algorithm (Blackman and Vigna <arXiv:1805.01407>), and the package is further described in FitzJohn et al 2021 <doi:10.12688/wellcomeopenres.16466.2>.
Maintained by Rich FitzJohn. Last updated 11 days ago.
18 stars 8.07 score 60 scripts 3 dependentsbioc
velociraptor:Toolkit for Single-Cell Velocity
This package provides Bioconductor-friendly wrappers for RNA velocity calculations in single-cell RNA-seq data. We use the basilisk package to manage Conda environments, and the zellkonverter package to convert data structures between SingleCellExperiment (R) and AnnData (Python). The information produced by the velocity methods is stored in the various components of the SingleCellExperiment class.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
singlecellgeneexpressionsequencingcoveragerna-velocity
55 stars 8.06 score 52 scriptsbioc
FLAMES:FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data
Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. FLAMES provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.
Maintained by Changqing Wang. Last updated 9 hours ago.
rnaseqsinglecelltranscriptomicsdataimportdifferentialsplicingalternativesplicinggeneexpressionlongreadzlibcurlbzip2xz-utilscpp
33 stars 8.04 score 12 scriptsmazamascience
MazamaSpatialUtils:Spatial Data Download and Utility Functions
A suite of conversion functions to create internally standardized spatial polygons data frames. Utility functions use these data sets to return values such as country, state, time zone, watershed, etc. associated with a set of longitude/latitude pairs. (They also make cool maps.)
Maintained by Jonathan Callahan. Last updated 5 months ago.
5 stars 8.01 score 282 scripts 2 dependentsbioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), and MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 13 days ago.
networkinferencenetworkgeneregulationgeneexpressiontranscriptionmicroarraygraphandnetworkgene-regulatory-networktranscription-factors
105 stars 7.98 scorescasanova
f1dataR:Access Formula 1 Data
Obtain Formula 1 data via the 'Jolpica API' <https://jolpi.ca> and the unofficial API <https://www.formula1.com/en/timing/f1-live> via the 'fastf1' 'Python' library <https://docs.fastf1.dev/>.
Maintained by Santiago Casanova. Last updated 4 days ago.
60 stars 7.97 score 26 scriptsdasonk
docstring:Provides Docstring Capabilities to R Functions
Provides the ability to display something analogous to Python's docstrings within R. By allowing the user to document their functions as comments at the beginning of their function without requiring putting the function into a package we allow more users to easily provide documentation for their functions. The documentation can be viewed just like any other help files for functions provided by packages as well.
Maintained by Dason Kurkiewicz. Last updated 3 years ago.
devtoolsdocstringdocumentationdocumentation-toolroxygen-style
58 stars 7.96 score 305 scripts 1 dependentsrstudio
shinymeta:Export Domain Logic from Shiny using Meta-Programming
Provides tools for capturing logic in a Shiny app and exposing it as code that can be run outside of Shiny (e.g., from an R console). It also provides tools for bundling both the code and results to the end user.
Maintained by Carson Sievert. Last updated 11 months ago.
224 stars 7.94 score 62 scripts 7 dependentsbioc
scDD:Mixture modeling of single-cell RNA-seq data to identify genes with differential distributions
This package implements a method to analyze single-cell RNA-seq data utilizing flexible Dirichlet Process mixture models. Genes with differential distributions of expression are classified into several interesting patterns of differences between two conditions. The package also includes functions for simulating data with these patterns from negative binomial distributions.
Maintained by Keegan Korthauer. Last updated 5 months ago.
immunooncologybayesianclusteringrnaseqsinglecellmultiplecomparisonvisualizationdifferentialexpression
33 stars 7.92 score 50 scriptsropenspain
spanishoddata:Get Spanish Origin-Destination Data
Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the 'duckdb' database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.
Maintained by Egor Kotov. Last updated 10 days ago.
cdrdatadata-packagemobile-telephone-datamobilityorigin-destination
35 stars 7.92 score 14 scriptsocbe-uio
BayesMallows:Bayesian Preference Learning with the Mallows Rank Model
An implementation of the Bayesian version of the Mallows rank model (Vitelli et al., Journal of Machine Learning Research, 2018 <https://jmlr.org/papers/v18/15-481.html>; Crispino et al., Annals of Applied Statistics, 2019 <doi:10.1214/18-AOAS1203>; Sorensen et al., R Journal, 2020 <doi:10.32614/RJ-2020-026>; Stein, PhD Thesis, 2023 <https://eprints.lancs.ac.uk/id/eprint/195759>). Both Metropolis-Hastings and sequential Monte Carlo algorithms for estimating the models are available. Cayley, footrule, Hamming, Kendall, Spearman, and Ulam distances are supported in the models. The rank data to be analyzed can be in the form of complete rankings, top-k rankings, partially missing rankings, as well as consistent and inconsistent pairwise preferences. Several functions for plotting and studying the posterior distributions of parameters are provided. The package also provides functions for estimating the partition function (normalizing constant) of the Mallows rank model, both with the importance sampling algorithm of Vitelli et al. and asymptotic approximation with the IPFP algorithm (Mukherjee, Annals of Statistics, 2016 <doi:10.1214/15-AOS1389>).
Maintained by Oystein Sorensen. Last updated 2 months ago.
mallows-modelopenblascppopenmp
21 stars 7.91 score 36 scripts 1 dependentspik-piam
magpie4:MAgPIE outputs R package for MAgPIE version 4.x
Common output routines for extracting results from the MAgPIE framework (versions 4.x).
Maintained by Benjamin Leon Bodirsky. Last updated 3 hours ago.
2 stars 7.90 score 254 scripts 9 dependentsbioc
biocthis:Automate package and project setup for Bioconductor packages
This package expands the usethis package with the goal of helping automate the process of creating R packages for Bioconductor or making them Bioconductor-friendly.
Maintained by Leonardo Collado-Torres. Last updated 10 days ago.
softwarereportwritingactionsbioconductorbiocthisgithubstylerusethis
51 stars 7.90 score 4 scripts 1 dependentspatriciamar
ShinyItemAnalysis:Test and Item Analysis via Shiny
Package including functions and interactive shiny application for the psychometric analysis of educational tests, psychological assessments, health-related and other types of multi-item measurements, or ratings from multiple raters.
Maintained by Patricia Martinkova. Last updated 12 days ago.
assessmentdifferential-item-functioningitem-analysisitem-response-theorypsychometricsshiny
45 stars 7.88 score 105 scripts 3 dependentsbioc
EBSeq:An R package for gene and isoform differential expression analysis of RNA-seq data
Differential Expression analysis at both gene and isoform level using RNA-seq data
Maintained by Xiuyu Ma. Last updated 10 days ago.
immunooncologystatisticalmethoddifferentialexpressionmultiplecomparisonrnaseqsequencingcpp
7.86 score 162 scripts 6 dependentsbioc
COTAN:COexpression Tables ANalysis
Statistical and computational method to analyze the co-expression of gene pairs at single cell level. It provides the foundation for single-cell gene interactome analysis. The basic idea is studying the zero UMI counts' distribution instead of focusing on positive counts; this is done with a generalized contingency tables framework. COTAN can effectively assess the correlated or anti-correlated expression of gene pairs. It provides a numerical index related to the correlation and an approximate p-value for the associated independence test. COTAN can also evaluate whether single genes are differentially expressed, scoring them with a newly defined global differentiation index. Moreover, this approach provides ways to plot and cluster genes according to their co-expression pattern with other genes, effectively helping the study of gene interactions and becoming a new tool to identify cell-identity marker genes.
Maintained by Galfrè Silvia Giulia. Last updated 1 months ago.
systemsbiologytranscriptomicsgeneexpressionsinglecell
16 stars 7.85 score 96 scriptsbioc
biodb:biodb, a library and a development framework for connecting to chemical and biological databases
The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.
Maintained by Pierrick Roger. Last updated 5 months ago.
softwareinfrastructuredataimportkeggbiologycheminformaticschemistrydatabasescpp
11 stars 7.85 score 24 scripts 6 dependentsemilopezcano
SixSigma:Six Sigma Tools for Quality Control and Improvement
Functions and utilities to perform Statistical Analyses in the Six Sigma way. Through the DMAIC cycle (Define, Measure, Analyze, Improve, Control), you can manage several Quality Management studies: Gage R&R, Capability Analysis, Control Charts, Loss Function Analysis, etc. Data frames used in the books "Six Sigma with R" [ISBN 978-1-4614-3652-2] and "Quality Control with R" [ISBN 978-3-319-24046-6], are also included in the package.
Maintained by Emilio L. Cano. Last updated 2 years ago.
quality-controlquality-improvementsix-sigmaspc
15 stars 7.82 score 169 scripts 1 dependentsmatloff
dsld:Data Science Looks at Discrimination
Statistical and graphical tools for detecting and measuring discrimination and bias, be it racial, gender, age or other. Detection and remediation of bias in machine learning algorithms. 'Python' interfaces available.
Maintained by Norm Matloff. Last updated 2 months ago.
12 stars 7.81 score 35 scriptsepiverse-trace
simulist:Simulate Disease Outbreak Line List and Contacts Data
Tools to simulate realistic raw case data for an epidemic in the form of line lists and contacts using a branching process. Simulated outbreaks are parameterised with epidemiological parameters and can have age-structured populations, age-stratified hospitalisation and death risk and time-varying case fatality risk.
Maintained by Joshua W. Lambert. Last updated 5 days ago.
epidemiologyepiverselinelistoutbreaks
8 stars 7.79 score 27 scriptsbioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 10 days ago.
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
33 stars 7.79 score 10 scriptsmazamascience
MazamaCoreUtils:Utility Functions for Production R Code
A suite of utility functions providing functionality commonly needed for production level projects such as logging, error handling, cache management and date-time parsing. Functions for date-time parsing and formatting require that time zones be specified explicitly, avoiding a common source of error when working with environmental time series.
Maintained by Jonathan Callahan. Last updated 4 months ago.
4 stars 7.76 score 119 scripts 5 dependentsropensci
redland:RDF Library Bindings in R
Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at <https://www.w3.org/TR/rdf-primer/>. This package supports RDF by implementing an R interface to the Redland RDF C library, described at <https://librdf.org/docs/api/index.html>. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.
Maintained by Matthew B. Jones. Last updated 1 years ago.
16 stars 7.76 score 98 scripts 13 dependentsbioc
LACE:Longitudinal Analysis of Cancer Evolution (LACE)
LACE is an algorithmic framework that processes single-cell somatic mutation profiles from cancer samples collected at different time points and in distinct experimental settings, to produce longitudinal models of cancer evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a weighted likelihood function computed on multiple time points.
Maintained by Davide Maspero. Last updated 16 days ago.
biomedicalinformaticssinglecellsomaticmutation
15 stars 7.75 score 3 scriptscmu-delphi
epidatr:Client for Delphi's 'Epidata' API
The Delphi 'Epidata' API provides real-time access to epidemiological surveillance data for influenza, 'COVID-19', and other diseases for the USA at various geographical resolutions, both from official government sources such as the Centers for Disease Control and Prevention (CDC) and Google Trends and private partners such as Facebook and Change 'Healthcare'. It is built and maintained by the Carnegie Mellon University Delphi research group. To cite this API: David C. Farrow, Logan C. Brooks, Aaron 'Rumack', Ryan J. 'Tibshirani', 'Roni' 'Rosenfeld' (2015). Delphi 'Epidata' API. <https://github.com/cmu-delphi/delphi-epidata>.
Maintained by David Weber. Last updated 11 days ago.
5 stars 7.71 score 114 scriptsropensci
rdataretriever:R Interface to the Data Retriever
Provides an R interface to the Data Retriever <https://retriever.readthedocs.io/en/latest/> via the Data Retriever's command line interface. The Data Retriever automates the tasks of finding, downloading, and cleaning public datasets, and then stores them in a local database.
Maintained by Henry Senyondo. Last updated 8 months ago.
datadata-sciencedatabasedatasetsscience
46 stars 7.70 score 36 scriptscarpentries
sandpaper:Create and Curate Carpentries Lessons
We provide tools to build a Carpentries-themed lesson repository into an accessible standalone static website. These include local tools and those designed to be used in a continuous integration context so that all the lesson author needs to focus on is writing the content of the actual lesson.
Maintained by Robert Davey. Last updated 2 months ago.
carpentriescarpentries-infrastructurecarpentries-workbenchlesson-templatelessonsmarkdownstatic-site-generator
44 stars 7.68 score 8 scriptstheoreticalecology
sjSDM:Scalable Joint Species Distribution Modeling
A scalable and fast method for estimating joint Species Distribution Models (jSDMs) for big community data, including eDNA data. The package estimates a full (i.e. non-latent) jSDM with different response distributions (including the traditional multivariate probit model). The package supports variation partitioning (VP) / ANOVA on the fitted models to separate the contribution of environmental, spatial, and biotic associations. In addition, the total R-squared can be further partitioned per species and site to reveal the internal metacommunity structure, see Leibold et al., <doi:10.1111/oik.08618>. The internal structure can then be regressed against environmental and spatial distinctiveness, richness, and traits to analyze metacommunity assembly processes. The package includes support for accounting for spatial autocorrelation and the option to fit responses using deep neural networks instead of a standard linear predictor. As described in Pichler & Hartig (2021) <doi:10.1111/2041-210X.13687>, scalability is achieved by using a Monte Carlo approximation of the joint likelihood implemented via 'PyTorch' and 'reticulate', which can be run on CPUs or GPUs.
Maintained by Maximilian Pichler. Last updated 1 months ago.
deep-learninggpu-accelerationmachine-learningspecies-distribution-modellingspecies-interactions
69 stars 7.64 score 70 scriptssciviews
SciViews:'SciViews' - Data Processing and Visualization with the 'SciViews::R' Dialect
The 'SciViews::R' dialect provides a set of functions that streamlines data input, process, analysis and visualization especially, but not exclusively, for beginners or occasional users. It mixes base R and tidyverse, plus another set of CRAN packages for an easy and coherent use of R.
Maintained by Philippe Grosjean. Last updated 7 months ago.
8 stars 7.62 score 116 scripts 1 dependentsropengov
retroharmonize:Ex Post Survey Data Harmonization
Assist in reproducible retrospective (ex-post) harmonization of data, particularly individual level survey data, by providing tools for organizing metadata, standardizing the coding of variables, variable names, and value labels, including missing values, and documenting the data transformations, with the help of comprehensive S3 classes.
Maintained by Daniel Antal. Last updated 2 months ago.
10 stars 7.62 score 59 scriptsuligges
klaR:Classification and Visualization
Miscellaneous functions for classification and visualization, e.g. regularized discriminant analysis, sknn() kernel-density naive Bayes, an interface to 'svmlight' and stepclass() wrapper variable selection for supervised classification, partimat() visualization of classification rules and shardsplot() of cluster results as well as kmodes() clustering for categorical data, corclust() variable clustering, variable extraction from different variable clustering models and weight of evidence preprocessing.
Maintained by Uwe Ligges. Last updated 1 years ago.
5 stars 7.61 score 1.4k scripts 13 dependentsnandp1
nser:Bhavcopy and Live Market Data from National Stock Exchange (NSE) & Bombay Stock Exchange (BSE) India
Download Current & Historical Bhavcopy. Get Live Market data from NSE India of Equities and Derivatives (F&O) segment. Data source <https://www.nseindia.com/>.
Maintained by Nandan Patil. Last updated 5 months ago.
bhavbhavcopybhavcopy-downloaderfinancial-datamarket-datanational-stock-exchangensense-stock-dataoption-pricingoptionchainrseleniumstock-prices
8 stars 7.61 score 76 scriptsbioc
rrvgo:Reduce + Visualize GO
Reduce and visualize lists of Gene Ontology terms by identifying redundancy based on semantic similarity.
Maintained by Sergi Sayols. Last updated 5 months ago.
annotationclusteringgonetworkpathwayssoftware
26 stars 7.60 score 190 scriptsneurogenomics
rworkflows:Test, Document, Containerise, and Deploy R Packages
Reproducibility is essential to the progress of research, yet achieving it remains elusive even in computational fields. Continuous Integration (CI) platforms offer a powerful way to launch automated workflows to check and document code, but often require considerable time, effort, and technical expertise to set up. We therefore developed the rworkflows suite to make robust CI workflows easy and freely accessible to all R package developers. rworkflows consists of 1) a CRAN/Bioconductor-compatible R package template, 2) an R package to quickly implement a standardised workflow, and 3) a centrally maintained GitHub Action.
Maintained by Brian Schilder. Last updated 2 months ago.
softwareworkflowmanagementbioconductorcontainerscontinuous-integrationdockerdockerhubgithub-actionsreproducibilityworkflows
79 stars 7.60 score 6 scriptsbioc
scDesign3:A unified framework of realistic in silico data generation and statistical model inference for single-cell and spatial omics
We present a statistical simulator, scDesign3, to generate realistic single-cell and spatial omics data, including various cell states, experimental designs, and feature modalities, by learning interpretable parameters from real data. Using a unified probabilistic model for single-cell and spatial omics data, scDesign3 infers biologically meaningful parameters; assesses the goodness-of-fit of inferred cell clusters, trajectories, and spatial locations; and generates in silico negative and positive controls for benchmarking computational tools.
Maintained by Dongyuan Song. Last updated 29 days ago.
softwaresinglecellsequencinggeneexpressionspatial
89 stars 7.59 score 25 scriptsbioc
ggsc:Visualizing Single Cell and Spatial Transcriptomics
Useful functions to visualize single cell and spatial data. It supports visualizing 'Seurat', 'SingleCellExperiment' and 'SpatialExperiment' objects through grammar of graphics syntax implemented in 'ggplot2'.
Maintained by Guangchuang Yu. Last updated 5 months ago.
dimensionreductiongeneexpressionsinglecellsoftwarespatialtranscriptomicsvisualizationopenblascppopenmp
47 stars 7.59 score 18 scriptsgesistsa
oolong:Create Validation Tests for Automated Content Analysis
Intended to create standard human-in-the-loop validity tests for typical automated content analysis such as topic modeling and dictionary-based methods. This package offers a standard workflow with functions to prepare, administer and evaluate a human-in-the-loop validity test. This package provides functions for validating topic models using word intrusion, topic intrusion (Chang et al. 2009, <https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models>) and word set intrusion (Ying et al. 2021) <doi:10.1017/pan.2021.33> tests. This package also provides functions for generating gold-standard data which are useful for validating dictionary-based methods. The default settings of all generated tests match those suggested in Chang et al. (2009) and Song et al. (2020) <doi:10.1080/10584609.2020.1723752>.
Maintained by Chung-hong Chan. Last updated 1 months ago.
textanalysistopicmodelingvalidation
55 stars 7.58 score 23 scriptsropensci
tic:Tasks Integrating Continuously: CI-Agnostic Workflow Definitions
Provides a way to describe common build and deployment workflows for R-based projects: packages, websites (e.g. blogdown, pkgdown), or data processing (e.g. research compendia). The recipe is described independent of the continuous integration tool used for processing the workflow (e.g. 'GitHub Actions' or 'Circle CI'). This package has been peer-reviewed by rOpenSci (v0.3.0.9004).
Maintained by Eli Miller. Last updated 2 months ago.
appveyorcontinuous-integrationdeploymentgithubactionstravis-ci
155 stars 7.57 score 16 scriptsbiogenies
tidysq:Tidy Processing and Analysis of Biological Sequences
A tidy approach to analysis of biological sequences. All processing and data-storage functions are heavily optimized to allow the fastest and most efficient data storage.
Maintained by Dominik Rafacz. Last updated 3 months ago.
bioconductorbioinformaticsbiological-sequencesfastas3sequencestibbletidytidyversevctrscpp
40 stars 7.56 score 38 scriptswincowgerdev
OpenSpecy:Analyze, Process, Identify, and Share Raman and (FT)IR Spectra
Raman and (FT)IR spectral analysis tool for plastic particles and other environmental samples (Cowger et al. 2021, <doi:10.1021/acs.analchem.1c00123>). With read_any(), Open Specy provides a single function for reading individual, batch, or map spectral data files like .asp, .csv, .jdx, .spc, .spa, .0, and .zip. process_spec() simplifies processing spectra, including smoothing, baseline correction, range restriction and flattening, intensity conversions, wavenumber alignment, and min-max normalization. Spectra can be identified in batch using an onboard reference library (Cowger et al. 2020, <doi:10.1177/0003702820929064>) using match_spec(). A Shiny app is available via run_app() or online at <https://openanalysis.org/openspecy/>.
Maintained by Win Cowger. Last updated 1 months ago.
29 stars 7.55 score 22 scriptsohdsi
CohortSymmetry:Sequence Symmetry Analysis Using the Observational Medical Outcomes Partnership Common Data Model
Calculating crude sequence ratio, adjusted sequence ratio and confidence intervals using data mapped to the Observational Medical Outcomes Partnership Common Data Model.
Maintained by Xihang Chen. Last updated 7 days ago.
1 stars 7.52 score 73 scriptsbioc
SimBu:Simulate Bulk RNA-seq Datasets from Single-Cell Datasets
SimBu can be used to simulate bulk RNA-seq datasets with known cell type fractions. You can either use your own single-cell study for the simulation or the sfaira database. Different pre-defined simulation scenarios exist, as do options to run custom simulations. Additionally, expression values can be adapted by adding an mRNA bias, which produces more biologically relevant simulations.
Maintained by Alexander Dietrich. Last updated 3 days ago.
15 stars 7.50 score 29 scripts 1 dependentsmyeomans
politeness:Detecting Politeness Features in Text
Detecting markers of politeness in English natural language. This package allows researchers to easily visualize and quantify politeness between groups of documents. This package combines prior research on the linguistic markers of politeness. We thank the Spencer Foundation, the Hewlett Foundation, and Harvard's Institute for Quantitative Social Science for support.
Maintained by Mike Yeomans. Last updated 2 months ago.
25 stars 7.49 score 41 scripts 1 dependentshendersontrent
theft:Tools for Handling Extraction of Features from Time Series
Consolidates and calculates different sets of time-series features from multiple 'R' and 'Python' packages including 'Rcatch22' Henderson, T. (2021) <doi:10.5281/zenodo.5546815>, 'feasts' O'Hara-Wild, M., Hyndman, R., and Wang, E. (2021) <https://CRAN.R-project.org/package=feasts>, 'tsfeatures' Hyndman, R., Kang, Y., Montero-Manso, P., Talagala, T., Wang, E., Yang, Y., and O'Hara-Wild, M. (2020) <https://CRAN.R-project.org/package=tsfeatures>, 'tsfresh' Christ, M., Braun, N., Neuffer, J., and Kempa-Liehr A.W. (2018) <doi:10.1016/j.neucom.2018.03.067>, 'TSFEL' Barandas, M., et al. (2020) <doi:10.1016/j.softx.2020.100456>, and 'Kats' Facebook Infrastructure Data Science (2021) <https://facebookresearch.github.io/Kats/>.
Maintained by Trent Henderson. Last updated 2 months ago.
data-visualisationdata-visualizationdimensionality-reductionmachine-learningtime-series
40 stars 7.48 score 50 scripts 1 dependentssem-in-r
seminr:Building and Estimating Structural Equation Models
A powerful, easy-to-use syntax for specifying and estimating complex Structural Equation Models. Models can be estimated using Partial Least Squares Path Modeling, Covariance-Based Structural Equation Modeling, or covariance-based Confirmatory Factor Analysis. Methods described in Ray, Danks, and Valdez (2021).
Maintained by Nicholas Patrick Danks. Last updated 3 years ago.
common-factorscompositesconstructpls-models
62 stars 7.46 score 284 scriptsrstudio
tfhub:Interface to 'TensorFlow' Hub
'TensorFlow' Hub is a library for the publication, discovery, and consumption of reusable parts of machine learning models. A module is a self-contained piece of a 'TensorFlow' graph, along with its weights and assets, that can be reused across different tasks in a process known as transfer learning. Transfer learning can train a model with a smaller dataset, improve generalization, and speed up training.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
29 stars 7.46 score 73 scripts 1 dependentsbioc
crisprScore:On-Target and Off-Target Scoring Algorithms for CRISPR gRNAs
Provides R wrappers of several on-target and off-target scoring methods for CRISPR guide RNAs (gRNAs). The following nucleases are supported: SpCas9, AsCas12a, enAsCas12a, and RfxCas13d (CasRx). The available on-target cutting efficiency scoring methods are RuleSet1, Azimuth, DeepHF, DeepCpf1, enPAM+GB, and CRISPRscan. Both the CFD and MIT scoring methods are available for off-target specificity prediction. The package also provides a Lindel-derived score to predict the probability of a gRNA to produce indels inducing a frameshift for the Cas9 nuclease. Note that DeepHF, DeepCpf1 and enPAM+GB are not available on Windows machines.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsfunctionalpredictionbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomicsgrnagrna-sequencegrna-sequencesscoring-algorithmsgrnasgrna-design
16 stars 7.44 score 19 scripts 4 dependentsbioc
MOSim:Multi-Omics Simulation (MOSim)
MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.
Maintained by Sonia Tarazona. Last updated 5 months ago.
softwaretimecourseexperimentaldesignrnaseqcpp
9 stars 7.42 score 11 scriptskoheiw
seededlda:Seeded Sequential LDA for Topic Modeling
Seeded Sequential LDA can classify sentences of texts into pre-defined topics with a small number of seed words (Watanabe & Baturo, 2023) <doi:10.1177/08944393231178605>. Implements Seeded LDA (Lu et al., 2010) <doi:10.1109/ICDMW.2011.125> and Sequential LDA (Du et al., 2012) <doi:10.1007/s10115-011-0425-1> with the distributed LDA algorithm (Newman et al., 2009) for parallel computing.
Maintained by Kohei Watanabe. Last updated 2 months ago.
semi-supervised-learningtext-classificationonetbbcpp
75 stars 7.38 score 177 scripts 1 dependentsbioc
cogena:co-expressed gene-set enrichment analysis
cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller-scale but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. In particular, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses of co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.
Maintained by Zhilong Jia. Last updated 5 months ago.
clusteringgenesetenrichmentgeneexpressionvisualizationpathwayskegggomicroarraysequencingsystemsbiologydatarepresentationdataimportbioconductorbioinformatics
12 stars 7.36 score 32 scriptshedgehogqa
hedgehog:Property-Based Testing
Hedgehog will eat all your bugs. 'Hedgehog' is a property-based testing package in the spirit of 'QuickCheck'. With 'Hedgehog', one can test properties of their programs against randomly generated input, providing far superior test coverage compared to unit testing. One of the key benefits of 'Hedgehog' is integrated shrinking of counterexamples, which allows one to quickly find the cause of bugs, given salient examples when incorrect behaviour occurs.
Maintained by Huw Campbell. Last updated 4 years ago.
56 stars 7.33 score 63 scripts 1 dependentsmodeloriented
shapper:Wrapper of Python Library 'shap'
Provides SHAP explanations of machine learning models. In applied machine learning, there is a strong belief that we need to strike a balance between interpretability and accuracy. However, in the field of Interpretable Machine Learning, there are more and more new ideas for explaining black-box models. One of the best-known methods for local explanations is SHapley Additive exPlanations (SHAP), introduced by Lundberg, S., et al., (2016) <arXiv:1705.07874>. The SHAP method is used to calculate the influence of variables on a particular observation. This method is based on Shapley values, a technique used in game theory. The R package 'shapper' is a port of the Python library 'shap'.
Maintained by Szymon Maksymiuk. Last updated 2 years ago.
58 stars 7.31 score 59 scriptsnerler
JointAI:Joint Analysis and Imputation of Incomplete Data
Joint analysis and imputation of incomplete data in the Bayesian framework, using (generalized) linear (mixed) models and extensions thereof, survival models, or joint models for longitudinal and survival data, as described in Erler, Rizopoulos and Lesaffre (2021) <doi:10.18637/jss.v100.i20>. Incomplete covariates, if present, are automatically imputed. The package performs some preprocessing of the data and creates a 'JAGS' model, which will then automatically be passed to 'JAGS' <https://mcmc-jags.sourceforge.io/> with the help of the package 'rjags'.
Maintained by Nicole S. Erler. Last updated 12 months ago.
bayesiangeneralized-linear-modelsglmglmmimputationimputationsjagsjoint-analysislinear-mixed-modelslinear-regression-modelsmcmc-samplemcmc-samplingmissing-datamissing-valuessurvivalcpp
28 stars 7.30 score 59 scripts 1 dependentshoxo-m
githubinstall:A Helpful Way to Install R Packages Hosted on GitHub
Provides a helpful way to install packages hosted on GitHub.
Maintained by Koji Makiyama. Last updated 7 years ago.
49 stars 7.29 score 177 scriptsyihui
Rd2roxygen:Convert Rd to 'Roxygen' Documentation
Functions to convert Rd to 'roxygen' documentation. It can parse an Rd file to a list, create the 'roxygen' documentation and update the original R script (e.g. the one containing the definition of the function) accordingly. This package also provides utilities that can help developers build packages using 'roxygen' more easily. The 'formatR' package can be used to reformat the R code in the examples sections so that the code will be more readable.
Maintained by Yihui Xie. Last updated 12 months ago.
32 stars 7.26 score 82 scripts 1 dependentswwiecek
baggr:Bayesian Aggregate Treatment Effects
Running and comparing meta-analyses of data with hierarchical Bayesian models in Stan, including convenience functions for formatting data, plotting and pooling measures specific to meta-analysis. This implements many models from Meager (2019) <doi:10.1257/app.20170299>.
Maintained by Witold Wiecek. Last updated 7 days ago.
bayesian-statisticsmeta-analysisquantile-regressionstantreatment-effectscpp
49 stars 7.24 score 88 scriptsinbo
checklist:A Thorough and Strict Set of Checks for R Packages and Source Code
An opinionated set of rules for R packages and R source code projects.
Maintained by Thierry Onkelinx. Last updated 1 months ago.
checklistcontinuous-integrationcontinuous-testingquality-assurance
19 stars 7.24 score 21 scripts 2 dependentsinsightsengineering
tern.mmrm:Tables and Graphs for Mixed Models for Repeated Measures (MMRM)
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see for example Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E>. This package provides an interface for fitting MMRM within the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and for tabulating results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023). It builds on 'mmrm' <https://cran.r-project.org/package=mmrm> by Sabanés Bové et al. (2023) for the actual MMRM computations.
Maintained by Joe Zhu. Last updated 6 months ago.
graphslistingsstatistical-engineeringtables
6 stars 7.23 score 8 scripts 1 dependentspik-piam
lucode2:Code Manipulation and Analysis Tools
A collection of tools for manipulating and analyzing code.
Maintained by Jan Philipp Dietrich. Last updated 10 days ago.
7.22 score 364 scripts 8 dependentsusaid-oha-si
glamr:SI Utilities Package
Provides a series of base functions useful to the GH OHA SI team. This includes project setup, pulling from DATIM, and key functions for working with the MSD.
Maintained by Aaron Chafetz. Last updated 6 months ago.
2 stars 7.20 score 1.3k scripts 1 dependentsbioc
CRISPRseek:Design of guide RNAs in CRISPR genome-editing systems
The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.
Maintained by Lihua Julie Zhu. Last updated 21 days ago.
immunooncologygeneregulationsequencematchingcrispr
7.18 score 51 scripts 2 dependents