Showing 200 of total 471 results
r-lib
devtools:Tools to Make Developing R Packages Easier
Collection of package development tools.
Maintained by Jennifer Bryan. Last updated 6 months ago.
2.4k stars 19.55 score 51k scripts 150 dependents
philchalmers
mirt:Multidimensional Item Response Theory
Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.
Maintained by Phil Chalmers. Last updated 4 days ago.
212 stars 14.93 score 2.5k scripts 40 dependents
philchalmers
SimDesign:Structure for Organizing Monte Carlo Simulation Designs
Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.
Maintained by Phil Chalmers. Last updated 3 days ago.
monte-carlo-simulation, simulation, simulation-framework
62 stars 13.41 score 253 scripts 47 dependents
openpharma
mmrm:Mixed Models for Repeated Measures
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.
Maintained by Daniel Sabanes Bove. Last updated 24 days ago.
138 stars 12.15 score 113 scripts 4 dependents
rstudio
shinytest2:Testing for Shiny Applications
Automated unit testing of Shiny applications through a headless 'Chromium' browser.
Maintained by Barret Schloerke. Last updated 5 days ago.
108 stars 12.13 score 704 scripts 1 dependent
kevinushey
sourcetools:Tools for Reading, Tokenizing and Parsing R Code
Tools for Reading, Tokenizing and Parsing R Code.
Maintained by Kevin Ushey. Last updated 2 years ago.
78 stars 11.77 score 32 scripts 1.8k dependents
r-lib
mockery:Mocking Library for R
The two main functionalities of this package are creating mock objects (functions) and selectively intercepting calls to a given function that originate in some other function. It can be used with any testing framework available for R. Mock objects can be injected with either this package's own stub() function or a similar with_mock() facility present in the 'testthat' package.
Maintained by Hadley Wickham. Last updated 1 year ago.
100 stars 11.57 score 504 scripts 5 dependents
tylermorganwall
rayshader:Create Maps and Visualize Data in 2D and 3D
Uses a combination of raytracing and multiple hill shading methods to produce 2D and 3D data visualizations and maps. Includes water detection and layering functions, programmable color palette generation, several built-in textures for hill shading, 2D and 3D plotting options, a built-in path tracer, 'Wavefront' OBJ file export, and the ability to save 3D visualizations to a 3D printable format.
Maintained by Tyler Morgan-Wall. Last updated 2 months ago.
2.1k stars 11.55 score 1.5k scripts 5 dependents
tylermorganwall
rayrender:Build and Raytrace 3D Scenes
Render scenes using pathtracing. Build 3D scenes out of spheres, cubes, planes, disks, triangles, cones, curves, line segments, cylinders, ellipsoids, and 3D models in the 'Wavefront' OBJ file format or the PLY Polygon File Format. Supports several material types, textures, multicore rendering, and tone-mapping. Based on the "Ray Tracing in One Weekend" book series. Peter Shirley (2018) <https://raytracing.github.io>.
Maintained by Tyler Morgan-Wall. Last updated 4 days ago.
631 stars 10.87 score 188 scripts 8 dependents
marce10
warbleR:Streamline Bioacoustic Analysis
Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools to explore and quantify acoustic signal structure. The package allows users to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
animal-acoustic-signals, audio-processing, bioacoustics, spectrogram, streamline-analysis, cpp
56 stars 10.86 score 270 scripts 4 dependents
r-lib
vdiffr:Visual Regression Testing and Graphical Diffing
An extension to the 'testthat' package that makes it easy to add graphical unit tests. It provides a Shiny application to manage the test cases.
Maintained by Lionel Henry. Last updated 5 months ago.
ggplot2, graphics, testthat, libpng, cpp
191 stars 10.84 score 254 scripts 5 dependents
rstudio
pointblank:Data Validation and Organization of Metadata for Local and Remote Tables
Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
Maintained by Richard Iannone. Last updated 5 days ago.
data-assertions, data-checker, data-dictionaries, data-frames, data-inference, data-management, data-profiler, data-quality, data-validation, data-verification, database-tables, easy-to-understand, reporting-tool, schema-validation, testing-tools, yaml-configuration
942 stars 10.73 score 284 scripts
caseyyoungflesh
MCMCvis:Tools to Visualize, Manipulate, and Summarize MCMC Output
Performs key functions for MCMC analysis using minimal code - visualizes, manipulates, and summarizes MCMC output. Functions support simple and straightforward subsetting of model parameters within the calls, and produce presentable and 'publication-ready' output. MCMC output may be derived from Bayesian model output fit with Stan, NIMBLE, JAGS, and other software.
Maintained by Casey Youngflesh. Last updated 4 months ago.
38 stars 10.52 score 1.8k scripts 5 dependents
insightsengineering
teal.modules.clinical:'teal' Modules for Standard Clinical Outputs
Provides user-friendly tools for creating and customizing clinical trial reports. By leveraging the 'teal' framework, this package provides 'teal' modules to easily create an interactive panel that allows for seamless adjustments to data presentation, thereby streamlining the creation of detailed and accurate reports.
Maintained by Dawid Kaledkowski. Last updated 1 month ago.
clinical-trials, modules, nest, outputs, shiny
35 stars 10.21 score 149 scripts
n8thangreen
BCEA:Bayesian Cost Effectiveness Analysis
Produces an economic evaluation of a sample of suitable variables of cost and effectiveness / utility for two or more interventions, e.g. from a Bayesian model in the form of MCMC simulations. This package computes the most cost-effective alternative and produces graphical summaries and probabilistic sensitivity analysis, see Baio et al (2017) <doi:10.1007/978-3-319-55718-2>.
Maintained by Gianluca Baio. Last updated 2 months ago.
3 stars 9.90 score 243 scripts 3 dependents
ndphillips
FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees
Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.
Maintained by Hansjoerg Neth. Last updated 5 months ago.
136 stars 9.53 score 144 scripts
philchalmers
mirtCAT:Computerized Adaptive Testing with Multidimensional Item Response Theory
Provides tools to generate HTML interfaces for adaptive and non-adaptive tests using the shiny package (Chalmers (2016) <doi:10.18637/jss.v071.i05>). Suitable for applying unidimensional and multidimensional computerized adaptive tests (CAT) using item response theory methodology and for creating simple questionnaire forms to collect response data directly in R. Additionally, optimal test designs (e.g., "shadow testing") are supported for tests that contain a large number of item selection constraints. Finally, the package contains tools useful for performing Monte Carlo simulations for studying test item banks.
Maintained by Phil Chalmers. Last updated 5 months ago.
95 stars 9.47 score 62 scripts 3 dependents
nealrichardson
httptest:A Test Environment for HTTP Requests
Testing and documenting code that communicates with remote servers can be painful. Dealing with authentication, server state, and other complications can make testing seem too costly to bother with. But it doesn't need to be that hard. This package enables one to test all of the logic on the R side of the API in your package without requiring access to the remote service. Importantly, it provides three contexts that mock the network connection in different ways, as well as testing functions to assert that HTTP requests were, or were not, made. It also allows one to safely record real API responses to use as test fixtures. The ability to save responses and load them offline also enables one to write vignettes and other dynamic documents that can be distributed without access to a live server.
Maintained by Neal Richardson. Last updated 1 year ago.
81 stars 9.46 score 276 scripts 1 dependent
thinkr-open
fusen:Build a Package from Rmarkdown Files
Use Rmarkdown First method to build your package. Start your package with documentation, functions, examples and tests in the same unique file. Everything can be set from the Rmarkdown template file provided in your project, then inflated as a package. Inflating the template copies the relevant chunks and sections in the appropriate files required for package development.
Maintained by Vincent Guyader. Last updated 2 months ago.
163 stars 9.45 score 35 scripts
nealrichardson
httptest2:Test Helpers for 'httr2'
Testing and documenting code that communicates with remote servers can be painful. This package helps with writing tests for packages that use 'httr2'. It enables testing all of the logic on the R side of the API without requiring access to the remote service, and it also allows recording real API responses to use as test fixtures. The ability to save responses and load them offline also enables writing vignettes and other dynamic documents that can be distributed without access to a live server.
Maintained by Neal Richardson. Last updated 9 months ago.
33 stars 9.37 score 95 scripts 1 dependent
cbielow
PTXQC:Quality Report Generation for MaxQuant and mzTab Results
Generates Proteomics (PTX) quality control (QC) reports for shotgun LC-MS data analyzed with the MaxQuant software suite (from .txt files) or mzTab files (ideally from OpenMS 'QualityControl' tool). Reports are customizable (target thresholds, subsetting) and available in HTML or PDF format. Published in J. Proteome Res., Proteomics Quality Control: Quality Control Software for MaxQuant Results (2015) <doi:10.1021/acs.jproteome.5b00780>.
Maintained by Chris Bielow. Last updated 1 year ago.
drag-and-drop, hacktoberfest, heatmap, match-between-runs, maxquant, metric, mztab, openms, proteomics, quality-control, quality-metrics, report
42 stars 9.35 score 105 scripts 1 dependent
bioc
BatchQC:Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Maintained by Jessica Anderson. Last updated 13 days ago.
batcheffect, graphandnetwork, microarray, normalization, principalcomponent, sequencing, software, visualization, qualitycontrol, rnaseq, preprocessing, differentialexpression, immunooncology
7 stars 9.06 score 54 scripts
rstudio
shinytest:Test Shiny Apps
Please see the shinytest to shinytest2 migration guide at <https://rstudio.github.io/shinytest2/articles/z-migration.html>.
Maintained by Winston Chang. Last updated 10 months ago.
225 stars 9.02 score 352 scripts
bioc
scPipe:Pipeline for single cell multi-omic data pre-processing
A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.
Maintained by Shian Su. Last updated 3 months ago.
immunooncology, software, sequencing, rnaseq, geneexpression, singlecell, visualization, sequencematching, preprocessing, qualitycontrol, genomeannotation, dataimport, curl, bzip2, xz-utils, zlib, cpp
68 stars 9.02 score 84 scripts
r-spatial
link2GI:Linking Geographic Information Systems, Remote Sensing and Other Command Line Tools
Functions and tools for using open GIS and remote sensing command-line interfaces in a reproducible environment.
Maintained by Chris Reudenbach. Last updated 4 months ago.
26 stars 8.99 score 78 scripts 1 dependent
appsilon
rhino:A Framework for Enterprise Shiny Applications
A framework that supports creating and extending enterprise Shiny applications using best practices.
Maintained by Kamil Żyła. Last updated 3 days ago.
305 stars 8.99 score 145 scripts
pharmar
riskmetric:Risk Metrics to Evaluating R Packages
Facilities for assessing R packages against a number of metrics to help quantify their robustness.
Maintained by Eli Miller. Last updated 7 days ago.
166 stars 8.98 score 43 scripts
ajrgodfrey
BrailleR:Improved Access for Blind Users
Blind users do not have access to the graphical output from R without printing the content of graphics windows to an embosser of some kind. This is not as immediate as is required for efficient access to statistical output. The functions here are created so that blind people can make even better use of R. This includes the text descriptions of graphs, convenience functions to replace the functionality offered in many GUI front ends, and experimental functionality for optimising graphical content to prepare it for embossing as tactile images.
Maintained by A. Jonathan R. Godfrey. Last updated 12 months ago.
123 stars 8.90 score 143 scripts
pik-piam
remind2:The REMIND R package (2nd generation)
Contains the REMIND-specific routines for data and model output manipulation.
Maintained by Renato Rodrigues. Last updated 3 days ago.
8.87 score 161 scripts 5 dependents
dvats
mcmcse:Monte Carlo Standard Errors for MCMC
Provides tools for computing Monte Carlo standard errors (MCSE) in Markov chain Monte Carlo (MCMC) settings. MCSE computation for expectation and quantile estimators is supported as well as multivariate estimations. The package also provides functions for computing effective sample size and for plotting Monte Carlo estimates versus sample size.
Maintained by Dootika Vats. Last updated 2 months ago.
effective-sample-size, mcmc, output-a, openblas, cpp
12 stars 8.77 score 314 scripts 17 dependents
insightsengineering
rbmi:Reference Based Multiple Imputation
Implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). In particular, this package supports deterministic conditional mean imputation and jackknifing as described in Wolbers et al. (2022) <doi:10.1002/pst.2234>, Bayesian multiple imputation as described in Carpenter et al. (2013) <doi:10.1080/10543406.2013.834911>, and bootstrapped maximum likelihood imputation as described in von Hippel and Bartlett (2021) <doi:10.1214/20-STS793>.
Maintained by Isaac Gravestock. Last updated 1 month ago.
18 stars 8.76 score 33 scripts 1 dependent
cynkra
fledge:Smoother Change Tracking and Versioning for R Packages
Streamlines the process of updating changelogs (NEWS.md) and versioning R packages developed in git repositories.
Maintained by Kirill Müller. Last updated 3 months ago.
188 stars 8.73 score 10 scripts
bioc
memes:motif matching, comparison, and de novo discovery using the MEME Suite
A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.
Maintained by Spencer Nystrom. Last updated 5 months ago.
dataimport, functionalgenomics, generegulation, motifannotation, motifdiscovery, sequencematching, software
50 stars 8.69 score 117 scripts 1 dependent
bioc
lefser:R implementation of the LEfSE method for microbiome biomarker discovery
lefser is the R implementation of the popular microbiome biomarker discovery tool, LEfSe. It uses the Kruskal-Wallis test, Wilcoxon rank-sum test, and Linear Discriminant Analysis to find biomarkers from two-level classes (and optional sub-classes).
Maintained by Sehyun Oh. Last updated 1 month ago.
software, sequencing, differentialexpression, microbiome, statisticalmethod, classification, bioconductor-package, r01ca230551
56 stars 8.44 score 56 scripts
ramikrispin
coronavirus:The 2019 Novel Coronavirus COVID-19 (2019-nCoV) Dataset
Provides a daily summary of the Coronavirus (COVID-19) cases by state/province. Data source: Johns Hopkins University Center for Systems Science and Engineering (JHU CCSE) Coronavirus <https://systems.jhu.edu/research/public-health/ncov/>.
Maintained by Rami Krispin. Last updated 2 years ago.
covid-19, covid19, covid19-data, dataset
499 stars 8.25 score 716 scripts
mi2-warsaw
FSelectorRcpp:'Rcpp' Implementation of 'FSelector' Entropy-Based Feature Selection Algorithms with a Sparse Matrix Support
'Rcpp' (free of 'Java'/'Weka') implementation of 'FSelector' entropy-based feature selection algorithms based on an MDL discretization (Fayyad U. M., Irani K. B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference on Uncertainty in Artificial Intelligence (IJCAI93), pages 1022-1029, Chambery, France, 1993.) <https://www.ijcai.org/Proceedings/93-2/Papers/022.pdf> with sparse matrix support.
Maintained by Zygmunt Zawadzki. Last updated 6 months ago.
entropy, feature-selection, rcpp, sparse-matrix, cpp
35 stars 8.22 score 78 scripts 1 dependent
r-dbi
DBItest:Testing DBI Backends
A helper that tests DBI back ends for conformity to the interface.
Maintained by Kirill Müller. Last updated 14 days ago.
24 stars 8.21 score 11 scripts
mrc-ide
malariasimulation:An individual based model for malaria
Specifies the latest and greatest malaria model.
Maintained by Giovanni Charles. Last updated 1 month ago.
17 stars 8.19 score 146 scripts
r-hyperspec
hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)
Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.
Maintained by Claudia Beleites. Last updated 10 months ago.
data-wrangling, hyperspectral, imaging, infrared, nmr, raman, spectroscopy, uv-vis, xrf
16 stars 8.10 score 233 scripts 2 dependents
ramiromagno
gwasrapidd:'REST' 'API' Client for the 'NHGRI'-'EBI' 'GWAS' Catalog
'GWAS' R 'API' Data Download. This package provides easy access to the 'NHGRI'-'EBI' 'GWAS' Catalog data by accessing the 'REST' 'API' <https://www.ebi.ac.uk/gwas/rest/docs/api/>.
Maintained by Ramiro Magno. Last updated 1 year ago.
thirdpartyclient, biomedicalinformatics, genomewideassociation, snp, association-studies, gwas-catalog, human, rest-client, trait, trait-ontology
95 stars 8.10 score 49 scripts 1 dependent
bioc
FLAMES:FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data
Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. Flames provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.
Maintained by Changqing Wang. Last updated 11 hours ago.
rnaseq, singlecell, transcriptomics, dataimport, differentialsplicing, alternativesplicing, geneexpression, longread, zlib, curl, bzip2, xz-utils, cpp
33 stars 8.04 score 12 scripts
mazamascience
MazamaSpatialUtils:Spatial Data Download and Utility Functions
A suite of conversion functions to create internally standardized spatial polygons data frames. Utility functions use these data sets to return values such as country, state, time zone, watershed, etc. associated with a set of longitude/latitude pairs. (They also make cool maps.)
Maintained by Jonathan Callahan. Last updated 5 months ago.
5 stars 8.01 score 282 scripts 2 dependents
bioc
scDD:Mixture modeling of single-cell RNA-seq data to identify genes with differential distributions
This package implements a method to analyze single-cell RNA-seq data utilizing flexible Dirichlet Process mixture models. Genes with differential distributions of expression are classified into several interesting patterns of differences between two conditions. The package also includes functions for simulating data with these patterns from negative binomial distributions.
Maintained by Keegan Korthauer. Last updated 5 months ago.
immunooncology, bayesian, clustering, rnaseq, singlecell, multiplecomparison, visualization, differentialexpression
33 stars 7.92 score 50 scripts
ocbe-uio
BayesMallows:Bayesian Preference Learning with the Mallows Rank Model
An implementation of the Bayesian version of the Mallows rank model (Vitelli et al., Journal of Machine Learning Research, 2018 <https://jmlr.org/papers/v18/15-481.html>; Crispino et al., Annals of Applied Statistics, 2019 <doi:10.1214/18-AOAS1203>; Sorensen et al., R Journal, 2020 <doi:10.32614/RJ-2020-026>; Stein, PhD Thesis, 2023 <https://eprints.lancs.ac.uk/id/eprint/195759>). Both Metropolis-Hastings and sequential Monte Carlo algorithms for estimating the models are available. Cayley, footrule, Hamming, Kendall, Spearman, and Ulam distances are supported in the models. The rank data to be analyzed can be in the form of complete rankings, top-k rankings, partially missing rankings, as well as consistent and inconsistent pairwise preferences. Several functions for plotting and studying the posterior distributions of parameters are provided. The package also provides functions for estimating the partition function (normalizing constant) of the Mallows rank model, both with the importance sampling algorithm of Vitelli et al. and asymptotic approximation with the IPFP algorithm (Mukherjee, Annals of Statistics, 2016 <doi:10.1214/15-AOS1389>).
Maintained by Oystein Sorensen. Last updated 2 months ago.
mallows-model, openblas, cpp, openmp
21 stars 7.91 score 36 scripts 1 dependent
patriciamar
ShinyItemAnalysis:Test and Item Analysis via Shiny
Package including functions and interactive shiny application for the psychometric analysis of educational tests, psychological assessments, health-related and other types of multi-item measurements, or ratings from multiple raters.
Maintained by Patricia Martinkova. Last updated 12 days ago.
assessment, differential-item-functioning, item-analysis, item-response-theory, psychometrics, shiny
45 stars 7.88 score 105 scripts 3 dependents
bioc
EBSeq:An R package for gene and isoform differential expression analysis of RNA-seq data
Differential Expression analysis at both gene and isoform level using RNA-seq data
Maintained by Xiuyu Ma. Last updated 10 days ago.
immunooncology, statisticalmethod, differentialexpression, multiplecomparison, rnaseq, sequencing, cpp
7.86 score 162 scripts 6 dependents
bioc
biodb:biodb, a library and a development framework for connecting to chemical and biological databases
The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.
Maintained by Pierrick Roger. Last updated 5 months ago.
software, infrastructure, dataimport, kegg, biology, cheminformatics, chemistry, databases, cpp
11 stars 7.85 score 24 scripts 6 dependents
emilopezcano
SixSigma:Six Sigma Tools for Quality Control and Improvement
Functions and utilities to perform Statistical Analyses in the Six Sigma way. Through the DMAIC cycle (Define, Measure, Analyze, Improve, Control), you can manage several Quality Management studies: Gage R&R, Capability Analysis, Control Charts, Loss Function Analysis, etc. Data frames used in the books "Six Sigma with R" [ISBN 978-1-4614-3652-2] and "Quality Control with R" [ISBN 978-3-319-24046-6], are also included in the package.
Maintained by Emilio L. Cano. Last updated 2 years ago.
quality-control, quality-improvement, six-sigma, spc
15 stars 7.82 score 169 scripts 1 dependent
matloff
dsld:Data Science Looks at Discrimination
Statistical and graphical tools for detecting and measuring discrimination and bias, be it racial, gender, age or other. Detection and remediation of bias in machine learning algorithms. 'Python' interfaces available.
Maintained by Norm Matloff. Last updated 2 months ago.
12 stars 7.81 score 35 scripts
epiverse-trace
simulist:Simulate Disease Outbreak Line List and Contacts Data
Tools to simulate realistic raw case data for an epidemic in the form of line lists and contacts using a branching process. Simulated outbreaks are parameterised with epidemiological parameters and can have age-structured populations, age-stratified hospitalisation and death risk and time-varying case fatality risk.
Maintained by Joshua W. Lambert. Last updated 5 days ago.
epidemiology, epiverse, linelist, outbreaks
8 stars 7.79 score 27 scripts
mazamascience
MazamaCoreUtils:Utility Functions for Production R Code
A suite of utility functions providing functionality commonly needed for production level projects such as logging, error handling, cache management and date-time parsing. Functions for date-time parsing and formatting require that time zones be specified explicitly, avoiding a common source of error when working with environmental time series.
Maintained by Jonathan Callahan. Last updated 4 months ago.
4 stars 7.76 score 119 scripts 5 dependents
gesistsa
oolong:Create Validation Tests for Automated Content Analysis
Intended to create standard human-in-the-loop validity tests for typical automated content analysis such as topic modeling and dictionary-based methods. This package offers a standard workflow with functions to prepare, administer and evaluate a human-in-the-loop validity test. This package provides functions for validating topic models using word intrusion, topic intrusion (Chang et al. 2009, <https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models>) and word set intrusion (Ying et al. 2021) <doi:10.1017/pan.2021.33> tests. This package also provides functions for generating gold-standard data which are useful for validating dictionary-based methods. The default settings of all generated tests match those suggested in Chang et al. (2009) and Song et al. (2020) <doi:10.1080/10584609.2020.1723752>.
Maintained by Chung-hong Chan. Last updated 1 month ago.
textanalysis, topicmodeling, validation
55 stars 7.58 score 23 scripts
biogenies
tidysq:Tidy Processing and Analysis of Biological Sequences
A tidy approach to analysis of biological sequences. All processing and data-storage functions are heavily optimized to allow the fastest and most efficient data storage.
Maintained by Dominik Rafacz. Last updated 3 months ago.
bioconductor, bioinformatics, biological-sequences, fasta, s3, sequences, tibble, tidy, tidyverse, vctrs, cpp
40 stars 7.56 score 38 scripts
wincowgerdev
OpenSpecy:Analyze, Process, Identify, and Share Raman and (FT)IR Spectra
Raman and (FT)IR spectral analysis tool for plastic particles and other environmental samples (Cowger et al. 2021, <doi:10.1021/acs.analchem.1c00123>). With read_any(), Open Specy provides a single function for reading individual, batch, or map spectral data files like .asp, .csv, .jdx, .spc, .spa, .0, and .zip. process_spec() simplifies processing spectra, including smoothing, baseline correction, range restriction and flattening, intensity conversions, wavenumber alignment, and min-max normalization. Spectra can be identified in batch using an onboard reference library (Cowger et al. 2020, <doi:10.1177/0003702820929064>) using match_spec(). A Shiny app is available via run_app() or online at <https://openanalysis.org/openspecy/>.
Maintained by Win Cowger. Last updated 1 month ago.
29 stars 7.55 score 22 scripts
sem-in-r
seminr:Building and Estimating Structural Equation Models
A powerful, easy-to-use syntax for specifying and estimating complex Structural Equation Models. Models can be estimated using Partial Least Squares Path Modeling, Covariance-Based Structural Equation Modeling, or covariance-based Confirmatory Factor Analysis. Methods described in Ray, Danks, and Valdez (2021).
Maintained by Nicholas Patrick Danks. Last updated 3 years ago.
common-factorscompositesconstructpls-models
62 stars 7.46 score 284 scriptskoheiw
seededlda:Seeded Sequential LDA for Topic Modeling
Seeded Sequential LDA can classify sentences of texts into pre-defined topics with a small number of seed words (Watanabe & Baturo, 2023) <doi:10.1177/08944393231178605>. Implements Seeded LDA (Lu et al., 2010) <doi:10.1109/ICDMW.2011.125> and Sequential LDA (Du et al., 2012) <doi:10.1007/s10115-011-0425-1> with the distributed LDA algorithm (Newman et al., 2009) for parallel computing.
Maintained by Kohei Watanabe. Last updated 2 months ago.
semi-supervised-learningtext-classificationonetbbcpp
75 stars 7.38 score 177 scripts 1 dependentsbioc
cogena:co-expressed gene-set enrichment analysis
cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller-scale but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. Particularly, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses for co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.
Maintained by Zhilong Jia. Last updated 5 months ago.
clusteringgenesetenrichmentgeneexpressionvisualizationpathwayskegggomicroarraysequencingsystemsbiologydatarepresentationdataimportbioconductorbioinformatics
12 stars 7.36 score 32 scriptshedgehogqa
hedgehog:Property-Based Testing
Hedgehog will eat all your bugs. 'Hedgehog' is a property-based testing package in the spirit of 'QuickCheck'. With 'Hedgehog', one can test properties of their programs against randomly generated input, providing far superior test coverage compared to unit testing. One of the key benefits of 'Hedgehog' is integrated shrinking of counterexamples, which allows one to quickly find the cause of bugs, given salient examples when incorrect behaviour occurs.
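To make the idea of testing against randomly generated input and shrinking counterexamples concrete, here is a minimal conceptual sketch in Python. It does not use the 'Hedgehog' API: the helper names (`random_int_list`, `shrink`, `find_counterexample`) are hypothetical, and the greedy element-dropping shrinker is a simplification rather than Hedgehog's integrated shrinking.

```python
import random


def random_int_list(rng, max_len=20, lo=-100, hi=100):
    """Generate a random list of integers (hypothetical generator)."""
    return [rng.randint(lo, hi) for _ in range(rng.randint(0, max_len))]


def shrink(xs, prop):
    """Greedily drop single elements while the property still fails."""
    shrunk = True
    while shrunk:
        shrunk = False
        for i in range(len(xs)):
            candidate = xs[:i] + xs[i + 1:]
            if not prop(candidate):
                xs = candidate
                shrunk = True
                break
    return xs


def find_counterexample(prop, trials=200, seed=0):
    """Return a shrunk failing input, or None if every trial passes."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = random_int_list(rng)
        if not prop(xs):
            return shrink(xs, prop)
    return None


def reverse_twice_is_identity(xs):
    """A property that holds for every list."""
    return list(reversed(list(reversed(xs)))) == xs


def never_longer_than_two(xs):
    """A deliberately false property, used to demonstrate shrinking."""
    return len(xs) <= 2


passing = find_counterexample(reverse_twice_is_identity)
failing = find_counterexample(never_longer_than_two)
print(passing, failing)
```

The true property yields no counterexample, while the false one is reduced to a three-element list, the smallest input that still violates it. Small counterexamples like this are what make the cause of a bug quick to find.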
Maintained by Huw Campbell. Last updated 4 years ago.
56 stars 7.33 score 63 scripts 1 dependentsnerler
JointAI:Joint Analysis and Imputation of Incomplete Data
Joint analysis and imputation of incomplete data in the Bayesian framework, using (generalized) linear (mixed) models and extensions thereof, survival models, or joint models for longitudinal and survival data, as described in Erler, Rizopoulos and Lesaffre (2021) <doi:10.18637/jss.v100.i20>. Incomplete covariates, if present, are automatically imputed. The package performs some preprocessing of the data and creates a 'JAGS' model, which will then automatically be passed to 'JAGS' <https://mcmc-jags.sourceforge.io/> with the help of the package 'rjags'.
Maintained by Nicole S. Erler. Last updated 12 months ago.
bayesiangeneralized-linear-modelsglmglmmimputationimputationsjagsjoint-analysislinear-mixed-modelslinear-regression-modelsmcmc-samplemcmc-samplingmissing-datamissing-valuessurvivalcpp
28 stars 7.30 score 59 scripts 1 dependentshoxo-m
githubinstall:A Helpful Way to Install R Packages Hosted on GitHub
Provides a helpful way to install packages hosted on GitHub.
Maintained by Koji Makiyama. Last updated 7 years ago.
49 stars 7.29 score 177 scriptswwiecek
baggr:Bayesian Aggregate Treatment Effects
Running and comparing meta-analyses of data with hierarchical Bayesian models in Stan, including convenience functions for formatting data, plotting and pooling measures specific to meta-analysis. This implements many models from Meager (2019) <doi:10.1257/app.20170299>.
Maintained by Witold Wiecek. Last updated 7 days ago.
bayesian-statisticsmeta-analysisquantile-regressionstantreatment-effectscpp
49 stars 7.24 score 88 scriptsinbo
checklist:A Thorough and Strict Set of Checks for R Packages and Source Code
An opinionated set of rules for R packages and R source code projects.
Maintained by Thierry Onkelinx. Last updated 1 month ago.
checklistcontinuous-integrationcontinuous-testingquality-assurance
19 stars 7.24 score 21 scripts 2 dependentsinsightsengineering
tern.mmrm:Tables and Graphs for Mixed Models for Repeated Measures (MMRM)
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see for example Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E>. This package provides an interface for fitting MMRM within the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and for tabulating results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023). It builds on 'mmrm' <https://cran.r-project.org/package=mmrm> by Sabanés Bové et al. (2023) for the actual MMRM computations.
Maintained by Joe Zhu. Last updated 6 months ago.
graphslistingsstatistical-engineeringtables
6 stars 7.23 score 8 scripts 1 dependentspik-piam
lucode2:Code Manipulation and Analysis Tools
A collection of tools for manipulating and analyzing code.
Maintained by Jan Philipp Dietrich. Last updated 10 days ago.
7.22 score 364 scripts 8 dependentsluckinet
tabshiftr:Reshape Disorganised Messy Data
Helps the user to build and register schema descriptions of disorganised (messy) tables. Disorganised tables are tables that are not in a topologically coherent form, where packages such as 'tidyr' could be used for reshaping. The schema description documents the arrangement of input tables and is used to reshape them into a standardised (tidy) output format.
Maintained by Steffen Ehrmann. Last updated 1 month ago.
data-managementdata-reshapingschemas
6 stars 7.13 score 62 scripts 1 dependentsmelissagwolf
dynamic:DFI Cutoffs for Latent Variable Models
Returns dynamic fit index (DFI) cutoffs for latent variable models that are tailored to the user's model statement, model type, and sample size. This is the counterpart of the Shiny Application, <https://dynamicfit.app>.
Maintained by Melissa G. Wolf. Last updated 3 months ago.
16 stars 7.13 score 139 scriptsloelschlaeger
fHMM:Fitting Hidden Markov Models to Financial Data
Fitting (hierarchical) hidden Markov models to financial data via maximum likelihood estimation. See Oelschläger, L. and Adam, T. "Detecting Bearish and Bullish Markets in Financial Time Series Using Hierarchical Hidden Markov Models" (2021, Statistical Modelling) <doi:10.1177/1471082X211034048> for a reference on the method. A user guide is provided by the accompanying software paper "fHMM: Hidden Markov Models for Financial Time Series in R", Oelschläger, L., Adam, T., and Michels, R. (2024, Journal of Statistical Software) <doi:10.18637/jss.v109.i09>.
Maintained by Lennart Oelschläger. Last updated 8 days ago.
financehidden-markov-modelscppopenmp
17 stars 7.04 score 5 scriptsr-lib
roxygen2md:'Roxygen' to 'Markdown'
Converts elements of 'roxygen' documentation to 'markdown'.
Maintained by Kirill Müller. Last updated 4 months ago.
68 stars 7.00 score 11 scripts 2 dependentsdoccstat
fastcpd:Fast Change Point Detection via Sequential Gradient Descent
Implements a fast change point detection algorithm based on the paper "Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis" by Xianyang Zhang, Trisha Dawn <https://proceedings.mlr.press/v206/zhang23b.html>. The algorithm is based on dynamic programming with pruning and sequential gradient descent. It is able to detect change points an order of magnitude faster than the vanilla Pruned Exact Linear Time (PELT). The package includes examples of linear regression, logistic regression, Poisson regression, and penalized linear regression data, and many more examples with custom cost functions in case users want to supply their own cost function.
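For orientation, the dynamic-programming-with-pruning baseline mentioned above can be sketched as follows. This is a plain PELT for shifts in the mean with a squared-error segment cost, written in Python. It is not the sequential gradient descent method from the paper and not the package's interface; the function name `pelt_mean` and the penalty value are assumptions made for illustration.

```python
def pelt_mean(x, penalty):
    """Return change point indices for mean shifts in x (illustrative PELT)."""
    n = len(x)
    # Prefix sums give each segment cost in constant time.
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(a, b):
        """Sum of squared deviations from the mean of x[a:b], with b > a."""
        seg_sum = s1[b] - s1[a]
        return (s2[b] - s2[a]) - seg_sum * seg_sum / (b - a)

    best = [0.0] * (n + 1)   # best[t]: optimal penalized cost of x[:t]
    best[0] = -penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment of x[:t]
    candidates = [0]
    for t in range(1, n + 1):
        vals = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(vals)
        # Pruning: drop s once best[s] + cost(s, t) exceeds best[t],
        # because such an s can never be optimal for any later t.
        candidates = [s for v, s in vals if v - penalty <= best[t]]
        candidates.append(t)

    # Backtrack through the stored segment starts.
    change_points = []
    t = n
    while last[t] > 0:
        t = last[t]
        change_points.append(t)
    return sorted(change_points)


signal = [0.0] * 20 + [10.0] * 20
change_points = pelt_mean(signal, penalty=5.0)
print(change_points)
```

The pruning step is what keeps the candidate list short when change points are frequent. The paper's contribution, per the description, is to combine this scheme with sequential gradient descent for the cost evaluation, which this sketch does not attempt.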
Maintained by Xingchi Li. Last updated 12 hours ago.
change-point-detectioncppcustom-functiongradient-descentlassolinear-regressionlogistic-regressionofflinepeltpenalized-regressionpoisson-regressionquasi-newtonstatisticstime-serieswarm-startfortranopenblascppopenmp
21 stars 6.98 score 7 scriptsbioc
CoGAPS:Coordinated Gene Activity in Pattern Sets
Coordinated Gene Activity in Pattern Sets (CoGAPS) implements a Bayesian MCMC matrix factorization algorithm, GAPS, and links it to gene set statistic methods to infer biological process activity. It can be used to perform sparse matrix factorization on any data, and when this data represents biomolecules, to do gene set analysis.
Maintained by Elana J. Fertig. Last updated 19 days ago.
geneexpressiontranscriptiongenesetenrichmentdifferentialexpressionbayesianclusteringtimecoursernaseqmicroarraymultiplecomparisondimensionreductionimmunooncologycpp
6.97 score 104 scriptsthinkr-open
thinkr:Tools for Cleaning Up Messy Files
Some tools for cleaning up messy 'Excel' files to be suitable for R. People who have been working with 'Excel' for years have built more or less complicated sheets with names, characters, and formats that are not homogeneous. To be able to use them in R nowadays, we built a set of functions that avoid the majority of import problems and preserve as much of the data as possible.
Maintained by Vincent Guyader. Last updated 3 years ago.
hacktoberfestthinkr-not-maintained
29 stars 6.96 score 45 scripts
patrick:Parameterized Unit Testing
This is an extension of the 'testthat' package that lets you add parameters to your unit tests. Parameterized unit tests are often easier to read and more reliable, since they follow the DNRY (do not repeat yourself) rule.
Maintained by Michael Quinn. Last updated 23 days ago.
139 stars 6.92 score 19 scriptslcrawlab
mvMAPIT:Multivariate Genome Wide Marginal Epistasis Test
Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this package, we present the 'multivariate MArginal ePIstasis Test' ('mvMAPIT') – a multi-outcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact – thus, potentially alleviating much of the statistical and computational burden associated with conventional explicit search based methods. Our proposed 'mvMAPIT' builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate 'mvMAPIT' as a multivariate linear mixed model and develop a multi-trait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. Crawford et al. (2017) <doi:10.1371/journal.pgen.1006869>. Stamp et al. (2023) <doi:10.1093/g3journal/jkad118>.
Maintained by Julian Stamp. Last updated 5 months ago.
cppepistasisepistasis-analysisgwasgwas-toolslinear-mixed-modelsmapitmvmapitvariance-componentsopenblascppopenmp
11 stars 6.90 score 17 scripts 1 dependentspik-piam
edgeTransport:Prepare EDGE Transport Data for the REMIND model
EDGE-T is a fork of the GCAM transport module https://jgcri.github.io/gcam-doc/energy.html#transportation with a high level of detail in its representation of technological and modal options. It is a partial equilibrium model with a nested multinomial logit structure and relies on the modified logit formulation. Most of the sources are not publicly available. PIK-internal users can find the sources in the distributed file system in the folder `/p/projects/rd3mod/inputdata/sources/EDGE-Transport-Standalone`.
Maintained by Johanna Hoppe. Last updated 2 days ago.
5 stars 6.84 score 16 scripts 2 dependentskozodoi
fairness:Algorithmic Fairness Metrics
Offers calculation, visualization and comparison of algorithmic fairness metrics. Fair machine learning is an emerging topic with the overarching aim to critically assess whether ML algorithms reinforce existing social biases. Unfair algorithms can propagate such biases and produce predictions with a disparate impact on various sensitive groups of individuals (defined by sex, gender, ethnicity, religion, income, socioeconomic status, physical or mental disabilities). Fair algorithms possess the underlying foundation that these groups should be treated similarly or have similar prediction outcomes. The fairness R package offers the calculation and comparisons of commonly and less commonly used fairness metrics in population subgroups. These methods are described by Calders and Verwer (2010) <doi:10.1007/s10618-010-0190-x>, Chouldechova (2017) <doi:10.1089/big.2016.0047>, Feldman et al. (2015) <doi:10.1145/2783258.2783311>, Friedler et al. (2018) <doi:10.1145/3287560.3287589> and Zafar et al. (2017) <doi:10.1145/3038912.3052660>. The package also offers convenient visualizations to help understand fairness metrics.
Maintained by Nikita Kozodoi. Last updated 2 years ago.
algorithmic-discriminationalgorithmic-fairnessdiscriminationdisparate-impactfairnessfairness-aifairness-mlmachine-learning
32 stars 6.82 score 69 scripts 1 dependentsropensci
ohun:Optimizing Acoustic Signal Detection
Facilitates the automatic detection of acoustic signals, providing functions to diagnose and optimize the performance of detection routines. Detections from other software can also be explored and optimized. This package has been peer-reviewed by rOpenSci. Araya-Salas et al. (2022) <doi:10.1101/2022.12.13.520253>.
Maintained by Marcelo Araya-Salas. Last updated 5 months ago.
audio-processingbioacousticssound-event-detectionspectrogramstreamline-analysis
14 stars 6.78 score 24 scripts 1 dependentsthinkr-open
checkhelper:Deal with Check Outputs
Deal with packages 'check' outputs and reduce the risk of rejection by 'CRAN' by following policies.
Maintained by Sebastien Rochette. Last updated 1 year ago.
34 stars 6.74 score 18 scriptsfrbcesab
rcompendium:Create a Package or Research Compendium Structure
Makes it easier to create an R package or research compendium (i.e. a predefined files/folders structure) so that users can focus on the code/analysis instead of wasting time organizing files. A full ready-to-work structure is set up with some additional features: version control, remote repository creation, CI/CD configuration (check package integrity under several OS, test code with 'testthat', and build and deploy website using 'pkgdown'). This package heavily relies on the R packages 'devtools' and 'usethis' and follows recommendations made by Wickham H. (2015) <ISBN:9781491910597> and Marwick B. et al. (2018) <doi:10.7287/peerj.preprints.3192v2>.
Maintained by Nicolas Casajus. Last updated 2 months ago.
reproducible-researchresearch-compendium
40 stars 6.72 score 22 scriptsimbi-heidelberg
DescrTab2:Publication Quality Descriptive Statistics Tables
Provides functions to create descriptive statistics tables for continuous and categorical variables. By default, summary statistics such as mean, standard deviation, quantiles, minimum and maximum for continuous variables and relative and absolute frequencies for categorical variables are calculated. 'DescrTab2' features a sophisticated algorithm to choose appropriate test statistics for your data and provides p-values. On top of this, confidence intervals for group differences of appropriate summary measures are automatically produced for two-group comparisons. Tables generated by 'DescrTab2' can be integrated in a variety of document formats, including .html, .tex and .docx documents. 'DescrTab2' also allows printing tables to console and saving table objects for later use.
Maintained by Jan Meis. Last updated 1 year ago.
categorical-variablescontinuous-variabledescriptive-statisticsp-valuesstatistical-testsstatistics
9 stars 6.71 score 19 scripts 1 dependentsbioc
syntenet:Inference And Analysis Of Synteny Networks
syntenet can be used to infer synteny networks from whole-genome protein sequences and analyze them. Anchor pairs are detected with the MCScanX algorithm, which was ported to this package with the Rcpp framework for R and C++ integration. Anchor pairs from synteny analyses are treated as an undirected unweighted graph (i.e., a synteny network), and users can perform: i. network clustering; ii. phylogenomic profiling (by identifying which species contain which clusters) and; iii. microsynteny-based phylogeny reconstruction with maximum likelihood.
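The graph view described above can be illustrated with a small Python sketch. It is not syntenet's API. The gene identifiers, the `species_of` naming convention, and the use of connected components as a stand-in for network clustering are all assumptions made for illustration; the description does not say which clustering algorithm the package uses.

```python
from collections import defaultdict


def build_network(anchor_pairs):
    """Treat anchor pairs as edges of an undirected, unweighted graph."""
    adjacency = defaultdict(set)
    for a, b in anchor_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Group genes into clusters (connected components as a simple stand-in)."""
    seen = set()
    clusters = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


def species_of(gene):
    """Hypothetical convention: the species prefix precedes the underscore."""
    return gene.split("_")[0]


def phylogenomic_profile(clusters):
    """For each cluster, list the species that contribute at least one gene."""
    return [sorted({species_of(g) for g in cluster}) for cluster in clusters]


# Made-up anchor pairs for three hypothetical species.
anchor_pairs = [
    ("Ath_g1", "Bol_g7"),
    ("Bol_g7", "Osa_g3"),
    ("Ath_g2", "Bol_g9"),
]
network = build_network(anchor_pairs)
clusters = connected_components(network)
profiles = phylogenomic_profile(clusters)
print(profiles)
```

Each profile records which species are present in a cluster, which is the kind of presence/absence information that phylogenomic profiling builds on.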
Maintained by Fabrício Almeida-Silva. Last updated 3 months ago.
softwarenetworkinferencefunctionalgenomicscomparativegenomicsphylogeneticssystemsbiologygraphandnetworkwholegenomenetworkcomparative-genomicsevolutionary-genomicsnetwork-sciencephylogenomicssyntenysynteny-networkcpp
28 stars 6.70 score 12 scripts 1 dependentscorrelaid
newsanchor:Client for the News API
Interface to gather news from the 'News API', based on a multilevel query <https://newsapi.org/>. A personal API key is required.
Maintained by Yannik Buhl. Last updated 5 years ago.
36 stars 6.70 score 40 scriptsbioc
megadepth:megadepth: BigWig and BAM related utilities
This package provides an R interface to Megadepth by Christopher Wilks available at https://github.com/ChristopherWilks/megadepth. It is particularly useful for computing the coverage of a set of genomic regions across bigWig or BAM files. With this package, you can build base-pair coverage matrices for regions or annotations of your choice from BigWig files. Megadepth was used to create the raw files provided by https://bioconductor.org/packages/recount3.
Maintained by David Zhang. Last updated 4 months ago.
softwarecoveragedataimporttranscriptomicsrnaseqpreprocessingbambigwigdasptermegadepthrecount2recount3
12 stars 6.69 score 7 scripts 3 dependentsropensci
baRulho:Quantifying (Animal) Sound Degradation
Intended to facilitate acoustic analysis of (animal) sound propagation experiments, which typically aim to quantify changes in signal structure when transmitted in a given habitat by broadcasting and re-recording animal sounds at increasing distances. The package offers a workflow with functions to prepare the data set for analysis as well as to calculate and visualize several degradation metrics, including blur ratio, signal-to-noise ratio, excess attenuation and envelope correlation, among others (Dabelsteen et al. 1993 <doi:10.1121/1.406682>).
Maintained by Marcelo Araya-Salas. Last updated 5 days ago.
acoustic-signalsanimalbehaviorbioacoustics
6 stars 6.66 score 18 scriptssimnph
SimNPH:Simulate Non-Proportional Hazards
A toolkit for simulation studies concerning time-to-event endpoints with non-proportional hazards. 'SimNPH' encompasses functions for simulating time-to-event data in various scenarios, simulating different trial designs like fixed-followup, event-driven, and group sequential designs. The package provides functions to calculate the true values of common summary statistics for the implemented scenarios and offers common analysis methods for time-to-event data. Helper functions for running simulations with the 'SimDesign' package and for aggregating and presenting the results are also included. Results of the conducted simulation study are available in the paper: "A Comparison of Statistical Methods for Time-To-Event Analyses in Randomized Controlled Trials Under Non-Proportional Hazards", Klinglmüller et al. (2025) <doi:10.1002/sim.70019>.
Maintained by Tobias Fellinger. Last updated 27 days ago.
clinical-trial-simulationsnon-proportional-hazardsstatistical-simulationstatisticssurvival-analysis
6 stars 6.63 score 43 scriptsmazamascience
AirMonitor:Air Quality Data Analysis
Utilities for working with hourly air quality monitoring data with a focus on small particulates (PM2.5). A compact data model is structured as a list with two dataframes. A 'meta' dataframe contains spatial and measuring device metadata associated with deployments at known locations. A 'data' dataframe contains a 'datetime' column followed by columns of measurements associated with each "device-deployment". Algorithms to calculate NowCast and the associated Air Quality Index (AQI) are defined by the US Environmental Protection Agency AirNow program: <https://document.airnow.gov/technical-assistance-document-for-the-reporting-of-daily-air-quailty.pdf>.
Maintained by Jonathan Callahan. Last updated 6 months ago.
7 stars 6.57 score 178 scriptsinsightsengineering
tern.rbmi:Create Interface for 'RBMI' and 'tern'
'RBMI' implements standard and reference-based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). This package provides an interface for 'RBMI' that uses the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and tabulates results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023).
Maintained by Joe Zhu. Last updated 24 days ago.
3 stars 6.53 score 3 scriptsjakesherman
easypackages:Easy Loading and Installing of Packages
Easily load and install multiple packages from different sources, including CRAN and GitHub. The libraries function allows you to load or attach multiple packages in the same function call. The packages function will load one or more packages, and install any packages that are not installed on your system (after prompting you). Also included is a from_import function that allows you to import specific functions from a package into the global environment.
Maintained by Jake Sherman. Last updated 7 years ago.
11 stars 6.52 score 490 scriptsgisma
uavRmp:UAV Mission Planner
The Unmanned Aerial Vehicle Mission Planner provides an easy-to-use workflow for planning autonomous, obstacle-avoiding surveys with ready-to-fly unmanned aerial vehicles to retrieve aerial or spot-related data. It creates either intermediate flight control files for the DJI-Litchi supported series or ready-to-upload control files for the pixhawk-based flight controller as used in the 3DR-Solo or Yuneec series. Additionally, it contains some useful tools for digitizing and data manipulation.
Maintained by Chris Reudenbach. Last updated 10 months ago.
cultural-heritagedjidroneflight-planningforest-mappinglitchilow-budget-uavmission-planningphotogrammetrypixhawkpixhawk-controllerqgroundcontrol2litchisolosurveyterrain-followingterrain-mappinguavsyuneec
25 stars 6.48 score 6 scriptsbachmannpatrick
CLVTools:Tools for Customer Lifetime Value Estimation
A set of state-of-the-art probabilistic modeling approaches to derive estimates of individual customer lifetime values (CLV). Commonly, probabilistic approaches focus on modelling 3 processes, i.e. individuals' attrition, transaction, and spending process. Latent customer attrition models, which are also known as "buy-'til-you-die models", model the attrition as well as the transaction process. They are used to make inferences and predictions about transactional patterns of individual customers such as their future purchase behavior. Moreover, these models have also been used to predict individuals’ long-term engagement in activities such as playing an online game or posting to a social media platform. The spending process is usually modelled by a separate probabilistic model. Combining these results yields lifetime value estimates for individual customers. This package includes fast and accurate implementations of various probabilistic models for non-contractual settings (e.g., grocery purchases or hotel visits). All implementations support time-invariant covariates, which can be used to control for e.g., socio-demographics. If such an extension has been proposed in literature, we further provide the possibility to control for time-varying covariates to control for e.g., seasonal patterns. Currently, the package includes the following latent attrition models to model individuals' attrition and transaction process: [1] Pareto/NBD model (Pareto/Negative-Binomial-Distribution), [2] the Extended Pareto/NBD model (Pareto/Negative-Binomial-Distribution with time-varying covariates), [3] the BG/NBD model (Beta-Geometric/Negative-Binomial-Distribution) and [4] the GGom/NBD model (Gamma-Gompertz/Negative-Binomial-Distribution). Further, we provide an implementation of the Gamma/Gamma model to model the spending process of individuals.
Maintained by Patrick Bachmann. Last updated 4 months ago.
clvcustomer-lifetime-valuecustomer-relationship-managementopenblasgslcppopenmp
55 stars 6.47 score 12 scriptsbstatcomp
bayes4psy:User Friendly Bayesian Data Analysis for Psychology
Contains several Bayesian models for data analysis of psychological tests. A user-friendly interface for these models should enable students and researchers to perform professional-level Bayesian data analysis without advanced knowledge in programming and Bayesian statistics. This package is based on the Stan platform (Carpenter et al. 2017 <doi:10.18637/jss.v076.i01>).
Maintained by Jure Demšar. Last updated 2 years ago.
14 stars 6.44 score 33 scriptsbioc
doubletrouble:Identification and classification of duplicated genes
doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.
Maintained by Fabrício Almeida-Silva. Last updated 18 days ago.
softwarewholegenomecomparativegenomicsfunctionalgenomicsphylogeneticsnetworkclassificationbioinformaticscomparative-genomicsgene-duplicationmolecular-evolutionwhole-genome-duplication
23 stars 6.44 score 17 scriptsalexpghayes
modeltests:Testing Infrastructure for Broom Model Generics
Provides a number of testthat tests that can be used to verify that tidy(), glance() and augment() methods meet consistent specifications. This allows methods for the same generic to be spread across multiple packages, since all of those packages can make the same guarantees to users about returned objects.
Maintained by Alex Hayes. Last updated 11 months ago.
6 stars 6.42 score 396 scriptsjakubsob
cucumber:Behavior-Driven Development for R
Write executable specifications in a natural language that describes how your code should behave. Write specifications in feature files using 'Gherkin' language and execute them using functions implemented in R. Use them as an extension to your 'testthat' tests to provide a high level description of how your code works.
Maintained by Jakub Sobolewski. Last updated 11 days ago.
acceptance-testingbehavior-driven-developmentcucumbertesting
13 stars 6.37 score 10 scriptsbioc
MSstatsShiny:MSstats GUI for Statistical Analysis of Proteomics Experiments
MSstatsShiny is an R-Shiny graphical user interface (GUI) integrated with the R packages MSstats, MSstatsTMT, and MSstatsPTM. It provides a point-and-click end-to-end analysis pipeline applicable to a wide variety of experimental designs. These include data-dependent acquisitions (DDA) which are label-free or tandem mass tag (TMT)-based, as well as DIA, SRM, and PRM acquisitions and those targeting post-translational modifications (PTMs). The application automatically saves users' selections and builds an R script that recreates their analysis, supporting reproducible data analysis.
Maintained by Devon Kohler. Last updated 5 months ago.
immunooncologymassspectrometryproteomicssoftwareshinyappsdifferentialexpressiononechanneltwochannelnormalizationqualitycontrolgui
15 stars 6.31 score 4 scriptspik-piam
mrremind:MadRat REMIND Input Data Package
The mrremind package contains data preprocessing for the REMIND model.
Maintained by Lavinia Baumstark. Last updated 3 days ago.
4 stars 6.25 score 15 scripts 1 dependentsropensci
autotest:Automatic Package Testing
Automatic testing of R packages via a simple YAML schema.
Maintained by Mark Padgham. Last updated 5 months ago.
automated-testingfuzzingtesting
54 stars 6.21 score 25 scriptsms-quality-hub
rmzqc:Creation, Reading and Validation of 'mzqc' Files
Reads, writes and validates 'mzQC' files. The 'mzQC' format is a standardized file format for the exchange, transmission, and archiving of quality metrics derived from biological mass spectrometry data, as defined by the HUPO-PSI (Human Proteome Organisation - Proteomics Standards Initiative) Quality Control working group. See <https://hupo-psi.github.io/mzQC/> for details.
Maintained by Chris Bielow. Last updated 3 days ago.
hacktoberfestmass-spectrometrymzqcquality-control
3 stars 6.21 score 10 scripts 3 dependentsmansmeg
markmyassignment:Automatic Marking of R Assignments
Automatic marking of R assignments for students and teachers based on 'testthat' test suites.
Maintained by Mans Magnusson. Last updated 1 year ago.
5 stars 6.19 score 155 scriptsnhs-r-community
NHSRwaitinglist:R-Package to Implement a Waiting List Management Using Queuing Theory
R-package to implement the waiting list management approach described in Fong et al. 2022 <doi:10.1101/2022.08.23.22279117>.
Maintained by Chris Mainey. Last updated 4 hours ago.
nhsnhs-r-communityqueuing-theorywaiting-list
18 stars 6.17 score 17 scriptslucas-castillo
samplr:Compare Human Performance to Sampling Algorithms
Understand human performance from the perspective of sampling, both looking at how people generate samples and how people use the samples they have generated. A longer overview and other resources can be found at <https://sampling.warwick.ac.uk>.
Maintained by Lucas Castillo. Last updated 6 hours ago.
2 stars 6.15 score 25 scriptssnystrom
cmdfun:Framework for Building Interfaces to Shell Commands
Writing interfaces to command line software is cumbersome. 'cmdfun' provides a framework for building function calls to seamlessly interface with shell commands by allowing lazy evaluation of command line arguments. 'cmdfun' also provides methods for handling user-specific paths to tool installs or secrets like API keys. Its focus is to equally serve package builders who wish to wrap command line software, and to help analysts stay inside R when they might usually leave to execute non-R software.
Maintained by Spencer Nystrom. Last updated 4 years ago.
15 stars 6.13 score 7 scripts 6 dependents
ramikrispin
covid19italy:The 2019 Novel Coronavirus COVID-19 (2019-nCoV) Italy Dataset
Provides a daily summary of the Coronavirus (COVID-19) cases in Italy by country, region and province level. Data source: Presidenza del Consiglio dei Ministri - Dipartimento della Protezione Civile <https://www.protezionecivile.it/>.
Maintained by Rami Krispin. Last updated 2 years ago.
47 stars 6.07 score 25 scripts
ludvigolsen
xpectr:Generates Expectations for 'testthat' Unit Testing
Helps systematize and ease the process of building unit tests with the 'testthat' package by providing tools for generating expectations.
Maintained by Ludvig Renbo Olsen. Last updated 26 days ago.
37 stars 6.06 score 62 scripts
kforner
srcpkgs:R Source Packages Manager
Manage a collection/library of R source packages. Discover, document, load, and test source packages. Use those packages as if they were actually installed. Quickly reload only what is needed on source code change. Run tests and checks in parallel.
Maintained by Karl Forner. Last updated 10 months ago.
11 stars 6.04 score 6 scripts
irods
rirods:R Client for 'iRODS'
The open sourced data management software 'Integrated Rule-Oriented Data System' ('iRODS') offers solutions for the whole data life cycle (<https://irods.org/>). The loosely constructed and highly configurable architecture of 'iRODS' frees the user from strict formatting constraints and single-vendor solutions. This package provides an interface to the 'iRODS' HTTP API, allowing you to manage your data and metadata in 'iRODS' with R. Storage of annotated files and R objects in 'iRODS' ensures findability, accessibility, interoperability, and reusability of data.
Maintained by Martin Schobben. Last updated 1 year ago.
7 stars 6.04 score 31 scripts
bioc
INDEED:Interactive Visualization of Integrated Differential Expression and Differential Network Analysis for Biomarker Candidate Selection Package
An R package for integrated differential expression and differential network analysis based on omic data for cancer biomarker discovery. Both correlation and partial correlation can be used to generate a differential network to aid traditional differential expression analysis in identifying changes between biomolecules at both their expression and pairwise association levels. A detailed description of the methodology has been published in the journal Methods (PMID: 27592383). An interactive visualization feature allows for the exploration and selection of candidate biomarkers.
Maintained by Ressom group. Last updated 5 months ago.
immunooncology software researchfield biologicalquestion statisticalmethod differentialexpression massspectrometry metabolomics
5 stars 6.02 score 10 scripts
marce10
Rraven:Connecting R and 'Raven' Sound Analysis Software
A tool to exchange data between R and 'Raven' sound analysis software (Cornell Lab of Ornithology). Functions work on data formats compatible with the R package 'warbleR'.
Maintained by Marcelo Araya-Salas. Last updated 3 months ago.
10 stars 6.00 score 50 scripts
ixpantia
ixplorer:Easy DataOps for R Users
Create and view tickets in 'gitea', a self-hosted git service <https://about.gitea.com>, using an 'RStudio' addin, and use helper functions to publish documentation and use git.
Maintained by Frans van Dunne. Last updated 5 months ago.
2 stars 5.94 score 5 scripts
bioc
PathoStat:PathoStat Statistical Microbiome Analysis Package
The purpose of this package is to perform Statistical Microbiome Analysis on metagenomics results from sequencing data samples. In particular, it supports analyses on the PathoScope generated report files. PathoStat provides various functionalities including Relative Abundance charts, Diversity estimates and plots, tests of Differential Abundance, Time Series visualization, and Core OTU analysis.
Maintained by Solaiappan Manimaran. Last updated 5 months ago.
microbiome metagenomics graphandnetwork microarray patternlogic principalcomponent sequencing software visualization rnaseq immunooncology
8 stars 5.90 score 8 scripts
jeffreyrstevens
flashr:Create Flashcards of Terms and Definitions
Provides functions for creating flashcard decks of terms and definitions. This package creates HTML slides using 'revealjs' that can be viewed in the 'RStudio' viewer or a web browser. Users can create flashcards from either existing built-in decks or create their own from CSV files or vectors of function names.
Maintained by Jeffrey R. Stevens. Last updated 1 year ago.
9 stars 5.89 score 171 scripts
thinhong
denim:Generate and Simulate Deterministic Discrete-Time Compartmental Models
R package to build and simulate deterministic discrete-time compartmental models that can be non-Markov. Length of stay in each compartment can be defined to follow a parametric distribution (d_exponential(), d_gamma(), d_weibull(), d_lognormal()) or a non-parametric distribution (nonparametric()). Other supported types of transition from one compartment to another include fixed transition (constant()), multinomial (multinomial()), and fixed transition probability (transprob()).
Maintained by Anh Phan. Last updated 14 days ago.
2 stars 5.86 score 8 scripts
jimbrig
lossrx:Actuarial Loss Development and Reserving with R
Actuarial Loss Development and Reserving Helper Functions and ShinyApp.
Maintained by Jimmy Briggs. Last updated 3 months ago.
actuarial-science claims-data claims-reserving data-science insurance modelling property-casualty reserving rshiny workflow
14 stars 5.82 score 7 scripts
guokai8
microbial:Do 16s Data Analysis and Generate Figures
Provides functions to enhance the available statistical analysis procedures in R by providing simple functions to analyze and visualize 16S rRNA data. Here we present a tutorial with minimum working examples to demonstrate usage and dependencies.
Maintained by Kai Guo. Last updated 6 months ago.
software graphandnetwork microbiome microbiome-analysis
13 stars 5.81 score 25 scripts
modeloriented
modelDown:Make Static HTML Website for Predictive Models
Website generator with HTML summaries for predictive models. This package uses 'DALEX' explainers to describe global model behavior. We can see how well models behave (tabs: Model Performance, Auditor), how much each variable contributes to predictions (tabs: Variable Response) and which variables are the most important for a given model (tabs: Variable Importance). We can also compare Concept Drift for pairs of models (tabs: Drifter). Additionally, data available on the website can be easily recreated in the current R session. Work on this package was financially supported by the NCN Opus grant 2017/27/B/ST6/01307 at Warsaw University of Technology, Faculty of Mathematics and Information Science.
Maintained by Kamil Romaszko. Last updated 4 years ago.
121 stars 5.80 score 15 scripts
appsilon
shiny.benchmark:Benchmark the Performance of 'shiny' Applications
Compare performance between different versions of a 'shiny' application based on 'git' references.
Maintained by Douglas Azevedo. Last updated 12 months ago.
performance-testing rhinoverse shiny
31 stars 5.79 score 6 scripts
socialresearchcentre
testdat:Data Unit Testing for R
Test your data! An extension of the 'testthat' unit testing framework with a family of functions and reporting tools for checking and validating data frames.
Maintained by Danny Smith. Last updated 10 months ago.
8 stars 5.78 score 50 scripts
bioc
MetID:Network-based prioritization of putative metabolite IDs
This package uses an innovative network-based approach that will enhance our ability to determine the identities of significant ions detected by LC-MS.
Maintained by Zhenzhi Li. Last updated 5 months ago.
assaydomain biologicalquestion infrastructure researchfield statisticalmethod technology workflowstep network kegg
1 star 5.74 score 110 scripts
jwiley
multilevelTools:Multilevel and Mixed Effects Model Diagnostics and Effect Sizes
Effect sizes, diagnostics and performance metrics for multilevel and mixed effects models. Includes marginal and conditional 'R2' estimates for linear mixed effects models based on Johnson (2014) <doi:10.1111/2041-210X.12225>.
Maintained by Joshua F. Wiley. Last updated 5 days ago.
4 stars 5.74 score 136 scripts
bioc
atSNP:Affinity test for identifying regulatory SNPs
atSNP performs affinity tests of motif matches with the SNP or the reference genomes and SNP-led changes in motif matches.
Maintained by Sunyoung Shin. Last updated 5 months ago.
software chipseq genomeannotation motifannotation visualization cpp
1 star 5.73 score 36 scripts
limengbinggz
ddtlcm:Latent Class Analysis with Dirichlet Diffusion Tree Process Prior
Implements a Bayesian algorithm for overcoming weak separation in Bayesian latent class analysis. Reference: Li et al. (2023) <arXiv:2306.04700>.
Maintained by Mengbing Li. Last updated 8 months ago.
6 stars 5.73 score 8 scripts
wa-department-of-agriculture
soils:Visualize and Report Soil Health Data
Collection of soil health data visualization and reporting tools, including an RStudio project template with everything you need to generate custom HTML and Microsoft Word reports for each participant in your soil health sampling project.
Maintained by Jadey N Ryan. Last updated 6 days ago.
10 stars 5.73 score 9 scripts
holgstr
fmeffects:Model-Agnostic Interpretations with Forward Marginal Effects
Create local, regional, and global explanations for any machine learning model with forward marginal effects. You provide a model and data, and 'fmeffects' computes feature effects. The package is based on the theory in: C. A. Scholbeck, G. Casalicchio, C. Molnar, B. Bischl, and C. Heumann (2022) <doi:10.48550/arXiv.2201.08837>.
Maintained by Holger Löwe. Last updated 5 months ago.
2 stars 5.73 score 6 scripts
tiago-simoes
EvoPhylo:Pre- And Postprocessing of Morphological Data from Relaxed Clock Bayesian Phylogenetics
Performs automated morphological character partitioning for phylogenetic analyses and analyzes macroevolutionary parameter outputs from clock (time-calibrated) Bayesian inference analyses, following concepts introduced by Simões and Pierce (2021) <doi:10.1038/s41559-021-01532-x>.
Maintained by Tiago Simoes. Last updated 2 years ago.
4 stars 5.66 score 19 scripts
mazamascience
MazamaLocationUtils:Manage Spatial Metadata for Known Locations
Utility functions for discovering and managing metadata associated with spatially unique "known locations". Applications include all fields of environmental monitoring (e.g. air and water quality) where data are collected at stationary sites.
Maintained by Jonathan Callahan. Last updated 4 months ago.
5.64 score 108 scripts
selcukorkmaz
PubChemR:Interface to the 'PubChem' Database for Chemical Data Retrieval
Provides an interface to the 'PubChem' database via the PUG REST <https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest> and PUG View <https://pubchem.ncbi.nlm.nih.gov/docs/pug-view> services. This package allows users to automatically access chemical and biological data from 'PubChem', including compounds, substances, assays, and various other data types. Functions are available to retrieve data in different formats, perform searches, and access detailed annotations.
Maintained by Selcuk Korkmaz. Last updated 6 months ago.
2 stars 5.62 score 23 scripts
bflammers
ANN2:Artificial Neural Networks for Anomaly Detection
Training of neural networks for classification and regression tasks using mini-batch gradient descent. Special features include a function for training autoencoders, which can be used to detect anomalies, and some related plotting functions. Multiple activation functions are supported, including tanh, relu, step and ramp. For the use of the step and ramp activation functions in detecting anomalies using autoencoders, see Hawkins et al. (2002) <doi:10.1007/3-540-46145-0_17>. Furthermore, several loss functions are supported, including robust ones such as Huber and pseudo-Huber loss, as well as L1 and L2 regularization. The possible options for optimization algorithms are RMSprop, Adam and SGD with momentum. The package contains a vectorized C++ implementation that facilitates fast training through mini-batch learning.
Maintained by Bart Lammers. Last updated 4 years ago.
anomaly-detection artificial-neural-networks autoencoders neural-networks robust-statistics openblas cpp openmp
13 stars 5.59 score 60 scripts
seankross
swirl:Learn R, in R
Use the R console as an interactive learning environment. Users receive immediate feedback as they are guided through self-paced lessons in data science and R programming.
Maintained by Sean Kross. Last updated 5 years ago.
5.57 score 1.8k scripts 1 dependent
bioc
chevreulPlot:Plots used in the chevreulPlot package
Tools for plotting SingleCellExperiment objects in the chevreulPlot package. Includes functions for analysis and visualization of single-cell data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Maintained by Kevin Stachelek. Last updated 1 month ago.
coverage rnaseq sequencing visualization geneexpression transcription singlecell transcriptomics normalization preprocessing qualitycontrol dimensionreduction dataimport
5.56 score 2 scripts 1 dependent
hughjonesd
doctest:Generate Tests from Examples Using 'roxygen' and 'testthat'
Creates 'testthat' tests from 'roxygen' examples using simple tags.
Maintained by David Hugh-Jones. Last updated 1 year ago.
33 stars 5.52 score 4 scripts
bioc
methylclock:Methylclock - DNA methylation-based clocks
This package estimates chronological and gestational DNA methylation (DNAm) age as well as biological age using different methylation clocks. Chronological DNAm age (in years): Horvath's clock, Hannum's clock, BNN, Horvath's skin+blood clock, PedBE clock and Wu's clock. Gestational DNAm age: Knight's clock, Bohlin's clock, Mayne's clock and Lee's clocks. Biological DNAm clocks: Levine's clock and Telomere Length's clock.
Maintained by Dolors Pelegri-Siso. Last updated 5 months ago.
dnamethylation biologicalquestion preprocessing statisticalmethod normalization cpp
39 stars 5.52 score 28 scripts
bioc
GeoDiff:Count model based differential expression and normalization on GeoMx RNA data
A series of statistical models using count generating distributions for background modelling, feature and sample QC, normalization and differential expression analysis on GeoMx RNA data. The application of these methods is demonstrated in an example data analysis vignette.
Maintained by Nicole Ortogero. Last updated 5 months ago.
geneexpression differentialexpression normalization openblas cpp openmp
8 stars 5.51 score 9 scripts
ropensci
mcbette:Model Comparison Using 'babette'
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'mcbette' performs a Bayesian model comparison over some site and clock models, using 'babette' (<https://github.com/ropensci/babette/>).
Maintained by Richèl J.C. Bilderbeek. Last updated 8 months ago.
7 stars 5.50 score 18 scripts
ineelhere
shiny.ollama:R 'shiny' Interface for Chatting with Large Language Models Offline on Local with 'ollama'
Chat with large language models like 'deepseek-r1', 'nemotron', 'llama', 'qwen' and many more on your machine, without internet access and with complete privacy, via 'ollama', powered by an R 'shiny' interface. For more information on 'ollama', visit <https://ollama.com>.
Maintained by Indraneel Chakraborty. Last updated 19 days ago.
deepseek-r1 llama3 llm local-llm offline-first offline-llm ollama ollama-app ollama-gui shiny shinyapp
9 stars 5.50 score 2 scripts
bioc
MotifPeeker:Benchmarking Epigenomic Profiling Methods Using Motif Enrichment
MotifPeeker is used to compare and analyse datasets from epigenomic profiling methods with motif enrichment as the key benchmark. The package outputs an HTML report consisting of three sections: (1. General Metrics) Overview of peaks-related general metrics for the datasets (FRiP scores, peak widths and motif-summit distances). (2. Known Motif Enrichment Analysis) Statistics for the frequency of user-provided motifs enriched in the datasets. (3. De-Novo Motif Enrichment Analysis) Statistics for the frequency of de-novo discovered motifs enriched in the datasets and compared with known motifs.
Maintained by Hiranyamaya Dash. Last updated 3 months ago.
epigenetics genetics qualitycontrol chipseq multiplecomparison functionalgenomics motifdiscovery sequencematching software alignment bioconductor bioconductor-package chip-seq epigenomics interactive-report motif-enrichment-analysis
2 stars 5.48 score 6 scripts
loelschlaeger
RprobitB:Bayesian Probit Choice Modeling
Bayes estimation of probit choice models, both in the cross-sectional and panel setting. The package can analyze binary, multivariate, ordered, and ranked choices, as well as heterogeneity of choice behavior among deciders. The main functionality includes model fitting via Markov chain Monte Carlo methods, tools for convergence diagnostics, choice data simulation, in-sample and out-of-sample choice prediction, and model selection using information criteria and Bayes factors. The latent class model extension facilitates preference-based decider classification, where the number of latent classes can be inferred via the Dirichlet process or a weight-based updating heuristic. This allows for flexible modeling of choice behavior without the need to impose structural constraints. For a reference on the method see Oelschlaeger and Bauer (2021) <https://trid.trb.org/view/1759753>.
Maintained by Lennart Oelschläger. Last updated 6 months ago.
bayes discrete-choice probit openblas cpp openmp
4 stars 5.45 score 1 script
jonthegeek
beekeeper:Rapidly Scaffold API Client Packages
Automatically generate R package skeletons from 'application programming interfaces (APIs)' that follow the 'OpenAPI Specification (OAS)'. The skeletons implement best practices to streamline package development.
Maintained by Jon Harmon. Last updated 6 months ago.
53 stars 5.42 score 2 scripts
luckinet
arealDB:Harmonise and Integrate Heterogeneous Areal Data
Many relevant applications in the environmental and socioeconomic sciences use areal data, such as biodiversity checklists, agricultural statistics, or socioeconomic surveys. For applications that surpass the spatial, temporal or thematic scope of any single data source, data must be integrated from several heterogeneous sources. Inconsistent concepts, definitions, or messy data tables make this a tedious and error-prone process. 'arealDB' tackles those problems and helps the user to integrate a harmonised database of areal data. Read the paper at Ehrmann, Seppelt & Meyer (2020) <doi:10.1016/j.envsoft.2020.104799>.
Maintained by Steffen Ehrmann. Last updated 2 months ago.
2 stars 5.41 score 15 scripts
crunch-io
crplyr:A 'dplyr' Interface for Crunch
In order to facilitate analysis of datasets hosted on the Crunch data platform <https://crunch.io/>, the 'crplyr' package implements 'dplyr' methods on top of the Crunch backend. The usual methods 'select', 'filter', 'group_by', 'summarize', and 'collect' are implemented in such a way as to perform as much computation on the server and pull as little data locally as possible.
Maintained by Greg Freedman Ellis. Last updated 2 years ago.
6 stars 5.41 score 17 scripts
simon-smart88
shinyscholar:A Template for Creating Reproducible 'shiny' Applications
Create a skeleton 'shiny' application with create_template() that is reproducible, can be saved and meets academic standards for attribution. Forked from 'wallace'. Code is split into modules that are loaded and linked together automatically and each call one function. Guidance pages explain modules to users and flexible logging informs them of any errors. Options enable asynchronous operations, viewing of source code, interactive maps and data tables. Use to create complex analytical applications, following best practices in open science and software development. Includes functions for automating repetitive development tasks and an example application at run_shinyscholar() that requires install.packages("shinyscholar", dependencies = TRUE). A guide to developing applications can be found on the package website.
Maintained by Simon E. H. Smart. Last updated 7 days ago.
22 stars 5.40 score 5 scripts
loelschlaeger
oeli:Utilities for Developing Data Science Software
Some general helper functions that I (and maybe others) find useful when developing data science software.
Maintained by Lennart Oelschläger. Last updated 4 months ago.
2 stars 5.38 score 1 script 4 dependents
beerda
nuggets:Extensible Data Pattern Searching Framework
Extensible framework for subgroup discovery (Atzmueller (2015) <doi:10.1002/widm.1144>), contrast patterns (Chen (2022) <doi:10.48550/arXiv.2209.13556>), emerging patterns (Dong (1999) <doi:10.1145/312129.312191>), association rules (Agrawal (1994) <https://www.vldb.org/conf/1994/P487.PDF>) and conditional correlations (Hájek (1978) <doi:10.1007/978-3-642-66943-9>). Both crisp (Boolean, binary) and fuzzy data are supported. It generates conditions in the form of elementary conjunctions, evaluates them on a dataset and checks the induced sub-data for interesting statistical properties. A user-defined function may be defined to evaluate on each generated condition to search for custom patterns.
Maintained by Michal Burda. Last updated 18 days ago.
association-rule-mining contrast-pattern-mining data-mining fuzzy knowledge-discovery pattern-recognition cpp openmp
2 stars 5.38 score 10 scripts
angabrio
missingHE:Missing Outcome Data in Health Economic Evaluation
Contains a suite of functions for health economic evaluations with missing outcome data. The package can fit different types of statistical models under a fully Bayesian approach using the software 'JAGS' (which should be installed locally and which is loaded in 'missingHE' via the 'R' package 'R2jags'). Three classes of models can be fitted under a variety of missing data assumptions: selection models, pattern mixture models and hurdle models. In addition to model fitting, 'missingHE' provides a set of specialised functions to assess model convergence and fit, and to summarise the statistical and economic results using different types of measures and graphs. The methods implemented are described in Mason (2018) <doi:10.1002/hec.3793>, Molenberghs (2000) <doi:10.1007/978-1-4419-0300-6_18> and Gabrio (2019) <doi:10.1002/sim.8045>.
Maintained by Andrea Gabrio. Last updated 2 years ago.
cost-effectiveness-analysis health-economic-evaluation individual-level-data jags missing-data parametric-modelling sensitivity-analysis cpp
5 stars 5.38 score 24 scripts
bioc
chevreulProcess:Tools for managing SingleCellExperiment objects as projects
Tools for analyzing SingleCellExperiment objects as projects, for input into the Chevreul app downstream. Includes functions for analysis of single cell RNA sequencing data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Maintained by Kevin Stachelek. Last updated 2 months ago.
coverage rnaseq sequencing visualization geneexpression transcription singlecell transcriptomics normalization preprocessing qualitycontrol dimensionreduction dataimport
5.38 score 2 scripts 2 dependents
marce10
dynaSpec:Dynamic Spectrogram Visualizations
A set of tools to generate dynamic spectrogram visualizations in video format.
Maintained by Marcelo Araya-Salas. Last updated 1 month ago.
animal-sounds bioacoustics spectrogram
23 stars 5.37 score 34 scripts
tbrown122387
gradeR:Helps Grade Assignment Submissions that are R Scripts
After being given the location of your students' submissions and a test file, the function runs each .R file, and evaluates the results from all the given tests. Results are neatly returned in a data frame that has a row for each student, and a column for each test.
Maintained by Taylor Brown. Last updated 3 years ago.
4 stars 5.36 score 57 scripts
matteo21q
dani:Design and Analysis of Non-Inferiority Trials
Provides tools to help with the design and analysis of non-inferiority trials. These include functions for doing sample size calculations and for analysing non-inferiority trials, using a variety of outcome types and population-level summary measures. It also features functions to make trials more resilient by using the concept of non-inferiority frontiers, as described in Quartagno et al. (2019) <arXiv:1905.00241>. Finally, it includes functions to design and analyse MAMS-ROCI (aka DURATIONS) trials.
Maintained by Matteo Quartagno. Last updated 7 months ago.
2 stars 5.33 score 27 scripts
bioc
MsQuality:MsQuality - Quality metric calculation from Spectra and MsExperiment objects
MsQuality provides functionality to calculate quality metrics for mass spectrometry-derived spectral data at the per-sample level. MsQuality relies on the mzQC framework of quality metrics defined by the Human Proteome Organization-Proteomics Standards Initiative (HUPO-PSI). These metrics quantify the quality of spectral raw files using a controlled vocabulary. The package is especially aimed at users who acquire mass spectrometry data on a large scale (e.g. data sets from clinical settings consisting of several thousands of samples). MsQuality calculates low-level quality metrics that require minimum information on mass spectrometry data: retention time, m/z values, and associated intensities. MsQuality relies on the Spectra package, or alternatively the MsExperiment package, and its infrastructure to store spectral data.
Maintained by Thomas Naake. Last updated 2 months ago.
metabolomics proteomics massspectrometry qualitycontrol mass-spectrometry qc
7 stars 5.32 score 2 scripts
henningte
ir:Functions to Handle and Preprocess Infrared Spectra
Functions to import and handle infrared spectra (import from '.csv' and Thermo Galactic's '.spc', baseline correction, binning, clipping, interpolating, smoothing, averaging, adding, subtracting, dividing, multiplying, plotting).
Maintained by Henning Teickner. Last updated 3 years ago.
chemometrics infrared infrared-spectra ir-package mid-infrared-spectra spectroscopy
6 stars 5.32 score 35 scripts
frankiecho
ahpsurvey:Analytic Hierarchy Process for Survey Data
The Analytic Hierarchy Process is a versatile multi-criteria decision-making tool introduced by Saaty (1987) <doi:10.1016/0270-0255(87)90473-8> that allows decision-makers to weigh attributes and evaluate alternatives presented to them. This package provides a consistent methodology for researchers to reformat data and run analytic hierarchy process in R on data that are formatted using the survey data entry mode. It is optimized for performing the analytic hierarchy process with many decision-makers, and provides tools and options for researchers to aggregate individual preferences and test multiple options. It also allows researchers to quantify, visualize and correct for inconsistency in the decision-maker's comparisons.
Maintained by Frankie Cho. Last updated 4 years ago.
analytic-hierarchy-process operations-research questionnaire survey-data
14 stars 5.28 score 27 scripts
mazamascience
MazamaTimeSeries:Core Functionality for Environmental Time Series
Utility functions for working with environmental time series data from known locations. The compact data model is structured as a list with two dataframes. A 'meta' dataframe contains spatial and measuring device metadata associated with deployments at known locations. A 'data' dataframe contains a 'datetime' column followed by columns of measurements associated with each "device-deployment". Ephemerides calculations are based on code originally found in NOAA's "Solar Calculator" <https://gml.noaa.gov/grad/solcalc/>.
Maintained by Jonathan Callahan. Last updated 1 year ago.
5.27 score 62 scripts 1 dependent
boennecd
psqn:Partially Separable Quasi-Newton
Provides quasi-Newton methods to minimize partially separable functions. The methods are largely described by Nocedal and Wright (2006) <doi:10.1007/978-0-387-40065-5>.
Maintained by Benjamin Christoffersen. Last updated 6 months ago.
optimization optimization-algorithms quasi-newton openblas cpp openmp
2 stars 5.26 score 5 scripts 3 dependents
bioc
CNVPanelizer:Reliable CNV detection in targeted sequencing applications
A method that allows for the use of a collection of non-matched normal tissue samples. Our approach uses a non-parametric bootstrap subsampling of the available reference samples to estimate the distribution of read counts from targeted sequencing. As inspired by random forest, this is combined with a procedure that subsamples the amplicons associated with each of the targeted genes. The obtained information allows us to reliably classify the copy number aberrations on the gene level.
Maintained by Thomas Wolf. Last updated 5 months ago.
classification sequencing normalization copynumbervariation coverage
5.23 score 12 scripts
bioc
runibic:runibic: row-based biclustering algorithm for analysis of gene expression data in R
This package implements the UniBic algorithm in R. This biclustering algorithm for analysis of gene expression data was introduced by Zhenjia Wang et al. in 2016. It is currently considered the most promising biclustering method for identification of meaningful structures in complex and noisy data.
Maintained by Patryk Orzechowski. Last updated 5 months ago.
microarray clustering geneexpression sequencing coverage cpp openmp
4 stars 5.20 score 7 scripts
boennecd
VAJointSurv:Variational Approximation for Joint Survival and Marker Models
Estimates joint marker (longitudinal) and survival (time-to-event) outcomes using variational approximations. The package supports multivariate markers allowing for correlated error terms and multiple types of survival outcomes which may be left-truncated, right-censored, and recurrent. Time-varying fixed and random covariate effects are supported along with non-proportional hazards.
Maintained by Benjamin Christoffersen. Last updated 3 months ago.
5 stars 5.20 score 21 scripts
pik-piam
modelstats:Run Analysis Tools
A collection of tools to analyze model runs.
Maintained by Anastasis Giannousakis. Last updated 14 days ago.
1 star 5.19 score 2 scripts
ramikrispin
covid19sf:The Covid19 San Francisco Dataset
Provides a variety of summary tables of the Covid19 cases in San Francisco. Data source: San Francisco, Department of Public Health - Population Health Division <https://datasf.org/opendata/>.
Maintained by Rami Krispin. Last updated 2 years ago.
12 stars 5.16 score 12 scripts
larsenlab
hlaR:Tools for HLA Data
A streamlined tool for eplet analysis of donor and recipient HLA (human leukocyte antigen) mismatch. Messy, low-resolution HLA typing data is cleaned, and imputed to high-resolution using the NMDP (National Marrow Donor Program) haplotype reference database <https://haplostats.org/haplostats>. High resolution data is analyzed for overall or single antigen eplet mismatch using a reference table (currently supporting 'HLAMatchMaker' <http://www.epitopes.net> versions 2 and 3). Data can enter or exit the workflow at different points depending on the user's aims and initial data quality.
Maintained by Joan Zhang. Last updated 2 years ago.
7 stars 5.15 score 9 scripts
miriamesteve
GSSTDA:Progression Analysis of Disease with Survival using Topological Data Analysis
Designed to carry out Mapper-based survival analysis with transcriptomics data. Mapper-based survival analysis is a modification of Progression Analysis of Disease (PAD) where survival data is taken into account in the filtering function. More details in: J. Fores-Martos, B. Suay-Garcia, R. Bosch-Romeu, M.C. Sanfeliu-Alonso, A. Falco, J. Climent, "Progression Analysis of Disease with Survival (PAD-S) by SurvMap identifies different prognostic subgroups of breast cancer in a large combined set of transcriptomics and methylation studies" <doi:10.1101/2022.09.08.507080>.
Maintained by Miriam Esteve. Last updated 8 months ago.
2 stars 5.15 score 7 scripts
choi-phd
lordif:Logistic Ordinal Regression Differential Item Functioning using IRT
Performs analysis of Differential Item Functioning (DIF) for dichotomous and polytomous items using an iterative hybrid of ordinal logistic regression and item response theory (IRT) according to Choi, Gibbons, and Crane (2011) <doi:10.18637/jss.v039.i08>.
Maintained by Seung W. Choi. Last updated 3 months ago.
1 star 5.12 score 35 scripts 1 dependent
s-fleck
testthis:Utils and 'RStudio' Addins to Make Testing Even More Fun
Utility functions and 'RStudio' addins for writing, running and organizing automated tests. Integrates tightly with the packages 'testthat', 'devtools' and 'usethis'. Hotkeys can be assigned to the 'RStudio' addins for running tests in a single file or to switch between a source file and the associated test file. In addition, testthis provides functions to manage and run tests in subdirectories of the test/testthat directory.
Maintained by Stefan Fleck. Last updated 3 years ago.
rstudio rstudio-addin rstudio-addins testing testthat
33 stars 5.12 score 20 scripts
julia-wrobel
mxfda:A Functional Data Analysis Package for Spatial Single Cell Data
Methods and tools for deriving spatial summary functions from single-cell imaging data and performing functional data analyses. Functions can be applied to other single-cell technologies such as spatial transcriptomics. Functional regression and functional principal component analysis methods are in the 'refund' package <https://cran.r-project.org/package=refund> while calculation of the spatial summary functions is from the 'spatstat' package <https://spatstat.org/>.
Maintained by Alex Soupir. Last updated 1 month ago.
1 star 5.08 score 8 scripts
bioc
chevreulShiny:Tools for managing SingleCellExperiment objects as projects
Tools for managing SingleCellExperiment objects as projects. Includes functions for analysis and visualization of single-cell data. Also included is a shiny app for visualization of pre-processed scRNA data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Maintained by Kevin Stachelek. Last updated 28 days ago.
coverage rnaseq sequencing visualization geneexpression transcription singlecell transcriptomics normalization preprocessing qualitycontrol dimensionreduction dataimport
5.08 score
yanrong-stacy-song
creditr:Credit Default Swaps
Price credit default swaps using 'C' code from the International Swaps and Derivatives Association CDS Standard Model. See <https://www.cdsmodel.com/cdsmodel/documentation.html> for more information about the model and <https://www.cdsmodel.com/cdsmodel/cds-disclaimer.html> for license details for the 'C' code.
Maintained by Yanrong Song. Last updated 8 days ago.
5.05 score 32 scripts
nalimilan
R.temis:Integrated Text Mining Solution
An integrated solution to perform a series of text mining tasks such as importing and cleaning a corpus, and analyses like term and document counts, lexical summary, term co-occurrences and document similarity measures, graphs of terms, correspondence analysis and hierarchical clustering. Corpora can be imported from spreadsheet-like files, directories of raw text files, as well as from 'Dow Jones Factiva', 'LexisNexis', 'Europresse' and 'Alceste' files.
Maintained by Milan Bouchet-Valat. Last updated 3 days ago.
28 stars 5.00 score 24 scripts
bioc
GARS:GARS: Genetic Algorithm for the identification of Robust Subsets of variables in high-dimensional and challenging datasets
Feature selection aims to identify and remove redundant, irrelevant and noisy variables from high-dimensional datasets. Selecting informative features affects the subsequent classification and regression analyses by improving their overall performance. Several methods have been proposed to perform feature selection: most of them rely on univariate statistics, correlation, entropy measurements or the usage of backward/forward regressions. Herein, we propose an efficient, robust and fast method that adopts stochastic optimization approaches for high-dimensional data. GARS is an innovative implementation of a genetic algorithm that selects robust features in high-dimensional and challenging datasets.
Maintained by Mattia Chiesa. Last updated 5 months ago.
classification featureextraction clustering openjdk
5.00 score 2 scripts
dieghernan
pkgdev:Helpers to Develop a Package using GitHub Actions
A small set of functions that take advantage of GitHub Actions for making your life easier as an R package developer. This package is primarily intended for personal use, however feel free to use it (at your own risk :)).
Maintained by Diego Hernangómez. Last updated 7 days ago.
developer-tools experimental github-actions
4 stars 5.00 score 7 scripts
bioc
broadSeq:broadSeq : for streamlined exploration of RNA-seq data
This package helps users easily analyze RNA-seq data with multiple methods (which usually need many different input formats). Here the user provides the expression data as a SummarizedExperiment object and gets results from different methods. This helps the user quickly evaluate different methods.
Maintained by Rishi Das Roy. Last updated 5 months ago.
geneexpression differentialexpression rnaseq transcriptomics sequencing coverage genesetenrichment go
4 stars 5.00 score 7 scripts
cran
exactci:Exact P-Values and Matching Confidence Intervals for Simple Discrete Parametric Cases
Calculates exact tests and confidence intervals for one-sample binomial and one- or two-sample Poisson cases (see Fay (2010) <doi:10.32614/rj-2010-008>).
Maintained by Michael P. Fay. Last updated 2 years ago.
5.00 score 10 dependents
alphaprime7
normfluodbf:Cleans and Normalizes FLUOstar DBF and DAT Files from 'Liposome' Flux Assays
Cleans and Normalizes FLUOstar DBF and DAT Files obtained from liposome flux assays. Users should verify extended usage of the package on files from other assay types.
Maintained by Tingwei Adeck. Last updated 5 months ago.
1 star 4.98 score 12 scripts
pboutros
bedr:Genomic Region Processing using Tools Such as 'BEDTools', 'BEDOPS' and 'Tabix'
Genomic region processing using open-source command line tools such as 'BEDTools', 'BEDOPS' and 'Tabix'. These tools offer scalable and efficient utilities to perform genome arithmetic, e.g. indexing, formatting and merging. The bedr API enhances access to these tools and offers additional utilities for genomic region processing.
Maintained by Paul C. Boutros. Last updated 6 years ago.
4.98 score 264 scripts 2 dependents
dzhakparov
GeneSelectR:Comprehensive Feature Selection Workflow for Bulk RNAseq Datasets
GeneSelectR is a versatile R package designed for efficient RNA sequencing data analysis. Its key innovation lies in the seamless integration of the Python sklearn machine learning framework with R-based bioinformatics tools. This integration enables GeneSelectR to perform robust ML-driven feature selection while simultaneously leveraging the power of Gene Ontology (GO) enrichment and semantic similarity analyses. By combining these diverse methodologies, GeneSelectR offers a comprehensive workflow that optimizes both the computational aspects of ML and the biological insights afforded by advanced bioinformatics analyses. Ideal for researchers in bioinformatics, GeneSelectR stands out as a unique tool for analyzing complex RNAseq datasets with enhanced precision and relevance.
Maintained by Damir Zhakparov. Last updated 10 months ago.
19 stars 4.98 score 7 scripts
ryan-riggs
RivRetrieve:Retrieve Global River Gauge Data
Provides access to global river gauge data from a variety of national-level river agencies. The package interfaces with the national-level agency websites to provide access to river gauge locations, river discharge, and river stage. Currently, the package is available for the following countries: Australia, Brazil, Canada, Chile, France, Japan, South Africa, the United Kingdom, and the United States.
Maintained by Ryan Riggs. Last updated 3 months ago.
9 stars 4.95 score 7 scripts
lcrawlab
smer:Sparse Marginal Epistasis Test
The Sparse Marginal Epistasis Test is a computationally efficient genetics method which detects statistical epistasis in complex traits; see Stamp et al. (2025, <doi:10.1101/2025.01.11.632557>) for details.
Maintained by Julian Stamp. Last updated 2 months ago.
genomewideassociation epistasis genetics snp linearmixedmodel cpp epistasis-analysis epistatis gwas gwas-tools mapit zlib cpp openmp
1 star 4.95 score 8 scripts
openwashdata
washr:Publication Toolkit for Water, Sanitation and Hygiene (WASH) Data
A toolkit to set up an R data package in a consistent structure. Automates tasks like tidy data export, data dictionary documentation, README and website creation, and citation management.
Maintained by Colin Walder. Last updated 5 months ago.
2 stars 4.95 score 7 scripts
qile0317
FastUtils:Fast, Readable Utility Functions
A wide variety of tools for general data analysis, wrangling, spelling, statistics, visualizations, package development, and more. All functions have vectorized implementations whenever possible. Exported names are designed to be readable, with longer names possessing short aliases.
Maintained by Qile Yang. Last updated 4 months ago.
scientific-computing utilities utility cpp
2 stars 4.95 score 2 scripts
armcn
quickcheck:Property Based Testing
Property based testing, inspired by the original 'QuickCheck'. This package builds on the property based testing framework provided by 'hedgehog' and is designed to seamlessly integrate with 'testthat'.
Maintained by Andrew McNeil. Last updated 1 year ago.
functional-programming property-based-testing
25 stars 4.94 score 70 scripts
tpisel
openmeteo:Retrieve Weather Data from the Open-Meteo API
A client for the Open-Meteo API that retrieves Open-Meteo weather data in a tidy format. No API key is required. The API specification is located at <https://open-meteo.com/en/docs>.
Maintained by Tom Pisel. Last updated 1 year ago.
20 stars 4.93 score 86 scripts
bioc
Oscope:Oscope - A statistical pipeline for identifying oscillatory genes in unsynchronized single cell RNA-seq
Oscope is a statistical pipeline developed to identify and recover the base cycle profiles of oscillating genes in an unsynchronized single cell RNA-seq experiment. The Oscope pipeline includes three modules: a sine model module to search for candidate oscillator pairs; a K-medoids clustering module to cluster candidate oscillators into groups; and an extended nearest insertion module to recover the base cycle order for each oscillator group.
Maintained by Ning Leng. Last updated 5 months ago.
immunooncology statisticalmethod rnaseq sequencing geneexpression
4.92 score 14 scripts 1 dependent
lukeduttweiler
skipTrack:A Bayesian Hierarchical Model that Controls for Non-Adherence in Mobile Menstrual Cycle Tracking
Implements a Bayesian hierarchical model designed to identify skips in mobile menstrual cycle self-tracking on mobile apps. Future developments will allow for the inclusion of covariates affecting cycle mean and regularity, as well as extra information regarding tracking non-adherence. Main methods to be outlined in a forthcoming paper, with alternative models from Li et al. (2022) <doi:10.1093/jamia/ocab182>.
Maintained by Luke Duttweiler. Last updated 2 months ago.
4.90 score 4 scripts
schlosslab
clustur:Clustering
A tool that implements the clustering algorithms from 'mothur' (Schloss PD et al. (2009) <doi:10.1128/AEM.01541-09>). 'clustur' makes use of the cluster() and make.shared() commands from 'mothur'. Our cluster() function has five different algorithms implemented: 'OptiClust', 'furthest', 'nearest', 'average', and 'weighted'. 'OptiClust' is an optimized clustering method for Operational Taxonomic Units; see Westcott SL, Schloss PD (2017) <doi:10.1128/mspheredirect.00073-17> for details. The make.shared() command is always applied at the end of the clustering command. This functionality allows us to generate and create clustering and abundance data efficiently.
Maintained by Patrick Schloss. Last updated 4 months ago.
1 star 4.85 score 7 scripts
mazamascience
MazamaSpatialPlots:Thematic Plots for Mazama Spatial Datasets
A suite of convenience functions for generating US state and county thematic maps using datasets from the MazamaSpatialUtils package.
Maintained by Jonathan Callahan. Last updated 2 months ago.
4.84 score 23 scripts
bioc
MLSeq:Machine Learning Interface for RNA-Seq Data
This package applies several machine learning methods, including SVM, bagSVM, Random Forest and CART to RNA-Seq data.
Maintained by Gokmen Zararsiz. Last updated 5 months ago.
immunooncology sequencing rnaseq classification clustering
4.81 score 27 scripts 1 dependent
mrcieu
gwasglue:GWAS summary data sources connected to analytical tools
Many tools exist that use GWAS summary data for colocalisation, fine mapping, Mendelian randomization, visualisation, etc. This package is a conduit that connects R packages that can retrieve GWAS summary data to various tools for analysing those data.
Maintained by Gibran Hemani. Last updated 3 years ago.
134 stars 4.79 score 91 scripts
mkearney
pkgverse:Build a Meta-Package Universe
Build your own universe of packages similar to the 'tidyverse' package <https://tidyverse.org/> with this meta-package creator. Create a package-verse, or meta package, by supplying a custom name for the collection of packages and the vector of desired package names to include, and optionally supply a destination directory, an indicator of whether to keep the created package directory, and/or a vector of verbs to implement via the 'usethis' <http://usethis.r-lib.org/> package.
Maintained by Michael Wayne Kearney. Last updated 6 years ago.
121 stars 4.78 score 6 scripts
ocbe-uio
BayesSurvive:Bayesian Survival Models for High-Dimensional Data
An implementation of Bayesian survival models with graph-structured selection priors for sparse identification of omics features predictive of survival (Madjar et al., 2021 <doi:10.1186/s12859-021-04483-z>) and its extension to use a fixed graph via a Markov Random Field (MRF) prior for capturing known structure of omics features, e.g. disease-specific pathways from the Kyoto Encyclopedia of Genes and Genomes database (Hermansen et al., 2025 <doi:10.48550/arXiv.2503.13078>).
Maintained by Zhi Zhao. Last updated 11 days ago.
bayesian-cox-models bayesian-variable-selection graph-learning high-dimensional-statistics omics-data-integration survival-analysis openblas cpp openmp
4.78 score 1 script
simulatr
simrel:Simulation of Multivariate Linear Model Data
Researchers have been using simulated data from a multivariate linear model to compare and evaluate different methods, ideas and models. Additionally, teachers and educators have been using a simulation tool to demonstrate and teach various statistical and machine learning concepts. This package helps users to simulate linear model data with a wide range of properties by tuning a few parameters such as relevant latent components. In addition, a shiny app as an 'RStudio' gadget gives users a simple interface for using the simulation function. See more on: Sæbø, S., Almøy, T., Helland, I.S. (2015) <doi:10.1016/j.chemolab.2015.05.012> and Rimal, R., Almøy, T., Sæbø, S. (2018) <doi:10.1016/j.chemolab.2018.02.009>.
Maintained by Raju Rimal. Last updated 2 years ago.
bivariate-simulation multivariate-simulation relevant-predictor-components simulated-data simulation univariate-simulation
3 stars 4.78 score 40 scripts
mkorvink
archetyper:An Archetype for Data Mining and Data Science Projects
A project template to support the data science workflow.
Maintained by Michael Korvink. Last updated 4 years ago.
6 stars 4.78 score 7 scripts
vgherard
sbo:Text Prediction via Stupid Back-Off N-Gram Models
Utilities for training and evaluating text predictors based on Stupid Back-Off N-gram models (Brants et al., 2007, <https://www.aclweb.org/anthology/D07-1090/>).
Maintained by Valerio Gherardi. Last updated 4 years ago.
natural-language-processing ngram-models predictive-text sbo cpp
10 stars 4.78 score 12 scripts
bioc
biodbChebi:biodbChebi, a library for connecting to the ChEBI Database
The biodbChebi library provides access to the ChEBI Database, using the biodb package framework. It allows retrieving entries by their accession number. Web services can be accessed for searching the database by name, mass or other fields.
Maintained by Pierrick Roger. Last updated 5 months ago.
software infrastructure dataimport
2 stars 4.78 score 3 scripts 1 dependent
bioc
mobileRNA:mobileRNA: Investigate the RNA mobilome & population-scale changes
Genomic analysis can be utilised to identify differences between RNA populations in two conditions, both in production and abundance. This includes the identification of RNAs produced by multiple genomes within a biological system. For example, RNA produced by pathogens within a host or mobile RNAs in plant graft systems. The mobileRNA package provides methods to pre-process, analyse and visualise the sRNA and mRNA populations based on the premise of mapping reads to all genotypes at the same time.
Maintained by Katie Jeynes-Cupper. Last updated 5 months ago.
visualization rnaseq sequencing smallrna genomeassembly clustering experimentaldesign qualitycontrol workflowstep alignment preprocessing bioinformatics plant-science
3 stars 4.78 score 2 scripts