R-universe search: needs:readxl

tidyverse

tidyverse:Easily Install and Load the 'Tidyverse'

The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' at <https://www.tidyverse.org>.

Maintained by Hadley Wickham. Last updated 5 months ago.

data-science tidyverse

1.7k stars 20.23 score 664k scripts 125 dependents

gesistsa

rio:A Swiss-Army Knife for Data I/O

Streamlined data import and export by making assumptions that the user is probably willing to make: 'import()' and 'export()' determine the data format from the file extension, reasonable defaults are used for data import and export, web-based import is natively supported (including from SSL/HTTPS), compressed files can be read directly, and fast import packages are used where appropriate. An additional convenience function, 'convert()', provides a simple method for converting between file types.

Maintained by Chung-hong Chan. Last updated 3 months ago.

csv csvy data data-science excel io rio sas spss stata

610 stars 17.10 score 7.8k scripts 74 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistical functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 1 day ago.

fortran cpp

86 stars 16.73 score 7.7k scripts 101 dependents

business-science

tidyquant:Tidy Quantitative Financial Analysis

Bringing business and financial analysis to the 'tidyverse'. The 'tidyquant' package provides a convenient wrapper to various 'xts', 'zoo', 'quantmod', 'TTR' and 'PerformanceAnalytics' package functions and returns the objects in the tidy 'tibble' format. The main advantage is being able to use quantitative functions with the 'tidyverse' functions including 'purrr', 'dplyr', 'tidyr', 'ggplot2', 'lubridate', etc. See the 'tidyquant' website for more information, documentation and examples.

Maintained by Matt Dancho. Last updated 1 month ago.

dplyr financial-analysis financial-data financial-statements multiple-stocks performance-analysis performanceanalytics quantmod stock stock-exchanges stock-indexes stock-lists stock-performance stock-prices stock-symbol tidyverse time-series timeseries xts

872 stars 13.34 score 5.2k scripts

dreamrs

esquisse:Explore and Visualize Your Data Interactively

A 'shiny' gadget to create 'ggplot2' figures interactively with drag-and-drop to map your variables to different aesthetics. You can quickly visualize your data according to their type, export in various formats, and retrieve the code to reproduce the plot.

Maintained by Victor Perrier. Last updated 1 month ago.

addin data-visualization ggplot2 rstudio-addin visualization

1.8k stars 13.31 score 1.1k scripts 1 dependent

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data are stored in csv format. The package also allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 15 days ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

109 stars 13.20 score 342 scripts 3 dependents

massimoaria

bibliometrix:Comprehensive Science Mapping Analysis

Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.

Maintained by Massimo Aria. Last updated 10 days ago.

bibliometric-analysis bibliometrics citation citation-network citations co-authors co-occurence co-word-analysis correspondence-analysis coupling isi-web journal manuscript quantitative-analysis scholars science science-mapping scientific scientometrics scopus

545 stars 12.54 score 518 scripts 2 dependents

dreamrs

datamods:Modules to Import and Manipulate Data in 'Shiny'

'Shiny' modules to import data into an application or 'addin' from various sources, and to manipulate them after that.

Maintained by Victor Perrier. Last updated 24 days ago.

shiny shiny-modules

144 stars 12.03 score 174 scripts 7 dependents

pecanproject

PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.

Maintained by David LeBauer. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 11.61 score 64 scripts 14 dependents

bioc

mia:Microbiome analysis

mia implements tools for microbiome analysis based on the SummarizedExperiment, SingleCellExperiment and TreeSummarizedExperiment infrastructure. Data wrangling and analysis in the context of taxonomic data is the main scope. Additional functions for common tasks are implemented, such as community indices calculation and summarization.

Maintained by Tuomas Borman. Last updated 2 days ago.

microbiome software dataimport analysis bioconductor cpp

51 stars 11.51 score 316 scripts 5 dependents

jamiemkass

ENMeval:Automated Tuning and Evaluations of Ecological Niche Models

Runs ecological niche models over all combinations of user-defined settings (i.e., tuning), performs cross validation to evaluate models, and returns data tables to aid in selection of optimal model settings that balance goodness-of-fit and model complexity. Also has functions to partition data spatially (or not) for cross validation, to plot multiple visualizations of results, to run null models to estimate significance and effect sizes of performance metrics, and to calculate range overlap between model predictions, among others. The package was originally built for Maxent models (Phillips et al. 2006, Phillips et al. 2017), but the current version allows possible extensions for any modeling algorithm. The extensive vignette, which guides users through most package functionality but unfortunately has a file size too big for CRAN, can be found on the package's GitHub Pages website: <https://jamiemkass.github.io/ENMeval/articles/ENMeval-2.0-vignette.html>.

Maintained by Jamie M. Kass. Last updated 17 hours ago.

49 stars 11.16 score 332 scripts 2 dependents

covid19datahub

COVID19:COVID-19 Data Hub

Unified datasets for a better understanding of COVID-19.

Maintained by Emanuele Guidotti. Last updated 1 month ago.

2019-ncov coronavirus covid-19 covid-data covid19-data

252 stars 11.08 score 265 scripts

ropengov

eurostat:Tools for Eurostat Open Data

Tools to download data from the Eurostat database <https://ec.europa.eu/eurostat> together with search and manipulation utilities.

Maintained by Leo Lahti. Last updated 1 month ago.

ropengov eurostat eurostat-data

242 stars 11.07 score 892 scripts 4 dependents

friendly

vcdExtra:'vcd' Extensions and Additions

Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.

Maintained by Michael Friendly. Last updated 5 days ago.

categorical-data-visualization generalized-linear-models mosaic-plots

24 stars 10.85 score 472 scripts 3 dependents

bioc

ANCOMBC:Microbiome differential abundance and correlation analyses with bias correction

ANCOMBC is a package containing differential abundance (DA) and correlation analyses for microbiome data. Specifically, the package includes Analysis of Compositions of Microbiomes with Bias Correction 2 (ANCOM-BC2), Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC), and Analysis of Composition of Microbiomes (ANCOM) for DA analysis, and Sparse Estimation of Correlations among Microbiomes (SECOM) for correlation analysis. Microbiome data are typically subject to two sources of biases: unequal sampling fractions (sample-specific biases) and differential sequencing efficiencies (taxon-specific biases). Methodologies included in the ANCOMBC package are designed to correct these biases and construct statistically consistent estimators.

Maintained by Huang Lin. Last updated 13 days ago.

differentialexpression microbiome normalization sequencing software ancom ancombc ancombc2 correlation differential-abundance-analysis secom

120 stars 10.79 score 406 scripts 1 dependent

quanteda

readtext:Import and Handling for Plain and Formatted Text Files

Functions for importing and handling text files and formatted text files with additional meta-data, including '.csv', '.tab', '.json', '.xml', '.html', '.pdf', '.doc', '.docx', '.rtf', '.xls', '.xlsx', and others.

Maintained by Kenneth Benoit. Last updated 4 months ago.

encoding quanteda text

122 stars 10.66 score 1.2k scripts 5 dependents

bcgov

bcdata:Search and Retrieve Data from the BC Data Catalogue

Search, query, and download tabular and 'geospatial' data from the British Columbia Data Catalogue (<https://catalogue.data.gov.bc.ca/>). Search catalogue data records based on keywords, data licence, sector, data format, and B.C. government organization. View metadata directly in R, download many data formats, and query 'geospatial' data available via the B.C. government Web Feature Service ('WFS') using 'dplyr' syntax.

Maintained by Andy Teucher. Last updated 3 days ago.

bcdc citz data-science env

83 stars 10.36 score 186 scripts 4 dependents

idigbio

ridigbio:Interface to the iDigBio Data API

An interface to iDigBio's search API that allows downloading specimen records. Searches are returned as a data.frame. Other functions such as the metadata end points return lists of information. iDigBio is a US project focused on digitizing and serving museum specimen collections on the web. See <https://www.idigbio.org> for information on iDigBio.

Maintained by Jesse Bennett. Last updated 18 days ago.

16 stars 10.23 score 63 scripts 7 dependents

ropensci

spocc:Interface to Species Occurrence Data Sources

A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility ('GBIF'), 'iNaturalist', 'eBird', Integrated Digitized 'Biocollections' ('iDigBio'), 'VertNet', Ocean 'Biogeographic' Information System ('OBIS'), and Atlas of Living Australia ('ALA'). Includes functionality for retrieving species occurrence data, and combining those data.

Maintained by Hannah Owens. Last updated 2 months ago.

specimens api web-services occurrences species taxonomy gbif inat vertnet ebird idigbio obis ala antweb bison data ecoengine inaturalist occurrence species-occurrence spocc

118 stars 10.09 score 552 scripts 5 dependents

pecanproject

PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Istem Fer. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.96 score 20 scripts 2 dependents

bioc

OmnipathR:OmniPath web service client and more

A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).

Maintained by Denes Turei. Last updated 1 month ago.

graphandnetwork network pathways software thirdpartyclient dataimport datarepresentation genesignaling generegulation systemsbiology transcriptomics singlecell annotation kegg complexes enzyme-ptm networks networks-biology omnipath proteins quarto

130 stars 9.90 score 226 scripts 2 dependents

pecanproject

PEcAnRTM:PEcAn Functions Used for Radiative Transfer Modeling

Functions for performing forward runs and inversions of radiative transfer models (RTMs). Inversions can be performed using maximum likelihood, or more complex hierarchical Bayesian methods. Underlying numerical analyses are optimized for speed using Fortran code.

Maintained by Alexey Shiklomanov. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants fortran jags cpp

216 stars 9.70 score 132 scripts

olink-proteomics

OlinkAnalyze:Facilitate Analysis of Proteomic Data from Olink

A collection of functions to facilitate analysis of proteomic data from Olink, primarily NPX data that has been exported from Olink Software. The functions also work on QUANT data from Olink by log-transforming the QUANT data. The functions are focused on reading data, facilitating data wrangling and quality control analysis, performing statistical analysis and generating figures to visualize the results of the statistical analysis. The goal of this package is to help users extract biological insights from proteomic data run on the Olink platform.

Maintained by Kathleen Nevola. Last updated 1 month ago.

olink proteomics proteomics-data-analysis

107 stars 9.67 score 61 scripts

jamovi

jmv:The 'jamovi' Analyses

A suite of common statistical methods such as descriptives, t-tests, ANOVAs, regression, correlation matrices, proportion tests, contingency tables, and factor analysis. This package is also useable from the 'jamovi' statistical spreadsheet (see <https://www.jamovi.org> for more information).

Maintained by Jonathon Love. Last updated 27 days ago.

59 stars 9.58 score 440 scripts

immunomind

immunarch:Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires

A comprehensive framework for bioinformatics exploratory analysis of bulk and single-cell T-cell receptor and antibody repertoires. It provides seamless data loading, analysis and visualisation for AIRR (Adaptive Immune Receptor Repertoire) data, both bulk immunosequencing (RepSeq) and single-cell sequencing (scRNAseq). Immunarch implements most of the widely used AIRR analysis methods, such as: clonality analysis, estimation of repertoire similarities in distribution of clonotypes and gene segments, repertoire diversity analysis, annotation of clonotypes using external immune receptor databases and clonotype tracking in vaccination and cancer studies. A successor to our previously published 'tcR' immunoinformatics package (Nazarov 2015) <doi:10.1186/s12859-015-0613-1>.

Maintained by Vadim I. Nazarov. Last updated 1 year ago.

airr-analysis b-cell-receptor bcr bcr-repertoire bioinformatics ig ig-repertoire immune-repertoire immune-repertoire-analysis immune-repertoire-data immunoglobulin immunoinformatics immunology rep-seq repertoire-analysis single-cell single-cell-analysis t-cell-receptor tcr tcr-repertoire cpp

316 stars 9.49 score 203 scripts

john-d-fox

Rcmdr:R Commander

A platform-independent basic-statistics GUI (graphical user interface) for R, based on the tcltk package.

Maintained by John Fox. Last updated 5 months ago.

4 stars 9.48 score 636 scripts 38 dependents

cmmr

rbiom:Read/Write, Analyze, and Visualize 'BIOM' Data

A toolkit for working with Biological Observation Matrix ('BIOM') files. Read/write all 'BIOM' formats. Compute rarefaction, alpha diversity, and beta diversity (including 'UniFrac'). Summarize counts by taxonomic level. Subset based on metadata. Generate visualizations and statistical analyses. CPU intensive operations are coded in C for speed.

Maintained by Daniel P. Smith. Last updated 11 days ago.

15 stars 9.07 score 117 scripts 6 dependents

bioc

BatchQC:Batch Effects Quality Control Software

Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.

Maintained by Jessica Anderson. Last updated 11 days ago.

batcheffect graphandnetwork microarray normalization principalcomponent sequencing software visualization qualitycontrol rnaseq preprocessing differentialexpression immunooncology

7 stars 9.06 score 54 scripts

pecanproject

PEcAn.all:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.00 score 266 scripts

pecanproject

PEcAn.MAAT:PEcAn Package for Integration of the MAAT Model

This module provides functions to wrap the MAAT model into the PEcAn workflows.

Maintained by Shawn Serbin. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 8.96 score 12 scripts

pecanproject

PEcAn.BIOCRO:PEcAn Package for Integration of the BioCro Model

This module provides functions to link BioCro to PEcAn.

Maintained by David LeBauer. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.94 score 23 scripts

bluefoxr

COINr:Composite Indicator Construction and Analysis

A comprehensive high-level package, for composite indicator construction and analysis. It is a "development environment" for composite indicators and scoreboards, which includes utilities for construction (indicator selection, denomination, imputation, data treatment, normalisation, weighting and aggregation) and analysis (multivariate analysis, correlation plotting, short cuts for principal component analysis, global sensitivity analysis, and more). A composite indicator is completely encapsulated inside a single hierarchical list called a "coin". This allows a fast and efficient work flow, as well as making quick copies, testing methodological variations and making comparisons. It also includes many plotting options, both statistical (scatter plots, distribution plots) as well as for presenting results.

Maintained by William Becker. Last updated 2 months ago.

26 stars 8.94 score 73 scripts 1 dependent

pik-piam

remind2:The REMIND R package (2nd generation)

Contains the REMIND-specific routines for data and model output manipulation.

Maintained by Renato Rodrigues. Last updated 1 day ago.

8.87 score 161 scripts 5 dependents

mattcowgill

readabs:Download and Tidy Time Series Data from the Australian Bureau of Statistics

Downloads, imports, and tidies time series data from the Australian Bureau of Statistics <https://www.abs.gov.au/>.

Maintained by Matt Cowgill. Last updated 28 days ago.

abs australia australian-bureau-of-statistics australian-data statistics tidy-data time-series

104 stars 8.85 score 180 scripts

pecanproject

PEcAn.workflow:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides workhorse functions that can be used to run the major steps of a PEcAn analysis.

Maintained by David LeBauer. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.83 score 15 scripts 4 dependents

ropengov

regions:Processing Regional Statistics

Validating sub-national statistical typologies, re-coding across standard typologies of sub-national statistics, and making valid aggregate level imputation, re-aggregation, re-weighting and projection down to lower hierarchical levels to create meaningful data panels and time series.

Maintained by Daniel Antal. Last updated 3 years ago.

observatory regions ropengov statistics

12 stars 8.81 score 67 scripts 5 dependents

pecanproject

PEcAn.ED2:PEcAn Package for Integration of ED2 Model

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides functions to link the Ecosystem Demography Model, version 2, to PEcAn.

Maintained by Mike Dietze. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.74 score 145 scripts

jinseob2kim

jsmodule:'RStudio' Addins and 'Shiny' Modules for Medical Research

'RStudio' addins and 'Shiny' modules for descriptive statistics, regression and survival analysis.

Maintained by Jinseob Kim. Last updated 11 days ago.

medical rstudio-addins shiny shiny-modules statistics

21 stars 8.69 score 61 scripts

bioc

miaViz:Microbiome Analysis Plotting and Visualization

The miaViz package implements functions to visualize TreeSummarizedExperiment objects especially in the context of microbiome analysis. Part of the mia family of R/Bioconductor packages.

Maintained by Tuomas Borman. Last updated 10 days ago.

microbiome software visualization bioconductor microbiome-analysis plotting

10 stars 8.67 score 81 scripts 1 dependent

r-box

boxr:Interface for the 'Box.com API'

An R interface for the remote file hosting service 'Box' (<https://www.box.com/>). In addition to uploading and downloading files, this package includes functions which mirror base R operations for local files (e.g. box_load(), box_save(), box_read(), box_setwd(), etc.), as well as 'git' style functions for entire directories (e.g. box_fetch(), box_push()).

Maintained by Ian Lyttle. Last updated 12 months ago.

63 stars 8.65 score 238 scripts

bcgov

bcmaps:Map Layers and Spatial Utilities for British Columbia

Various layers of B.C., including administrative boundaries, natural resource management boundaries, census boundaries etc. All layers are available in BC Albers (<https://spatialreference.org/ref/epsg/3005/>) equal-area projection, which is the B.C. government standard. The layers are sourced from the British Columbia and Canadian government under open licenses, including B.C. Data Catalogue (<https://data.gov.bc.ca>), the Government of Canada Open Data Portal (<https://open.canada.ca/en/using-open-data>), and Statistics Canada (<https://www.statcan.gc.ca/en/reference/licence>).

Maintained by Andy Teucher. Last updated 3 months ago.

data-science env

73 stars 8.65 score 254 scripts

bioc

lefser:R implementation of the LEfSe method for microbiome biomarker discovery

lefser is the R implementation of the popular microbiome biomarker discovery tool, LEfSe. It uses the Kruskal-Wallis test, Wilcoxon rank-sum test, and Linear Discriminant Analysis to find biomarkers from two-level classes (and optional sub-classes).

Maintained by Sehyun Oh. Last updated 1 month ago.

software sequencing differentialexpression microbiome statisticalmethod classification bioconductor-package r01ca230551

56 stars 8.44 score 56 scripts

ropensci

eph:Argentina's Permanent Household Survey Data and Manipulation Utilities

Tools to download and manipulate the Permanent Household Survey from Argentina (EPH is the Spanish acronym for Permanent Household Survey). For example, get_microdata() downloads the datasets, get_poverty_lines() downloads the official poverty baskets, and calculate_poverty() determines whether a household is in poverty, following the official methodology. organize_panels() is used to concatenate observations from different periods, and organize_labels() adds the official labels to the data. The implemented methods are based on INDEC (2016) <http://www.estadistica.ec.gba.gov.ar/dpe/images/SOCIEDAD/EPH_metodologia_22_pobreza.pdf>. As this package works with the Argentinian Permanent Household Survey and its main audience is from this country, the documentation was written in Spanish.

Maintained by Carolina Pradier. Last updated 8 months ago.

eph indec mercado-de-trabajo rstatses

59 stars 8.38 score 255 scripts

nlmixr2

nlmixr2:Nonlinear Mixed Effects Models in Population PK/PD

Fit and compare nonlinear mixed-effects models in differential equations with flexible dosing information commonly seen in pharmacokinetics and pharmacodynamics (Almquist, Leander, and Jirstrand 2015 <doi:10.1007/s10928-015-9409-1>). Differential equation solving is by compiled C code provided in the 'rxode2' package (Wang, Hallow, and James 2015 <doi:10.1002/psp4.12052>).

Maintained by Matthew Fidler. Last updated 1 month ago.

52 stars 8.38 score 120 scripts 3 dependents

pecanproject

PEcAn.SIPNET:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Mike Dietze. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.36 score 61 scripts

chuhousen

amerifluxr:Interface to 'AmeriFlux' Data Services

Programmatic interface to the 'AmeriFlux' database (<https://ameriflux.lbl.gov/>). Provide query, download, and data summary tools.

Maintained by Housen Chu. Last updated 3 months ago.

ameriflux api carbon-flux data time-series

22 stars 8.36 score 29 scripts 15 dependents

wallaceecomod

wallace:A Modular Platform for Reproducible Modeling of Species Niches and Distributions

The 'shiny' application Wallace is a modular platform for reproducible modeling of species niches and distributions. Wallace guides users through a complete analysis, from the acquisition of species occurrence and environmental data to visualizing model predictions on an interactive map, thus bundling complex workflows into a single, streamlined interface. An extensive vignette, which guides users through most package functionality, can be found on the package's GitHub Pages website: <https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html>.

Maintained by Mary E. Blair. Last updated 22 days ago.

openjdk

133 stars 8.36 score 96 scripts

dbosak01

libr:Libraries, Data Dictionaries, and a Data Step for R

Contains a set of functions to create data libraries, generate data dictionaries, and simulate a data step. The libname() function will load a directory of data into a library in one line of code. The dictionary() function will generate data dictionaries for individual data frames or an entire library. And the datastep() function will perform row-by-row data processing.

Maintained by David Bosak. Last updated 3 months ago.

cpp

27 stars 8.27 score 48 scripts 2 dependents

pik-piam

quitte:Bits and pieces of code to use with quitte-style data frames

A collection of functions for easily dealing with quitte-style data frames, doing multi-model comparisons and plots.

Maintained by Falk Benke. Last updated 4 days ago.

8.26 score 184 scripts 35 dependents

radiant-rstats

radiant.data:Data Menu for Radiant: Business Analytics using R and Shiny

The Radiant Data menu includes interfaces for loading, saving, viewing, visualizing, summarizing, transforming, and combining data. It also contains functionality to generate reproducible reports of the analyses conducted in the application.

Maintained by Vincent Nijs. Last updated 5 months ago.

53 stars 8.25 score 146 scripts 6 dependents

safetygraphics

safetyGraphics:Interactive Graphics for Monitoring Clinical Trial Safety

A framework for evaluation of clinical trial safety. Users can interactively explore their data using the included 'Shiny' application.

Maintained by Jeremy Wildfire. Last updated 2 years ago.

99 stars 8.19 score 111 scripts

pecanproject

PEcAnAssimSequential:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Mike Dietze. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.12 score 35 scripts

pik-piam

mip:Comparison of multi-model runs

Package contains generic functions to produce comparison plots of multi-model runs.

Maintained by David Klein. Last updated 4 days ago.

1 stars 8.07 score 70 scripts 21 dependents

radiant-rstats

radiant:Business Analytics using R and Shiny

A platform-independent browser-based interface for business analytics in R, based on the shiny package. The application combines the functionality of 'radiant.data', 'radiant.design', 'radiant.basics', 'radiant.model', and 'radiant.multivariate'.

Maintained by Vincent Nijs. Last updated 10 months ago.

460 stars 8.02 score 228 scripts

salvatoremangiafico

rcompanion:Functions to Support Extension Education Program Evaluation

Functions and datasets to support Summary and Analysis of Extension Program Evaluation in R, and An R Companion for the Handbook of Biological Statistics. Vignettes are available at <https://rcompanion.org>.

Maintained by Salvatore Mangiafico. Last updated 1 months ago.

4 stars 8.01 score 2.4k scripts 5 dependents

epiverse-trace

sivirep:Data Wrangling and Automated Reports from 'SIVIGILA' Source

Data wrangling, pre-processing, and generating automated reports from Colombia's epidemiological surveillance system, 'SIVIGILA' <https://portalsivigila.ins.gov.co/>. It provides a customizable R Markdown template for analysis and automatic generation of epidemiological reports that can be adapted to local, regional, and national contexts. This tool offers a standardized and reproducible workflow that helps to reduce manual labor and potential errors in report generation, improving their efficiency and consistency.

Maintained by Geraldine Gómez-Millán. Last updated 1 months ago.

colombia epidemiological-surveillance epiverse public-health

42 stars 8.00 score 21 scripts

atorus-research

metacore:A Centralized Metadata Object Focus on Clinical Trial Data Programming Workflows

Create an immutable container holding metadata for the purpose of better enabling programming activities and functionality of other packages within the clinical programming workflow.

Maintained by Christina Fillmore. Last updated 11 months ago.

35 stars 7.99 score 133 scripts 1 dependents

pharmaverse

tidytlg:Create TLGs using the 'tidyverse'

Generate tables, listings, and graphs (TLG) using 'tidyverse.' Tables can be created functionally, using a standard TLG process, or by specifying table and column metadata to create generic analysis summaries. The 'envsetup' package can also be leveraged to create environments for table creation.

Maintained by Konrad Pagacz. Last updated 9 months ago.

33 stars 7.96 score 22 scripts

john-harrold

formods:'Shiny' Modules for General Tasks

'Shiny' apps can often make use of the same key elements. This package provides modules for common tasks (data upload, wrangling data, figure generation and saving the app state), and also a framework for developing. These modules can react and interact as well as generate code to create reproducible analyses.

Maintained by John Harrold. Last updated 19 days ago.

8 stars 7.94 score 100 scripts 1 dependents

pik-piam

magpie4:MAgPIE outputs R package for MAgPIE version 4.x

Common output routines for extracting results from the MAgPIE framework (versions 4.x).

Maintained by Benjamin Leon Bodirsky. Last updated 1 days ago.

2 stars 7.89 score 254 scripts 9 dependents

psychbruce

bruceR:Broadly Useful Convenient and Efficient R Functions

Broadly useful convenient and efficient R functions that bring users concise and elegant R data analyses. This package includes easy-to-use functions for (1) basic R programming (e.g., set working directory to the path of currently opened file; import/export data from/to files in any format; print tables to Microsoft Word); (2) multivariate computation (e.g., compute scale sums/means/... with reverse scoring); (3) reliability analyses and factor analyses; (4) descriptive statistics and correlation analyses; (5) t-test, multi-factor analysis of variance (ANOVA), simple-effect analysis, and post-hoc multiple comparison; (6) tidy report of statistical models (to R Console and Microsoft Word); (7) mediation and moderation analyses (PROCESS); and (8) additional toolbox for statistics and graphics.

Maintained by Han-Wu-Shuang Bao. Last updated 10 months ago.

anova data-analysis data-science linear-models linear-regression multilevel-models statistics toolbox

176 stars 7.87 score 316 scripts 3 dependents

dbosak01

sassy:Makes 'R' Easier for Everyone

A meta-package that aims to make 'R' easier for everyone, especially programmers who have a background in 'SAS®' software. This set of packages brings many useful concepts to 'R', including data libraries, data dictionaries, formats and format catalogs, a data step, and a traceable log. The 'flagship' package is a reporting package that can output in text, rich text, 'PDF', 'HTML', and 'DOCX' file formats.

Maintained by David Bosak. Last updated 7 days ago.

21 stars 7.87 score 92 scripts

american-institutes-for-research

EdSurvey:Analysis of NCES Education Survey and Assessment Data

Read in and analyze functions for education survey and assessment data from the National Center for Education Statistics (NCES) <https://nces.ed.gov/>, including National Assessment of Educational Progress (NAEP) data <https://nces.ed.gov/nationsreportcard/> and data from the International Assessment Database: Organisation for Economic Co-operation and Development (OECD) <https://www.oecd.org/en/about/directorates/directorate-for-education-and-skills.html>, including Programme for International Student Assessment (PISA), Teaching and Learning International Survey (TALIS), Programme for the International Assessment of Adult Competencies (PIAAC), and International Association for the Evaluation of Educational Achievement (IEA) <https://www.iea.nl/>, including Trends in International Mathematics and Science Study (TIMSS), TIMSS Advanced, Progress in International Reading Literacy Study (PIRLS), International Civic and Citizenship Study (ICCS), International Computer and Information Literacy Study (ICILS), and Civic Education Study (CivEd).

Maintained by Paul Bailey. Last updated 29 days ago.

10 stars 7.86 score 139 scripts 1 dependents

tbep-tech

tbeptools:Data and Indicators for the Tampa Bay Estuary Program

Several functions are provided for working with Tampa Bay Estuary Program data and indicators, including the water quality report card, tidal creek assessments, Tampa Bay Nekton Index, Tampa Bay Benthic Index, seagrass transect data, habitat report card, and fecal indicator bacteria. Additional functions are provided for miscellaneous tasks, such as reference library curation.

Maintained by Marcus Beck. Last updated 1 days ago.

data-analysis tampa-bay tbep water-quality

10 stars 7.86 score 133 scripts

joachim-gassen

ExPanDaR:Explore Your Data Interactively

Provides a shiny-based front end (the 'ExPanD' app) and a set of functions for exploratory data analysis. Run as a web-based app, 'ExPanD' enables users to assess the robustness of empirical evidence without providing them access to the underlying data. You can export a notebook containing the analysis of 'ExPanD' and/or use the functions of the package to support your exploratory data analysis workflow. Refer to the vignettes of the package for more information on how to use 'ExPanD' and/or the functions of this package.

Maintained by Joachim Gassen. Last updated 4 years ago.

accounting eda exploratory-data-analysis finance open-science replication shiny shiny-apps

156 stars 7.80 score 203 scripts

novartis

xgxr:Exploratory Graphics for Pharmacometrics

Supports a structured approach for exploring PKPD data <https://opensource.nibr.com/xgx/>. It also contains helper functions for enabling the modeler to follow best R practices (by appending the program name, figure name location, and draft status to each plot). In addition, it enables the modeler to follow best graphical practices (by providing a theme that reduces chart ink, and by providing time-scale, log-scale, and reverse-log-transform-scale functions for more readable axes). Finally, it provides some data checking and summarizing functions for rapidly exploring pharmacokinetics and pharmacodynamics (PKPD) datasets.

Maintained by Andrew Stein. Last updated 1 years ago.

13 stars 7.76 score 105 scripts 5 dependents

mrc-ide

naomi:Naomi Model for Subnational HIV Estimates

This package implements the Naomi model for subnational HIV estimates.

Maintained by Jeff Eaton. Last updated 19 days ago.

cpp

9 stars 7.74 score 54 scripts 2 dependents

valeriapolicastro

robin:ROBustness in Network

Assesses the robustness of the community structure of a network found by one or more community detection algorithms to give indications about their reliability. It detects if the community structure found by a set of algorithms is statistically significant and compares the different selected detection algorithms on the same network. robin helps to choose among different community detection algorithms the one that better fits the network of interest. Reference in Policastro V., Righelli D., Carissimo A., Cutillo L., De Feis I. (2021) <https://journal.r-project.org/archive/2021/RJ-2021-040/index.html>.

Maintained by Valeria Policastro. Last updated 8 days ago.

19 stars 7.72 score 8 scripts

somalogic

SomaDataIO:Input/Output 'SomaScan' Data

Load and export 'SomaScan' data via the 'Standard BioTools, Inc.' structured text file called an ADAT ('*.adat'). For file format see <https://github.com/SomaLogic/SomaLogic-Data/blob/main/README.md>. The package also exports auxiliary functions for manipulating, wrangling, and extracting relevant information from an ADAT object once in memory.

Maintained by Caleb Scheidel. Last updated 2 months ago.

adat proteomics proteomics-data-analysis somascan

26 stars 7.71 score 132 scripts

proteomicslab57357

UniprotR:Retrieving Information of Proteins from Uniprot

Connect to Uniprot <https://www.uniprot.org/> to retrieve information about proteins using their accession numbers; such information could be name or taxonomy information. For detailed information, kindly read the publication <https://www.sciencedirect.com/science/article/pii/S1874391919303859>.

Maintained by Mohamed Soudy. Last updated 3 years ago.

61 stars 7.65 score 89 scripts 1 dependents

bioc

AlpsNMR:Automated spectraL Processing System for NMR

Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.

Maintained by Sergio Oller Moreno. Last updated 5 months ago.

software preprocessing visualization classification cheminformatics metabolomics dataimport

15 stars 7.59 score 12 scripts 1 dependents

mboeck11

BGVAR:Bayesian Global Vector Autoregressions

Estimation of Bayesian Global Vector Autoregressions (BGVAR) with different prior setups and the possibility to introduce stochastic volatility. Built-in priors include the Minnesota, the stochastic search variable selection and Normal-Gamma (NG) prior. For a reference see also Crespo Cuaresma, J., Feldkircher, M. and F. Huber (2016) "Forecasting with Global Vector Autoregressive Models: a Bayesian Approach", Journal of Applied Econometrics, Vol. 31(7), pp. 1371-1391 <doi:10.1002/jae.2504>. Post-processing functions allow for doing predictions, structurally identify the model with short-run or sign-restrictions and compute impulse response functions, historical decompositions and forecast error variance decompositions. Plotting functions are also available. The package has a companion paper: Boeck, M., Feldkircher, M. and F. Huber (2022) "BGVAR: Bayesian Global Vector Autoregressions with Shrinkage Priors in R", Journal of Statistical Software, Vol. 104(9), pp. 1-28 <doi:10.18637/jss.v104.i09>.

Maintained by Maximilian Boeck. Last updated 4 months ago.

openblas cpp

27 stars 7.58 score 156 scripts

pecanproject

PEcAn.LDNDC:PEcAn package for integration of the LDNDC model

This module provides functions to link the LDNDC model to PEcAn.

Maintained by Henri Kajasilta. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 7.58 score

pecanproject

PEcAn.PRELES:PEcAn Package for Integration of the PRELES Model

This module provides functions to run the PREdict Light use efficiency Evapotranspiration and Soil moisture (PRELES) model on the PEcAn project. The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Tony Gardella. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 7.58 score 4 scripts

pecanproject

PEcAn.MAESPA:PEcAn Functions Used for Ecological Forecasts and Reanalysis using MAESPA

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package allows for MAESPA to be run through the PEcAn workflow.

Maintained by Tony Gardella. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 7.58 score 2 scripts

pecanproject

PEcAn.JULES:PEcAn Package for Integration of the JULES Model

This module provides functions to link the JULES model to PEcAn.

Maintained by Mike Dietze. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 7.58 score

pecanproject

PEcAn.BASGRA:PEcAn Package for Integration of the BASGRA Model

This module provides functions to link the BASGRA model to PEcAn.

Maintained by Istem Fer. Last updated 3 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants fortran glibc

216 stars 7.58 score 1 scripts

sharlagelfand

opendatatoronto:Access the City of Toronto Open Data Portal

Access data from the "City of Toronto Open Data Portal" (<https://open.toronto.ca>) directly from R.

Maintained by Sharla Gelfand. Last updated 3 years ago.

63 stars 7.49 score 486 scripts

bioc

MGnifyR:R interface to EBI MGnify metagenomics resource

Utility package to facilitate integration and analysis of EBI MGnify data in R. The package can be used to import microbial data for instance into TreeSummarizedExperiment (TreeSE). In TreeSE format, the data is directly compatible with miaverse framework.

Maintained by Tuomas Borman. Last updated 2 days ago.

infrastructure dataimport metagenomics microbiome microbiomedata

21 stars 7.48 score 32 scripts

cardiomoon

editData:'RStudio' Addin for Editing a 'data.frame'

An 'RStudio' addin for editing a 'data.frame' or a 'tibble'. You can delete, add or update a 'data.frame' without coding. You can get resultant data as a 'data.frame'. In the package, modularized 'shiny' app codes are provided. These modules are intended for reuse across applications.

Maintained by Keon-Woong Moon. Last updated 4 years ago.

32 stars 7.45 score 63 scripts 5 dependents

bioc

SpatialDecon:Deconvolution of mixed cells from spatial and/or bulk gene expression data

Using spatial or bulk gene expression data, estimates abundance of mixed cell types within each observation. Based on "Advances in mixed cell deconvolution enable quantification of cell types in spatial transcriptomic data", Danaher (2022). Designed for use with the NanoString GeoMx platform, but applicable to any gene expression data.

Maintained by Maddy Griswold. Last updated 5 months ago.

immunooncology featureextraction geneexpression transcriptomics spatial

37 stars 7.41 score 58 scripts

thibautjombart

treespace:Statistical Exploration of Landscapes of Phylogenetic Trees

Tools for the exploration of distributions of phylogenetic trees. This package includes a 'shiny' interface which can be started from R using treespaceServer(). For further details see Jombart et al. (2017) <DOI:10.1111/1755-0998.12676>.

Maintained by Michelle Kendall. Last updated 2 years ago.

cpp

28 stars 7.39 score 63 scripts

eltebioinformatics

mulea:Enrichment Analysis Using Multiple Ontologies and False Discovery Rate

Background - Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. Results - mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. Conclusions - mulea is distributed as a CRAN R package. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.

Maintained by Tamas Stirling. Last updated 4 months ago.

annotation differentialexpression geneexpression genesetenrichment go graphandnetwork multiplecomparison pathways reactome software transcription visualization enrichment enrichment-analysis functional-enrichment-analysis gene-set-enrichment ontologies transcriptomics cpp

28 stars 7.36 score 34 scripts

doi-usgs

toxEval:Exploring Biological Relevance of Environmental Chemistry Observations

Data analysis package for estimating potential biological effects from chemical concentrations in environmental samples. Included are a set of functions to analyze, visualize, and organize measured concentration data as it relates to user-selected chemical-biological interaction benchmark data such as water quality criteria. The intent of these analyses is to develop a better understanding of the potential biological relevance of environmental chemistry data. Results can be used to prioritize which chemicals at which sites may be of greatest concern. These methods are meant to be used as a screening technique to predict potential for biological influence from chemicals that ultimately need to be validated with direct biological assays. A description of the analysis can be found in Blackwell (2017) <doi:10.1021/acs.est.7b01613>.

Maintained by Laura DeCicco. Last updated 4 months ago.

toxicity water-quality

21 stars 7.34 score 58 scripts

bioc

gDRimport:Package for handling the import of dose-response data

The package is a part of the gDR suite. It helps to prepare raw drug response data for downstream processing. It mainly contains helper functions for importing/loading/validating dose-response data provided in different file formats.

Maintained by Arkadiusz Gladki. Last updated 2 days ago.

software infrastructure dataimport

3 stars 7.32 score 5 scripts 1 dependents

ibecav

CGPfunctions:Powell Miscellaneous Functions for Teaching and Learning Statistics

Miscellaneous functions useful for teaching statistics as well as actually practicing the art. They typically are not new methods but rather wrappers around either base R or other packages.

Maintained by Chuck Powell. Last updated 4 years ago.

27 stars 7.28 score 122 scripts

bioc

CRISPRseek:Design of guide RNAs in CRISPR genome-editing systems

The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with the detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.

Maintained by Lihua Julie Zhu. Last updated 19 days ago.

immunooncology generegulation sequencematching crispr

7.18 score 51 scripts 2 dependents

john-harrold

ubiquity:PKPD, PBPK, and Systems Pharmacology Modeling Tools

Complete work flow for the analysis of pharmacokinetic pharmacodynamic (PKPD), physiologically-based pharmacokinetic (PBPK) and systems pharmacology models including: creation of ordinary differential equation-based models, pooled parameter estimation, individual/population based simulations, rule-based simulations for clinical trial design and modeling assays, deployment with a customizable 'Shiny' app, and non-compartmental analysis. System-specific analysis templates can be generated and each element includes integrated reporting with 'PowerPoint' and 'Word'.

Maintained by John Harrold. Last updated 8 days ago.

modeling pkpd

13 stars 7.14 score 33 scripts

roelandkindt

BiodiversityR:Package for Community Ecology and Suitability Analysis

Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.

Maintained by Roeland Kindt. Last updated 2 months ago.

17 stars 7.13 score 390 scripts 2 dependents

bioc

GeomxTools:NanoString GeoMx Tools

Tools for NanoString Technologies GeoMx Technology. Package provides functions for reading in DCC and PKC files based on an ExpressionSet derived object. Normalization and QC functions are also included.

Maintained by Maddy Griswold. Last updated 5 months ago.

geneexpression transcription cellbasedassays dataimport transcriptomics proteomics mrnamicroarray proprietaryplatforms rnaseq sequencing experimentaldesign normalization spatial

7.11 score 239 scripts 3 dependents

farrellday

miceRanger:Multiple Imputation by Chained Equations with Random Forests

Multiple Imputation has been shown to be a flexible method to impute missing values by Van Buuren (2007) <doi:10.1177/0962280206074463>. Expanding on this, random forests have been shown to be an accurate model by Stekhoven and Buhlmann <arXiv:1105.0828> to impute missing values in datasets. They have the added benefits of returning out of bag error and variable importance estimates, as well as being simple to run in parallel.

Maintained by Sam Wilson. Last updated 3 years ago.

imputation-methods machine-learning mice missing-data missing-values random-forests

67 stars 7.09 score 41 scripts 1 dependents

john-harrold

ruminate:A Pharmacometrics Data Transformation and Analysis Tool

Exploration of pharmacometrics data involves both general tools (transformation and plotting) and specific techniques (non-compartmental analysis). This kind of exploration is generally accomplished by utilizing different packages. The purpose of 'ruminate' is to create a 'shiny' interface to make these tools more broadly available while creating reproducible results.

Maintained by John Harrold. Last updated 19 days ago.

2 stars 7.06 score 84 scripts

john-d-fox

RcmdrMisc:R Commander Miscellaneous Functions

Various statistical, graphics, and data-management functions used by the Rcmdr package in the R Commander GUI for R.

Maintained by John Fox. Last updated 2 years ago.

1 stars 7.02 score 432 scripts 42 dependents

bioc

musicatk:Mutational Signature Comprehensive Analysis Toolkit

Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic features associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.

Maintained by Joshua D. Campbell. Last updated 5 months ago.

software biologicalquestion somaticmutation variantannotation

13 stars 6.97 score 20 scripts

cmerow

rangeModelMetadata:Provides Templates for Metadata Files Associated with Species Range Models

Range Modeling Metadata Standards (RMMS) address three challenges: they (i) are designed for convenience to encourage use, (ii) accommodate a wide variety of applications, and (iii) are extensible to allow the community of range modelers to steer it as needed. RMMS are based on a data dictionary that specifies a hierarchical structure to catalog different aspects of the range modeling process. The dictionary balances a constrained, minimalist vocabulary to improve standardization with flexibility for users to provide their own values. Merow et al. (2019) <DOI:10.1111/geb.12993> describe the standards in more detail. Note that users who prefer to use the R package 'ecospat' can obtain it from <https://github.com/ecospat/ecospat>.

Maintained by Cory Merow. Last updated 8 months ago.

ecological-metadata-language ecological-modelling ecological-models ecology species-distribution-modelling species-distributions

6 stars 6.96 score 16 scripts 3 dependents

statisticsgreenland

pxmake:Make PX-Files in R

Create PX-files from scratch or read and modify existing ones. Includes a function for every PX keyword, making metadata manipulation simple and human-readable.

Maintained by Johan Ejstrud. Last updated 23 hours ago.

9 stars 6.95 score 11 scripts

danlwarren

ENMTools:Analysis of Niche Evolution using Niche and Distribution Models

Constructing niche models and analyzing patterns of niche evolution. Acts as an interface for many popular modeling algorithms, and allows users to conduct Monte Carlo tests to address basic questions in evolutionary ecology and biogeography. Warren, D.L., R.E. Glor, and M. Turelli (2008) <doi:10.1111/j.1558-5646.2008.00482.x> Glor, R.E., and D.L. Warren (2011) <doi:10.1111/j.1558-5646.2010.01177.x> Warren, D.L., R.E. Glor, and M. Turelli (2010) <doi:10.1111/j.1600-0587.2009.06142.x> Cardillo, M., and D.L. Warren (2016) <doi:10.1111/geb.12455> D.L. Warren, L.J. Beaumont, R. Dinnage, and J.B. Baumgartner (2019) <doi:10.1111/ecog.03900>.

Maintained by Dan Warren. Last updated 3 months ago.

105 stars 6.91 score 126 scripts

pik-piam

edgeTransport:Prepare EDGE Transport Data for the REMIND model

EDGE-T is a fork of the GCAM transport module https://jgcri.github.io/gcam-doc/energy.html#transportation with a high level of detail in its representation of technological and modal options. It is a partial equilibrium model with a nested multinomial logit structure and relies on the modified logit formulation. Most of the sources are not publicly available. PIK-internal users can find the sources in the distributed file system in the folder `/p/projects/rd3mod/inputdata/sources/EDGE-Transport-Standalone`.

Maintained by Johanna Hoppe. Last updated 5 hours ago.

5 stars 6.84 score 16 scripts 2 dependents

raymondbalise

rUM:R Templates from the University of Miami

Provides R Markdown and Quarto templates, as well as a template for creating a research project in "R Studio".

Maintained by Raymond Balise. Last updated 8 days ago.

rmarkdown

9 stars 6.84 score 16 scripts

stla

qspray:Multivariate Polynomials with Rational Coefficients

Symbolic calculation and evaluation of multivariate polynomials with rational coefficients. This package is strongly inspired by the 'spray' package. It provides a function to compute Gröbner bases (reference <doi:10.1007/978-3-319-16721-3>). It also includes some features for symmetric polynomials, such as the Hall inner product. The header file of the C++ code can be used by other packages. It provides the templated class 'Qspray' that can be used to represent and to deal with multivariate polynomials with another type of coefficients.

Maintained by Stéphane Laurent. Last updated 7 months ago.

gmp polynomials cpp

4 stars 6.81 score 152 scripts 5 dependents

cardiomoon

webr:Data and Functions for Web-Based Analysis

Several analysis-related functions for the book entitled "Web-based Analysis without R in Your Computer" (written in Korean, ISBN 978-89-5566-185-9) by Keon-Woong Moon. The main function plot.htest() shows the distribution of the statistic for an object of class 'htest'.

Maintained by Keon-Woong Moon. Last updated 5 years ago.

33 stars 6.80 score 181 scripts

tom-wolff

ideanet:Integrating Data Exchange and Analysis for Networks ('ideanet')

A suite of convenient tools for social network analysis geared toward students, entry-level users, and non-expert practitioners. ‘ideanet’ features unique functions for the processing and measurement of sociocentric and egocentric network data. These functions automatically generate node- and system-level measures commonly used in the analysis of these types of networks. Outputs from these functions maximize the ability of novice users to employ network measurements in further analyses while making all users less prone to common data analytic errors. Additionally, ‘ideanet’ features an R Shiny graphic user interface that allows novices to explore network data with minimal need for coding.

Maintained by Tom Wolff. Last updated 16 days ago.

6 stars 6.80 score 10 scripts

wadpac

GGIRread:Wearable Accelerometer Data File Readers

Reads data collected from wearable accelerometers as used in sleep and physical activity research. Currently supports file formats: binary data from 'GENEActiv' <https://activinsights.com/>, .bin-format from GENEA devices (not for sale), and .cwa-format from 'Axivity' <https://axivity.com>. Further, it has functions for reading text files with epoch level aggregates from 'Actical', 'Fitbit', 'Actiwatch', 'ActiGraph', and 'PhilipsHealthBand'. Primarily designed to complement R package GGIR <https://CRAN.R-project.org/package=GGIR>.

Maintained by Vincent T van Hees. Last updated 1 day ago.

cpp

5 stars 6.78 score 33 scripts 5 dependents

hegghammer

daiR:Interface with Google Cloud Document AI API

R interface for the Google Cloud Services 'Document AI API' <https://cloud.google.com/document-ai/> with additional tools for output file parsing and text reconstruction. 'Document AI' is a powerful server-based OCR service that extracts text and tables from images and PDF files with high accuracy. 'daiR' gives R users programmatic access to this service and additional tools to handle and visualize the output. See the package website <https://dair.info/> for more information and examples.

Maintained by Thomas Hegghammer. Last updated 5 months ago.

google-cloud ocr

42 stars 6.77 score 40 scripts

harrison4192

autostats:Auto Stats

Automatically do statistical exploration. Create formulas using 'tidyselect' syntax, and then determine cross-validated model accuracy and variable contributions using 'glm' and 'xgboost'. Contains additional helper functions to create and modify formulas. Has a flagship function to quickly determine relationships between categorical and continuous variables in the data set.

Maintained by Harrison Tietze. Last updated 25 days ago.

6 stars 6.76 score 5 scripts 2 dependents

richardhooijmaijers

shinyMixR:Interactive 'shiny' Dashboard for 'nlmixr2'

An R shiny user interface for the 'nlmixr2' (Fidler et al (2019) <doi:10.1002/psp4.12445>) package, designed to simplify the modeling process for users. Additionally, this package includes supplementary functions to further enhance the usage of 'nlmixr2'.

Maintained by Richard Hooijmaijers. Last updated 5 months ago.

11 stars 6.74 score 28 scripts

imbi-heidelberg

DescrTab2:Publication Quality Descriptive Statistics Tables

Provides functions to create descriptive statistics tables for continuous and categorical variables. By default, summary statistics such as mean, standard deviation, quantiles, minimum and maximum for continuous variables and relative and absolute frequencies for categorical variables are calculated. 'DescrTab2' features a sophisticated algorithm to choose appropriate test statistics for your data and provides p-values. On top of this, confidence intervals for group differences of appropriate summary measures are automatically produced for two-group comparisons. Tables generated by 'DescrTab2' can be integrated in a variety of document formats, including .html, .tex and .docx documents. 'DescrTab2' also allows printing tables to console and saving table objects for later use.

Maintained by Jan Meis. Last updated 1 year ago.

categorical-variables continuous-variable descriptive-statistics p-values statistical-tests statistics

9 stars 6.71 score 19 scripts 1 dependent

pik-piam

mrcommons:MadRat commons Input Data Library

Provides useful functions and a common structure for all the input data required to run models like MAgPIE and REMIND.

Maintained by Jan Philipp Dietrich. Last updated 1 day ago.

1 star 6.70 score 16 scripts 15 dependents

harrison4192

presenter:Present Data with Style

Consists of custom wrapper functions using packages 'openxlsx', 'flextable', and 'officer' to create highly formatted MS office friendly output of your data frames. These viewer friendly outputs are intended to match expectations of professional looking presentations in business and consulting scenarios. The functions are opinionated in the sense that they expect the input data frame to have certain properties in order to take advantage of the automated formatting.

Maintained by Harrison Tietze. Last updated 2 years ago.

excel powerpoint

11 stars 6.69 score 15 scripts 4 dependents

tetratech

baytrends:Long Term Water Quality Trend Analysis

Enable users to evaluate long-term trends using a Generalized Additive Modeling (GAM) approach. The model development includes selecting a GAM structure to describe nonlinear seasonally-varying changes over time, incorporation of hydrologic variability via either a river flow or salinity, the use of an intervention to deal with method or laboratory changes suspected to impact data values, and representation of left- and interval-censored data. The approach has been applied to water quality data in the Chesapeake Bay, a major estuary on the east coast of the United States to provide insights to a range of management- and research-focused questions. Methodology described in Murphy (2019) <doi:10.1016/j.envsoft.2019.03.027>.

Maintained by Erik W Leppo. Last updated 5 months ago.

12 stars 6.67 score 97 scripts

briencj

growthPheno:Functional Analysis of Phenotypic Growth Data to Smooth and Extract Traits

Assists in the plotting and functional smoothing of traits measured over time and the extraction of features from these traits, implementing the SET (Smoothing and Extraction of Traits) method described in Brien et al. (2020) Plant Methods, 16. Smoothing of growth trends for individual plants using natural cubic smoothing splines or P-splines is available for removing transient effects and segmented smoothing is available to deal with discontinuities in growth trends. There are graphical tools for assessing the adequacy of trait smoothing, both when using this and other packages, such as those that fit nonlinear growth models. A range of per-unit (plant, pot, plot) growth traits or features can be extracted from the data, including single time points, interval growth rates and other growth statistics, such as maximum growth or days to maximum growth. The package also has tools adapted to inputting data from high-throughput phenotyping facilities, such as from a Lemna-Tec Scananalyzer 3D (see <https://www.youtube.com/watch?v=MRAF_mAEa7E/> for more information). The package 'growthPheno' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 13 days ago.

6 stars 6.66 score 42 scripts

pik-piam

piamInterfaces:Project specific interfaces to REMIND / MAgPIE

Project specific interfaces to REMIND / MAgPIE.

Maintained by Falk Benke. Last updated 5 days ago.

6.64 score 38 scripts 7 dependents

mattcowgill

readrba:Download and Tidy Data from the Reserve Bank of Australia

Download up-to-date data from the Reserve Bank of Australia in a tidy data frame. Package includes functions to download current and historical statistical tables (<https://www.rba.gov.au/statistics/tables/>) and forecasts (<https://www.rba.gov.au/publications/smp/forecasts-archive.html>). Data includes a broad range of Australian macroeconomic and financial time series.

Maintained by Matt Cowgill. Last updated 4 months ago.

27 stars 6.62 score 26 scripts

theomargel

ProtE:Processing Proteomics Data, Statistical Analysis and Visualization

The 'Proteomics Eye' ('ProtE') offers a comprehensive and intuitive framework for the univariate analysis of label-free proteomics data. By integrating essential data wrangling and processing steps into a single function, 'ProtE' streamlines pairwise statistical comparisons for categorical variables. It provides quality checks and generates publication-ready visualizations, enabling efficient and robust data analysis. 'ProtE' is compatible with proteomics data outputs from 'MaxQuant' (Cox & Mann, (2008) <doi:10.1038/nbt.1511>), 'DIA-NN' (Demichev et al., (2020) <doi:10.1038/s41592-019-0638-x>), and 'Proteome Discoverer' (Thermo Fisher Scientific, version 2.5). The package leverages 'ggplot2' for visualization (Wickham, (2016) <doi:10.1007/978-3-319-24277-4>) and 'limma' for statistical analysis (Ritchie et al., (2015) <doi:10.1093/nar/gkv007>).

Maintained by Theodoros Margelos. Last updated 10 days ago.

6.61 score 2 scripts

mini-pw

PvSTATEM:Reading, Quality Control and Preprocessing of MBA (Multiplex Bead Assay) Data

Speeds up the process of loading raw data from MBA (Multiplex Bead Assay) examinations, performs quality control checks, and automatically normalises the data, preparing it for more advanced, downstream tasks. The main objective of the package is to create a simple environment for a user who does not necessarily have experience with the R language. The package is developed within the project of the same name - 'PvSTATEM', which is an international project aiming for malaria elimination.

Maintained by Tymoteusz Kwiecinski. Last updated 30 days ago.

3 stars 6.56 score 7 scripts

rmi-pacta

pacta.multi.loanbook:Run 'PACTA' on Multiple Loan Books Easily

Run Paris Agreement Capital Transition Assessment ('PACTA') analyses on multiple loan books in a structured way. Provides access to standard 'PACTA' metrics and additional 'PACTA'-related metrics for multiple loan books. Results take the form of 'csv' files and plots and are exported to user-specified project paths.

Maintained by Jacob Kastl. Last updated 15 days ago.

climate-change pacta pactaverse sustainable-finance

6.48 score 4 scripts

jazznbass

scan:Single-Case Data Analyses for Single and Multiple Baseline Designs

A collection of procedures for analysing, visualising, and managing single-case data. These include piecewise linear regression models, multilevel models, overlap indices ('PND', 'PEM', 'PAND', 'PET', 'tau-u', 'baseline corrected tau', 'CDC'), and randomization tests. Data preparation functions support outlier detection, handling missing values, scaling, and custom transformations. An export function helps to generate html, word, and latex tables in a publication friendly style. More details can be found in the online book 'Analyzing single-case data with R and scan', Juergen Wilbert (2025) <https://jazznbass.github.io/scan-Book/>.

Maintained by Juergen Wilbert. Last updated 10 days ago.

4 stars 6.47 score 62 scripts 1 dependent

huanglabumn

oncoPredict:Drug Response Modeling and Biomarker Discovery

Allows for building drug response models using screening data relating bulk RNA-Seq to a drug response metric, and provides two additional tools for biomarker discovery that have been developed by the Huang Laboratory at the University of Minnesota. There are 3 main functions within this package. (1) calcPhenotype is used to build drug response models on RNA-Seq data and impute them on any other RNA-Seq dataset given to the model. (2) GLDS is used to calculate the general level of drug sensitivity, which can improve biomarker discovery. (3) IDWAS can take the results from calcPhenotype and link the imputed response back to available genomic data (mutation and CNV alterations) to identify biomarkers. Each of these functions comes from a paper from the Huang research laboratory. The relevant paper for each function is given below. calcPhenotype - Geeleher et al, Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. GLDS - Geeleher et al, Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models. IDWAS - Geeleher et al, Discovering novel pharmacogenomic biomarkers by imputing drug response in cancer patients from large genomics studies.

Maintained by Robert Gruener. Last updated 12 months ago.

sva preprocesscore stringr biomart genefilter org.hs.eg.db genomicfeatures txdb.hsapiens.ucsc.hg19.knowngene tcgabiolinks biocgenerics genomicranges iranges s4vectors

18 stars 6.47 score 41 scripts

cardiomoon

rrtable:Reproducible Research with a Table of R Codes

Makes documents containing plots and tables from a table of R codes. Can make "HTML", "pdf('LaTex')", "docx('MS Word')" and "pptx('MS Powerpoint')" documents with or without R code. In the package, modularized 'shiny' app codes are provided. These modules are intended for reuse across applications.

Maintained by Keon-Woong Moon. Last updated 2 years ago.

3 stars 6.45 score 76 scripts 2 dependents

pik-piam

mrwater:madrat based MAgPIE water Input Data Library

Provides functions for MAgPIE cellular input data generation and stand-alone water calculations.

Maintained by Felicitas Beier. Last updated 5 months ago.

6.45 score 4 scripts 3 dependents

agrdatasci

gosset:Tools for Data Analysis in Experimental Agriculture

Methods to analyse experimental agriculture data, from data synthesis to model selection and visualisation. The package is named after W.S. Gosset aka ‘Student’, a pioneer of modern statistics in small sample experimental design and analysis.

Maintained by Kauê de Sousa. Last updated 4 months ago.

experimental-design rankings-data

6 stars 6.44 score 23 scripts

luisdva

unheadr:Handle Data with Messy Header Rows and Broken Values

Verb-like functions to work with messy data, often derived from spreadsheets or parsed PDF tables. Includes functions for unwrapping values broken up across rows, relocating embedded grouping values, and annotating meaningful formatting in spreadsheet files.

Maintained by Luis D. Verde Arregoitia. Last updated 11 months ago.

61 stars 6.44 score 45 scripts

ethanbass

chromConverter:Chromatographic File Converter

Reads chromatograms from binary formats into R objects. Currently supports conversion of 'Agilent ChemStation', 'Agilent MassHunter', 'Shimadzu LabSolutions', 'ThermoRaw', and 'Varian Workstation' files as well as various text-based formats. In addition to its internal parsers, chromConverter contains bindings to parsers in external libraries, such as 'Aston' <https://github.com/bovee/aston>, 'Entab' <https://github.com/bovee/entab>, 'rainbow' <https://rainbow-api.readthedocs.io/>, and 'ThermoRawFileParser' <https://github.com/compomics/ThermoRawFileParser>.

Maintained by Ethan Bass. Last updated 3 days ago.

cheminformatics chromatography fair-data gc-fid hplc hplc-dad hplc-uv metabolomics metabolomics-data open-data open-science

33 stars 6.43 score 16 scripts 2 dependents

pik-piam

mrdrivers:Create GDP and Population Scenarios

Create GDP and population scenarios. This package constructs the GDP and population scenarios used as drivers in both the REMIND and MAgPIE models.

Maintained by Johannes Koch. Last updated 8 days ago.

6.41 score 5 scripts 19 dependents

ocean-tracking-network

glatos:A package for the Great Lakes Acoustic Telemetry Observation System

Functions useful to members of the Great Lakes Acoustic Telemetry Observation System https://glatos.glos.us; many are more broadly relevant to simulating, processing, analysing, and visualizing acoustic telemetry data.

Maintained by Christopher Holbrook. Last updated 6 months ago.

10 stars 6.38 score 112 scripts

ropensci

spiro:Manage Data from Cardiopulmonary Exercise Testing

Import, process, summarize and visualize raw data from metabolic carts. See Robergs, Dwyer, and Astorino (2010) <doi:10.2165/11319670-000000000-00000> for more details on data processing.

Maintained by Simon Nolte. Last updated 1 month ago.

14 stars 6.38 score 43 scripts

anthonydevaux

DynForest:Random Forest with Multivariate Longitudinal Predictors

Based on the random forest principle, 'DynForest' is able to include multiple longitudinal predictors to provide individual predictions. Longitudinal predictors are modeled through the random forest. The methodology is fully described for a survival outcome in: Devaux, Helmer, Genuer & Proust-Lima (2023) <doi:10.1177/09622802231206477>.

Maintained by Anthony Devaux. Last updated 5 months ago.

16 stars 6.38 score 8 scripts

ethanbass

chromatographR:Chromatographic Data Analysis Toolset

Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in 'alsace' (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. <doi:10.1093/bioinformatics/btv299>). Alignment of chromatograms is available using parametric time warping (as implemented in the 'ptw' package) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. <doi:10.1093/bioinformatics/btv299>) or variable penalty dynamic time warping (as implemented in 'VPdtw') (Clifford, D., & Stone, G. 2012. <doi:10.18637/jss.v047.i08>). Peak-finding uses the algorithm by Tom O'Haver <https://terpconnect.umd.edu/~toh/spectrum/PeakFindingandMeasurement.htm>. Peaks are then fitted to a Gaussian or exponential-Gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. <doi:10.1016/S0021-9673(01)00594-5>). See the vignette for more details and a suggested workflow.

Maintained by Ethan Bass. Last updated 13 days ago.

bioinformatics cheminformatics chromatography gc-fid hplc hplc-dad hplc-pda hplv-uv metabolomics open-data open-science reproducibility reproducible-research

18 stars 6.36 score 8 scripts 1 dependent

pharmaverse

metatools:Enable the Use of 'metacore' to Help Create and Check Dataset

Uses the metadata information stored in 'metacore' objects to check and build metadata associated columns.

Maintained by Christina Fillmore. Last updated 2 months ago.

12 stars 6.33 score 120 scripts

moore-institute-4-plastic-pollution-res

One4All:Validate, Share, and Download Data

Designed to enhance data validation and management processes by employing a set of functions that read a set of rules from a 'CSV' or 'Excel' file and apply them to a dataset. Funded by the National Renewable Energy Laboratory and Possibility Lab, maintained by the Moore Institute for Plastic Pollution Research.

Maintained by Hannah Sherrod. Last updated 9 months ago.

3 stars 6.33 score 15 scripts

sbg

sevenbridges2:The 'Seven Bridges Platform' API Client

R client and utilities for 'Seven Bridges Platform' API, from 'Cancer Genomics Cloud' to other 'Seven Bridges' supported platforms. API documentation is hosted publicly at <https://docs.sevenbridges.com/docs/the-api>.

Maintained by Marko Trifunovic. Last updated 4 days ago.

api-client bioinformatics cloud sevenbridges

3 stars 6.32 score 4 scripts

bioc

MSstatsShiny:MSstats GUI for Statistical Analysis of Proteomics Experiments

MSstatsShiny is an R-Shiny graphical user interface (GUI) integrated with the R packages MSstats, MSstatsTMT, and MSstatsPTM. It provides a point-and-click end-to-end analysis pipeline applicable to a wide variety of experimental designs. These include data-dependent acquisitions (DDA) which are label-free or tandem mass tag (TMT)-based, as well as DIA, SRM, and PRM acquisitions and those targeting post-translational modifications (PTMs). The application automatically saves users' selections and builds an R script that recreates their analysis, supporting reproducible data analysis.

Maintained by Devon Kohler. Last updated 5 months ago.

immunooncology massspectrometry proteomics software shinyapps differentialexpression onechannel twochannel normalization qualitycontrol gui

15 stars 6.31 score 4 scripts

pepijn-devries

ECOTOXr:Download and Extract Data from US EPA's ECOTOX Database

The US EPA ECOTOX database is a freely available database with a treasure of aquatic and terrestrial ecotoxicological data. As the online search interface doesn't come with an API, this package provides the means to easily access and search the database in R. To this end, all raw tables are downloaded from the EPA website and stored in a local SQLite database <doi:10.1016/j.chemosphere.2024.143078>.

Maintained by Pepijn de Vries. Last updated 3 days ago.

10 stars 6.30 score 6 scripts

mini-pw

SerolyzeR:Reading, Quality Control and Preprocessing of MBA (Multiplex Bead Assay) Data

Speeds up the process of loading raw data from MBA (Multiplex Bead Assay) examinations, performs quality control checks, and automatically normalises the data, preparing it for more advanced, downstream tasks. The main objective of the package is to create a simple environment for a user who does not necessarily have experience with the R language. The package is developed within the project 'PvSTATEM', which is an international project aiming for malaria elimination.

Maintained by Tymoteusz Kwiecinski. Last updated 10 days ago.

6 stars 6.29 score

phuse-org

sendigR:Enable Cross-Study Analysis of 'CDISC' 'SEND' Datasets

A system that enables cross-study analysis by extracting and filtering study data for control animals from a 'CDISC' 'SEND' study repository. These data types are supported: Body Weights, Laboratory test results and Microscopic findings. These database types are supported: 'SQLite' and 'Oracle'.

Maintained by Wenxian Wang. Last updated 23 days ago.

12 stars 6.28 score 6 scripts

bioc

iSEEtree:Interactive visualisation for microbiome data

iSEEtree is an extension of iSEE for the TreeSummarizedExperiment data container. It provides interactive panel designs to explore hierarchical datasets, such as the microbiome and cell lines.

Maintained by Giulio Benedetti. Last updated 10 days ago.

software visualization microbiome gui shinyapps dataimport shiny-apps visualisation

3 stars 6.28 score 5 scripts

aphalo

photobiologyInOut:Read Spectral and Logged Data from Foreign Files

Functions for reading, and in some cases writing, foreign files containing spectral data from spectrometers and their associated software, output from daylight simulation models in common use, and some spectral data repositories. Also includes functions for exchanging spectral data with other R packages. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 30 days ago.

6.28 score 63 scripts 1 dependent

pcruniversum

RDML:Importing Real-Time Thermo Cycler (qPCR) Data from RDML Format Files

Imports real-time thermo cycler (qPCR) data from Real-time PCR Data Markup Language (RDML) and transforms it to the appropriate formats of the 'qpcR' and 'chipPCR' packages. Contains a dendrogram visualization for the structure of an RDML object and a GUI for RDML editing.

Maintained by Konstantin A. Blagodatskikh. Last updated 7 months ago.

bioinformatics pcr qpcr rdml

21 stars 6.26 score 58 scripts 1 dependent

pik-piam

mrremind:MadRat REMIND Input Data Package

The mrremind package contains data preprocessing for the REMIND model.

Maintained by Lavinia Baumstark. Last updated 1 day ago.

4 stars 6.25 score 15 scripts 1 dependent

tzerk

RLumShiny:'Shiny' Applications for the R Package 'Luminescence'

A collection of 'shiny' applications for the R package 'Luminescence'. These mainly, but not exclusively, include applications for plotting chronometric data from e.g. luminescence or radiocarbon dating. It further provides access to Bootstrap's tooltip and popover functionality and contains the 'jscolor.js' library with a custom 'shiny' output binding.

Maintained by Christoph Burow. Last updated 3 days ago.

bootstrap jscolor luminescence luminescence-dating shiny shiny-applications tooltip

7 stars 6.23 score 67 scripts 2 dependents

sjmack

HLAtools:Toolkit for HLA Immunogenomics

A toolkit for the analysis and management of data for genes in the so-called "Human Leukocyte Antigen" (HLA) region. Functions extract reference data from the Anthony Nolan HLA Informatics Group/ImmunoGeneTics HLA 'GitHub' repository (ANHIG/IMGTHLA) <https://github.com/ANHIG/IMGTHLA>, validate Genotype List (GL) Strings, convert between UNIFORMAT and GL String Code (GLSC) formats, translate HLA alleles and GLSCs across ImmunoPolymorphism Database (IPD) IMGT/HLA Database release versions, identify differences between pairs of alleles at a locus, generate customized, multi-position sequence alignments, trim and convert allele-names across nomenclature epochs, and extend existing data-analysis methods.

Maintained by Steven Mack. Last updated 26 days ago.

4 stars 6.21 score 7 scripts 1 dependent

dmenne

breathtestcore:Core Functions to Read and Fit 13c Time Series from Breath Tests

Reads several formats of 13C data (IRIS/Wagner, BreathID) and CSV. Creates artificial sample data for testing. Fits Maes/Ghoos, Bluck-Coward self-correcting formula using 'nls', 'nlme'. Methods to fit breath test curves with Bayesian Stan methods are refactored to package 'breathteststan'. For a Shiny GUI, see package 'dmenne/breathtestshiny' on GitHub.

Maintained by Dieter Menne. Last updated 3 months ago.

13c breath breath-test gastroenterology medical stan

2 stars 6.19 score 64 scripts 1 dependent

radiant-rstats

radiant.model:Model Menu for Radiant: Business Analytics using R and Shiny

The Radiant Model menu includes interfaces for linear and logistic regression, naive Bayes, neural networks, classification and regression trees, model evaluation, collaborative filtering, decision analysis, and simulation. The application extends the functionality in 'radiant.data'.

Maintained by Vincent Nijs. Last updated 6 months ago.

19 stars 6.18 score 80 scripts 2 dependents

giobo

TR8:A Tool for Downloading Functional Traits Data for Plant Species

Plant ecologists often need to collect "traits" data about plant species which are often scattered among various databases: TR8 contains a set of tools which take care of automatically retrieving some of those functional traits data for plant species from publicly available databases (The Ecological Flora of the British Isles, LEDA traitbase, Ellenberg values for Italian Flora, Mycorrhizal intensity databases, BROT, PLANTS, Jepson Flora Project). The TR8 name, inspired by "car plates" jokes, was chosen since it both reminds of the main object of the package and is extremely short to type.

Maintained by Gionata Bocci. Last updated 6 months ago.

21 stars 6.18 score 16 scripts

nataliepatten

gatoRs:Geographic and Taxonomic Occurrence R-Based Scrubbing

Streamlines downloading and cleaning biodiversity data from Integrated Digitized Biocollections (iDigBio) and the Global Biodiversity Information Facility (GBIF).

Maintained by Natalie N. Patten. Last updated 11 months ago.

11 stars 6.16 score 66 scripts

alexchristensen

SemNetCleaner:An Automated Cleaning Tool for Semantic and Linguistic Data

Implements several functions that automate the cleaning and spell-checking of text data. Also converges, finalizes, removes plurals and continuous strings, and puts text data in binary format for semantic network analysis. Uses the 'SemNetDictionaries' package to make the cleaning process more accurate, efficient, and reproducible.

Maintained by Alexander P. Christensen. Last updated 3 years ago.

preprocessing semantic-network-analysis

10 stars 6.16 score 48 scripts 1 dependent

verbal-autopsy-software

openVA:Automated Method for Verbal Autopsy

Implements multiple existing open-source algorithms for coding cause of death from verbal autopsies. The methods implemented include 'InterVA4' by Byass et al (2012) <doi:10.3402/gha.v5i0.19281>, 'InterVA5' by Byass et al (2019) <doi:10.1186/s12916-019-1333-6>, 'InSilicoVA' by McCormick et al (2016) <doi:10.1080/01621459.2016.1152191>, 'NBC' by Miasnikof et al (2015) <doi:10.1186/s12916-015-0521-2>, and a replication of the 'Tariff' method by James et al (2011) <doi:10.1186/1478-7954-9-31> and Serina, et al. (2015) <doi:10.1186/s12916-015-0527-9>. It also provides tools for data manipulation tasks commonly used in Verbal Autopsy analysis and implements easy graphical visualization of individual and population level statistics. The 'NBC' method is implemented by the 'nbc4va' package that can be installed from <https://github.com/rrwen/nbc4va>. Note that this package was not developed by authors affiliated with the Institute for Health Metrics and Evaluation and thus unintentional discrepancies may exist in the implementation of the 'Tariff' method.

Maintained by Zehang Richard Li. Last updated 1 year ago.

pipeline openjdk

5 stars 6.16 score 48 scripts

eikeluedeling

chillR:Statistical Methods for Phenology Analysis in Temperate Fruit Trees

The phenology of plants (i.e. the timing of their annual life phases) depends on climatic cues. For temperate trees and many other plants, spring phases, such as leaf emergence and flowering, have been found to result from the effects of both cool (chilling) conditions and heat. Fruit tree scientists (pomologists) have developed some metrics to quantify chilling and heat (e.g. see Luedeling (2012) <doi:10.1016/j.scienta.2012.07.011>). 'chillR' contains functions for processing temperature records into chilling (Chilling Hours, Utah Chill Units and Chill Portions) and heat units (Growing Degree Hours). Regarding chilling metrics, Chill Portions are often considered the most promising, but they are difficult to calculate. This package makes it easy. 'chillR' also contains procedures for conducting a PLS analysis relating phenological dates (e.g. bloom dates) to either mean temperatures or mean chill and heat accumulation rates, based on long-term weather and phenology records (Luedeling and Gassner (2012) <doi:10.1016/j.agrformet.2011.10.020>). As of version 0.65, it also includes functions for generating weather scenarios with a weather generator, for conducting climate change analyses for temperature-based climatic metrics and for plotting results from such analyses. Since version 0.70, 'chillR' contains a function for interpolating hourly temperature records.

Maintained by Eike Luedeling. Last updated 5 months ago.

cpp

3 stars 6.13 score 346 scripts 1 dependent
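
As a rough illustration of the simplest chilling metric named in the 'chillR' description, the Python sketch below counts Chilling Hours from an hourly temperature record. It assumes the common convention that an hour counts when its temperature lies between 0 and 7.2 °C, with both bounds treated as inclusive; the function name and the sample temperatures are hypothetical, and this is not the package's own code.

```python
def chilling_hours(hourly_temps_c, lower=0.0, upper=7.2):
    """Count hours whose temperature lies within [lower, upper] degrees C.

    The 0 to 7.2 degrees C window and the inclusive bounds are assumptions
    made for this sketch, not values taken from the listing above.
    """
    return sum(1 for t in hourly_temps_c if lower <= t <= upper)


# Hypothetical hourly temperatures (degrees C) for part of one day.
temps = [-1.5, 0.0, 2.3, 6.8, 7.2, 7.3, 10.1, 4.0]
print(chilling_hours(temps))
```

Metrics such as Chill Portions depend on the sequence of temperatures rather than a simple count, which is why the description calls them harder to calculate.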

nlmixr2

babelmixr2:Use 'nlmixr2' to Interact with Open Source and Commercial Software

Run other estimation and simulation software via the 'nlmixr2' (Fidler et al (2019) <doi:10.1002/psp4.12445>) interface including 'PKNCA', 'NONMEM' and 'Monolix'. While not required, you can get/install the 'lixoftConnectors' package in the 'Monolix' installation, as described at the following URL <https://monolixsuite.slp-software.com/r-functions/2024R1/installation-and-initialization>. When 'lixoftConnectors' is available, 'Monolix' can be run directly instead of setting up command line usage.

Maintained by Matthew Fidler. Last updated 18 days ago.

monolix nonmem pharmacometrics cpp

9 stars 6.11 score 53 scripts

pik-piam

luplot:Landuse Plot Library

Some useful functions to plot data such as a map plot function for MAgPIE objects.

Maintained by Benjamin Bodirsky. Last updated 2 months ago.

6.09 score 124 scripts 11 dependents

bioc

Pedixplorer:Pedigree Functions

Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.

Maintained by Louis Le Nezet. Last updated 14 days ago.

software datarepresentation genetics graphandnetwork visualization kinship pedigree

2 stars 6.08 score 10 scripts

gabriel-assuncao

PNADcIBGE:Downloading, Reading and Analyzing PNADC Microdata

Provides tools for downloading, reading and analyzing the Continuous National Household Sample Survey - PNADC, a household survey from Brazilian Institute of Geography and Statistics - IBGE. The data must be downloaded from the official website <https://www.ibge.gov.br/>. Further analysis must be made using package 'survey'.

Maintained by Gabriel Assuncao. Last updated 1 year ago.

ibge pnadc sipd

28 stars 6.02 score 151 scripts 1 dependent

nicwir

QurvE:Robust and User-Friendly Analysis of Growth and Fluorescence Curves

High-throughput analysis of growth curves and fluorescence data using three methods: linear regression, growth model fitting, and smooth spline fit. Analysis of dose-response relationships via smoothing splines or dose-response models. Complete data analysis workflows can be executed in a single step via user-friendly wrapper functions. The results of these workflows are summarized in detailed reports as well as intuitively navigable 'R' data containers. A 'shiny' application provides access to all features without requiring any programming knowledge. The package is described in further detail in Wirth et al. (2023) <doi:10.1038/s41596-023-00850-7>.

Maintained by Nicolas T. Wirth. Last updated 1 year ago.

25 stars 6.00 score 7 scripts

eu-ecdc

epitweetr:Early Detection of Public Health Threats from 'Twitter' Data

It allows you to automatically monitor trends of tweets by time, place and topic aiming at detecting public health threats early through the detection of signals (e.g. an unusual increase in the number of tweets). It was designed to focus on infectious diseases, and it can be extended to all hazards or other fields of study by modifying the topics and keywords. More information is available in the 'epitweetr' peer-reviewed publication (doi:10.2807/1560-7917.ES.2022.27.39.2200177).

Maintained by Laura Espinosa. Last updated 1 year ago.

early-warning-systems epidemic-surveillance lucene machine-learning signal-detection spark twitter

56 stars 5.98 score 86 scripts

albersonmiranda

fio:Friendly Input-Output Analysis

Simplifies the process of importing and managing input-output matrices from 'Microsoft Excel' into R, and provides a suite of functions for analysis. It leverages the 'R6' class for clean, memory-efficient object-oriented programming. Furthermore, all linear algebra computations are implemented in 'Rust' to achieve highly optimized performance.

Maintained by Alberson da Silva Miranda. Last updated 2 months ago.

economics rust cargo

12 stars 5.98 score 1 script

bioc

dar:Differential Abundance Analysis by Consensus

Differential abundance testing in microbiome data challenges both parametric and non-parametric statistical methods, due to its sparsity, high variability and compositional nature. Microbiome-specific statistical methods often assume classical distribution models or take into account compositional specifics. These produce results that range within the specificity vs sensitivity space in such a way that type I and type II errors are difficult to ascertain in real microbiome data when a single method is used. Recently, a consensus approach based on multiple differential abundance (DA) methods was suggested in order to increase robustness. With dar, you can use dplyr-like pipeable sequences of DA methods and then apply different consensus strategies. In this way we can obtain more reliable results in a fast, consistent and reproducible way.

Maintained by Francesc Catala-Moll. Last updated 14 days ago.

software sequencing microbiome metagenomics multiplecomparison normalization bioconductor biomarker-discovery differential-abundance-analysis feature-selection microbiology phyloseq

2 stars 5.98 score 8 scripts

ropensci

bikedata:Download and Aggregate Data from Public Hire Bicycle Systems

Download and aggregate data from all public hire bicycle systems which provide open data, currently including 'Santander' Cycles in London, U.K.; from the U.S.A., 'Ford GoBike' in San Francisco CA, 'citibike' in New York City NY, 'Divvy' in Chicago IL, 'Capital Bikeshare' in Washington DC, 'Hubway' in Boston MA, 'Metro' in Los Angeles CA, 'Indego' in Philadelphia PA, and 'Nice Ride' in Minnesota; 'Bixi' from Montreal, Canada; and 'mibici' from Guadalajara, Mexico.

Maintained by Mark Padgham. Last updated 1 year ago.

bicycle-hire-systems bike-hire-systems bike-hire bicycle-hire database bike-data peer-reviewed cpp

81 stars 5.96 score 28 scripts

bioc

autonomics:Unified Statistical Modeling of Omics Data

This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.

Maintained by Aditya Bhagwat. Last updated 2 months ago.

software dataimport preprocessing dimensionreduction principalcomponent regression differentialexpression genesetenrichment transcriptomics transcription geneexpression rnaseq microarray proteomics metabolomics massspectrometry

5.95 score 5 scripts

yonicd

shinyHeatmaply:Deploy 'heatmaply' using 'shiny'

Access functionality of the 'heatmaply' package through 'Shiny UI'.

Maintained by Jonathan Sidi. Last updated 5 years ago.

47 stars 5.95 score 42 scripts 1 dependent

gbganalyst

bulkreadr:The Ultimate Tool for Reading Data in Bulk

Designed to simplify and streamline the process of reading and processing large volumes of data in R, this package offers a collection of functions tailored for bulk data operations. It enables users to efficiently read multiple sheets from Microsoft Excel and Google Sheets workbooks, as well as various CSV files from a directory. The data is returned as organized data frames, facilitating further analysis and manipulation. Ideal for handling extensive data sets or batch processing tasks, bulkreadr empowers users to manage data in bulk effortlessly, saving time and effort in data preparation workflows. Additionally, the package seamlessly works with labelled data from SPSS and Stata.

Maintained by Ezekiel Ogundepo. Last updated 7 months ago.

bulkreader csv-reader data-import googlesheets missing-values xlsxreader

12 stars 5.94 score 12 scripts

bioc

SCOPE:A normalization and copy number estimation method for single-cell DNA sequencing

Whole genome single-cell DNA sequencing (scDNA-seq) enables characterization of copy number profiles at the cellular level. This circumvents the averaging effects associated with bulk-tissue sequencing and has increased resolution yet decreased ambiguity in deconvolving cancer subclones and elucidating cancer evolutionary history. ScDNA-seq data is, however, sparse, noisy, and highly variable even within a homogeneous cell population, due to the biases and artifacts that are introduced during the library preparation and sequencing procedure. Here, we propose SCOPE, a normalization and copy number estimation method for scDNA-seq data. The distinguishing features of SCOPE include: (i) utilization of cell-specific Gini coefficients for quality controls and for identification of normal/diploid cells, which are further used as negative control samples in a Poisson latent factor model for normalization; (ii) modeling of GC content bias using an expectation-maximization algorithm embedded in the Poisson generalized linear models, which accounts for the different copy number states along the genome; (iii) a cross-sample iterative segmentation procedure to identify breakpoints that are shared across cells from the same genetic background.

Maintained by Rujin Wang. Last updated 5 months ago.

singlecell normalization copynumbervariation sequencing wholegenome coverage alignment qualitycontrol dataimport dnaseq

5.92 score 84 scripts

awkena

qrlabelr:Generate Machine- And Human-Readable Plot Labels for Experiments

A no-frills open-source solution for designing plot labels affixed with QR codes. It features 'EasyQrlabelr', a 'BrAPI-compliant' 'shiny' app that simplifies the process of plot label design for non-R users. It builds on the methods described by Wu 'et al.' (2020) <doi:10.1111/2041-210X.13405>.

Maintained by Alexander Wireko Kena. Last updated 1 month ago.

experiments field-plots labels-generator qr-code shiny-apps

17 stars 5.91 score 16 scripts

swissclinicaltrialorganisation

secuTrialR:Handling of Data from the Clinical Data Management System 'secuTrial'

Seamless and standardized interaction with data exported from the clinical data management system (CDMS) 'secuTrial' <https://www.secutrial.com>. The primary data export the package works with is a standard non-rectangular export.

Maintained by Alan G. Haynes. Last updated 10 months ago.

9 stars 5.91 score 15 scripts

flr

FLBEIA:Bio-Economic Impact Assessment of Management Strategies using FLR

A simulation toolbox that describes a fishery system under a Management Strategy Evaluation approach. The objective of the model is to facilitate the Bio-Economic evaluation of Management strategies. It is multistock, multifleet and seasonal. The simulation is divided into 2 main blocks, the Operating Model (OM) and the Management Procedure (MP). In turn, each of these two blocks is divided into 3 components: the biological, the fleets and the covariables on the one hand, and the observation, the assessment and the advice on the other.

Maintained by FLBEIA Team. Last updated 17 days ago.

cpp

11 stars 5.89 score 156 scripts

bioc

miRspongeR:Identification and analysis of miRNA sponge regulation

This package provides several functions to explore miRNA sponge (also called ceRNA or miRNA decoy) regulation from putative miRNA-target interactions and/or transcriptomics data (including bulk, single-cell and spatial gene expression data). It provides eight popular methods for identifying miRNA sponge interactions, and an integrative method to integrate miRNA sponge interactions from different methods, as well as functions to validate miRNA sponge interactions, infer miRNA sponge modules, conduct enrichment analysis of miRNA sponge modules, and conduct survival analysis of miRNA sponge modules. By using a sample control variable strategy, it provides a function to infer sample-specific miRNA sponge interactions. In terms of sample-specific miRNA sponge interactions, it implements three similarity methods to construct a sample-sample correlation network.

Maintained by Junpeng Zhang. Last updated 5 months ago.

geneexpression biomedicalinformatics networkenrichment survival microarray software singlecell spatial rnaseq cerna mirna sponge

5 stars 5.88 score 8 scripts

edhofman

ReSurv:Machine Learning Models For Predicting Claim Counts

Prediction of claim counts using the feature based development factors introduced in the manuscript <doi:10.48550/arXiv.2312.14549>. Implementation of Neural Networks, Extreme Gradient Boosting, and Cox model with splines to optimise the partial log-likelihood of proportional hazard models.

Maintained by Emil Hofman. Last updated 5 months ago.

2 stars 5.87 score 21 scripts

cardiomoon

ggplotAssist:'RStudio' Addin for Teaching and Learning 'ggplot2'

An 'RStudio' addin for teaching and learning how to make plots using the 'ggplot2' package. You can learn each step of making a plot by clicking your mouse without coding. You can get the resultant code for the plot.

Maintained by Keon-Woong Moon. Last updated 7 years ago.

79 stars 5.85 score 18 scripts

quexiang

OpenMindat:Quickly Retrieve Datasets from the 'Mindat' API

The goal of OpenMindat R package is to provide functions for users or machines to quickly and easily retrieve datasets from the mindat.org API (<https://api.mindat.org/schema/redoc/>).

Maintained by Xiang Que. Last updated 2 months ago.

34 stars 5.83 score 3 scripts

bioc

ISAnalytics:Analyze gene therapy vector insertion sites data identified from genomics next generation sequencing reads for clonal tracking studies

In gene therapy, stem cells are modified using viral vectors to deliver the therapeutic transgene and replace functional properties since the genetic modification is stable and inherited in all cell progeny. The retrieval and mapping of the sequences flanking the virus-host DNA junctions allows the identification of insertion sites (IS), essential for monitoring the evolution of genetically modified cells in vivo. A comprehensive toolkit for the analysis of IS is required to foster clonal tracking studies and support the assessment of safety and long term efficacy in vivo. This package is aimed at (1) supporting automation of IS workflow, (2) performing basic and advanced analyses for IS tracking (clonal abundance, clonal expansions and statistics for insertional mutagenesis, etc.), (3) providing basic biology insights of transduced stem cells in vivo.

Maintained by Francesco Gazzo. Last updated 4 months ago.

biomedicalinformatics sequencing singlecell

3 stars 5.83 score 15 scripts

mrc-ide

hintr:R API for calling naomi district level HIV model

R API for calling naomi district level HIV model.

Maintained by Robert Ashton. Last updated 18 days ago.

2 stars 5.80 score 2 scripts 1 dependent

bioc

bioCancer:Interactive Multi-Omics Cancers Data Visualization and Analysis

This package is a Shiny App to visualize and analyse interactively Multi-Assays of Cancer Genomic Data.

Maintained by Karim Mezhoud. Last updated 5 months ago.

gui datarepresentation network multiplecomparison pathways reactome visualization geneexpression genetarget analysis biocancer-interface cancer cancer-studies rmarkdown

20 stars 5.78 score 7 scripts

bioc

benchdamic:Benchmark of differential abundance methods on microbiome data

Starting from a microbiome dataset (16S or WMS with absolute count values) it is possible to perform several analyses to assess the performances of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied but the user can also add their own method to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any a priori knowledge is given. Several graphical functions are available for result visualization.

Maintained by Matteo Calgaro. Last updated 4 months ago.

metagenomics microbiome differentialexpression multiplecomparison normalization preprocessing software benchmark differential-abundance-methods

8 stars 5.78 score 8 scripts

richardli

surveyPrev:Mapping the Prevalence of Binary Indicators using Survey Data in Small Areas

Provides a pipeline to perform small area estimation and prevalence mapping of binary indicators using health and demographic survey data, described in Fuglstad et al. (2022) <doi:10.48550/arXiv.2110.09576> and Wakefield et al. (2020) <doi:10.1111/insr.12400>.

Maintained by Qianyu Dong. Last updated 22 hours ago.

1 star 5.76 score 11 scripts

bioc

limpca:An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods

This package aims to provide a method for fitting linear models to high-dimensional designed data. limpca applies a GLM (General Linear Model) version of ASCA and APCA to analyse multivariate sample profiles generated by an experimental design. ASCA/APCA provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design and, unlike MANOVA, they can deal with multivariate datasets having more variables than observations. This method can handle unbalanced designs.

Maintained by Manon Martin. Last updated 5 months ago.

statisticalmethod principalcomponent regression visualization experimentaldesign multiplecomparison geneexpression metabolomics

2 stars 5.73 score 2 scripts

shichenxie

pedquant:Public Economic Data and Quantitative Analysis

Provides an interface to access public economic and financial data for economic research and quantitative analysis. The data sources include NBS, FRED, Sina, Eastmoney, etc. It also provides quantitative functions for trading strategies based on 'data.table', 'TTR', 'PerformanceAnalytics' and other packages.

Maintained by Shichen Xie. Last updated 15 days ago.

59 stars 5.70 score 34 scripts

dpc10ster

RJafroc:Artificial Intelligence Systems and Observer Performance

Analyzing the performance of artificial intelligence (AI) systems/algorithms characterized by a 'search-and-report' strategy. Historically, observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where lesion localization information is used. A book using the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: <https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840>. Online updates to this book, which use the software, are at <https://dpc10ster.github.io/RJafrocQuickStart/>, <https://dpc10ster.github.io/RJafrocRocBook/> and at <https://dpc10ster.github.io/RJafrocFrocBook/>. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data consists of a rating and a location of the most suspicious region, for every image.
Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict 'proper' ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, <https://github.com/dpc10ster/WindowsJafroc>. Package functions are organized as follows. Data file related function names are preceded by 'Df', curve fitting functions by 'Fit', included data sets by 'dataset', plotting functions by 'Plot', significance testing functions by 'St', sample size related functions by 'Ss', data simulation functions by 'Simulate' and utility functions by 'Util'. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical or fitted operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via either Dorfman-Berbaum-Metz or the Obuchowski-Rockette methods. Also implemented is single modality analysis, which allows comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists interpreting the same cases. 
Crossed-modality analysis is implemented wherein there are two crossed modality factors and the aim is to determine performance in each modality factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification.

Maintained by Dev Chakraborty. Last updated 5 months ago.

ai-optimization artificial-intelligence-algorithms computer-aided-diagnosis froc-analysis roc-analysis target-classification target-localization cpp

19 stars 5.69 score 65 scripts
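
The statement in the 'RJafroc' description that an ROC curve plots true positive fraction against false positive fraction can be made concrete with a small Python sketch. It assumes that a case is called positive when its rating is at or above a threshold, and that each distinct rating serves as one threshold; the function name and the ratings are hypothetical, and this is not the package's implementation.

```python
def empirical_roc(nondiseased, diseased):
    """Return (FPF, TPF) operating points, one per distinct rating threshold.

    FPF is the fraction of non-diseased cases rated at or above the threshold;
    TPF is the same fraction computed over diseased cases.
    """
    thresholds = sorted(set(nondiseased) | set(diseased), reverse=True)
    points = [(0.0, 0.0)]  # strictest possible threshold: nothing called positive
    for thr in thresholds:
        fpf = sum(r >= thr for r in nondiseased) / len(nondiseased)
        tpf = sum(r >= thr for r in diseased) / len(diseased)
        points.append((fpf, tpf))
    return points


# Hypothetical confidence ratings on a 1-4 scale, one rating per image.
nondiseased = [1, 1, 2, 3]
diseased = [2, 3, 4, 4]
points = empirical_roc(nondiseased, diseased)
for fpf, tpf in points:
    print(fpf, tpf)
```

Joining the points in order traces the empirical ROC curve from (0, 0) to (1, 1).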

verbal-autopsy-software

InSilicoVA:Probabilistic Verbal Autopsy Coding with 'InSilicoVA' Algorithm

Computes individual causes of death and population cause-specific mortality fractions using the 'InSilicoVA' algorithm from McCormick et al. (2016) <DOI:10.1080/01621459.2016.1152191>. It uses data derived from verbal autopsy (VA) interviews, in a format similar to the input of the widely used 'InterVA' method. This package provides general model fitting and customization for the 'InSilicoVA' algorithm and basic graphical visualization of the output.

Maintained by Zehang Richard Li. Last updated 1 month ago.

va-algorithm openjdk

3 stars 5.67 score 35 scripts 1 dependent

epiforecasts

covidregionaldata:Subnational Data for COVID-19 Epidemiology

An interface to subnational and national level COVID-19 data sourced from both official sources, such as Public Health England in the UK, and from other COVID-19 data collections, including the World Health Organisation (WHO), European Centre for Disease Prevention and Control (ECDC), Johns Hopkins University (JHU), Google Open Data and others. Designed to streamline COVID-19 data extraction, cleaning, and processing from a range of data sources in an open and transparent way. This allows users to inspect and scrutinise the data, and tools used to process it, at every step. For all countries supported, data includes a daily time-series of cases. Wherever available data is also provided for deaths, hospitalisations, and tests. National level data are also supported using a range of sources.

Maintained by Sam Abbott. Last updated 3 years ago.

covid-19 data open-science r6 regional-data

37 stars 5.67 score 121 scripts

xiaozhangryy

CAESAR.Suite:CAESAR: a Cross-Technology and Cross-Resolution Framework for Spatial Omics Annotation

Biotechnology in spatial omics has advanced rapidly over the past few years, enhancing both throughput and resolution. However, existing annotation pipelines in spatial omics predominantly rely on clustering methods, lacking the flexibility to integrate extensive annotated information from single-cell RNA sequencing (scRNA-seq) due to discrepancies in spatial resolutions, species, or modalities. Here we introduce the CAESAR suite, an open-source software package that provides image-based spatial co-embedding of locations and genomic features. It uniquely transfers labels from scRNA-seq reference, enabling the annotation of spatial omics datasets across different technologies, resolutions, species, and modalities, based on the conserved relationship between signature genes and cells/locations at an appropriate level of granularity. Notably, CAESAR enriches location-level pathways, allowing for the detection of gradual biological pathway activation within spatially defined domain types. More details on the methods are given in our paper, which is currently under submission. A full reference to the paper will be provided in future versions once the paper is published.

Maintained by Xiao Zhang. Last updated 5 days ago.

openblas cpp

1 star 5.67 score 2 scripts

cadam00

prior3D:3D Prioritization Algorithm

Three-dimensional systematic conservation planning, conducting nested prioritization analyses across multiple depth levels and ensuring efficient resource allocation throughout the water column. It provides a structured workflow designed to address biodiversity conservation and management challenges in the 3 dimensions, while facilitating users’ choices and parameterization (Doxa et al. 2025 <doi:10.1016/j.ecolmodel.2024.110919>).

Maintained by Christos Adam. Last updated 2 months ago.

biodiversity conservation conservation-planning depth marine-spatial-planning multidimensional-environments prioritization

6 stars 5.62 score 3 scripts

bioc

MSstatsLiP:LiP Significance Analysis in shotgun mass spectrometry-based proteomic experiments

Tools for LiP peptide and protein significance analysis. Provides functions for summarization, estimation of LiP peptide abundance, and detection of changes across conditions. Utilizes functionality across the MSstats family of packages.

Maintained by Devon Kohler. Last updated 5 months ago.

immunooncology massspectrometry proteomics software differentialexpression onechannel twochannel normalization qualitycontrol cpp

7 stars 5.62 score 5 scripts

bioc

gpuMagic:An openCL compiler with the capacity to compile R functions and run the code on GPU

The package aims to help users write openCL code with little or no effort. It is able to compile a user-defined R function and run it on a device such as a CPU or a GPU. The user can also write and run their openCL code directly by calling the .kernel function.

Maintained by Jiefei Wang. Last updated 5 months ago.

infrastructure ocl-icd cpp

10 stars 5.60 score 1 script

pik-piam

mrland:MadRaT land data package

The package provides land related data via the madrat framework.

Maintained by Jan Philipp Dietrich. Last updated 9 days ago.

5.59 score 3 scripts 4 dependents

maelstrom-research

Rmonize:Support Retrospective Harmonization of Data

Functions to support rigorous retrospective data harmonization processing, evaluation, and documentation across datasets from different studies based on Maelstrom Research guidelines. The package includes the core functions to evaluate and format the main inputs that define the harmonization process, apply specified processing rules to generate harmonized data, diagnose processing errors, and summarize and evaluate harmonized outputs. The main inputs that define the processing are a DataSchema (list and definitions of harmonized variables to be generated) and Data Processing Elements (processing rules to be applied to generate harmonized variables from study-specific variables). The main outputs of processing are harmonized datasets, associated metadata, and tabular and visual summary reports. The approach is described in the Maelstrom Research guidelines for rigorous retrospective data harmonization (Fortier I et al. (2017) <doi:10.1093/ije/dyw075>).

Maintained by Guillaume Fabre. Last updated 1 year ago.

5 stars 5.58 score 51 scripts

epe-gov-br

epe4md:EPE's 4MD model to forecast the adoption of Distributed Generation and Behind-the-meter energy storage

EPE's 4MD model to forecast the adoption of Distributed Generation and Behind-the-meter energy storage

Maintained by Gabriel Konzen. Last updated 18 days ago.

19 stars 5.58 score 5 scripts

radiant-rstats

radiant.basics:Basics Menu for Radiant: Business Analytics using R and Shiny

The Radiant Basics menu includes interfaces for probability calculation, central limit theorem simulation, comparing means and proportions, goodness-of-fit testing, cross-tabs, and correlation. The application extends the functionality in 'radiant.data'.

Maintained by Vincent Nijs. Last updated 11 months ago.

8 stars 5.56 score 79 scripts 3 dependents

mrcieu

mrbayes:Bayesian Summary Data Models for Mendelian Randomization Studies

Bayesian estimation of inverse variance weighted (IVW), Burgess et al. (2013) <doi:10.1002/gepi.21758>, and MR-Egger, Bowden et al. (2015) <doi:10.1093/ije/dyv080>, summary data models for Mendelian randomization analyses.

Maintained by Tom Palmer. Last updated 13 days ago.

cpp

4 stars 5.56 score 2 scripts
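
For orientation, the inverse variance weighted (IVW) summary data model named in the 'mrbayes' description has a simple classical counterpart. The Python sketch below computes the standard fixed-effect IVW point estimate from per-variant summary statistics; it is a frequentist illustration and not the Bayesian estimation the package performs, and the function name and numbers are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted estimate of a causal effect.

    Each variant contributes the ratio beta_outcome / beta_exposure, weighted
    by beta_exposure**2 / se_outcome**2 (uncertainty in beta_exposure ignored).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    std_error = math.sqrt(1.0 / sum(weights))
    return estimate, std_error


# Hypothetical summary statistics for three genetic variants.
beta_exposure = [0.1, 0.2, 0.4]
beta_outcome = [0.05, 0.10, 0.20]
se_outcome = [0.01, 0.02, 0.02]
estimate, std_error = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
print(estimate, std_error)
```

Because every hypothetical variant has an outcome effect exactly half its exposure effect, the estimate here is 0.5 up to floating-point rounding.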

bioc

multicrispr:Multi-locus multi-purpose Crispr/Cas design

This package is for designing Crispr/Cas9 and Prime Editing experiments. It contains functions to (1) define and transform genomic targets, (2) find spacers, (4) count offtarget (mis)matches, and (5) compute Doench2016/2014 targeting efficiency. Care has been taken for multicrispr to scale well towards large target sets, enabling the design of large Crispr/Cas9 libraries.

Maintained by Aditya Bhagwat. Last updated 4 months ago.

crispr software

5.56 score 2 scripts

luisdva

forgts:Convert Formatted Spreadsheets to Presentation-Ready Display Tables

Reads cell contents plus formatting from a spreadsheet file and creates an editable 'gt' object with the same data and formatting. Supports the most commonly-used cell and text styles including colors, fills, font weights and decorations, and borders.

Maintained by Luis D. Verde Arregoitia. Last updated 2 months ago.

18 stars 5.56 score 3 scripts

ropensci

hddtools:Hydrological Data Discovery Tools

Tools to discover hydrological data, accessing catalogues and databases from various data providers. The package is described in Vitolo (2017) "hddtools: Hydrological Data Discovery Tools" <doi:10.21105/joss.00056>.

Maintained by Dorothea Hug Peter. Last updated 7 months ago.

data60uk grdc hydrology kgclimateclass mopex peer-reviewed precipitation sepa

48 stars 5.56 score 25 scripts

glottospace

glottospace:Language Mapping and Geospatial Analysis of Linguistic and Cultural Data

Streamlined workflows for geolinguistic analysis, including: accessing global linguistic and cultural databases, data import, data entry, data cleaning, data exploration, mapping, visualization and export.

Maintained by Rui Dong. Last updated 3 months ago.

23 stars 5.54 score 6 scripts

dkneis

rodeo:A Code Generator for ODE-Based Models

Provides an R6 class and several utility methods to facilitate the implementation of models based on ordinary differential equations. The heart of the package is a code generator that creates compiled 'Fortran' (or 'R') code which can be passed to a numerical solver. There is direct support for solvers contained in packages 'deSolve' and 'rootSolve'.

Maintained by David Kneis. Last updated 4 months ago.

7 stars 5.53 score 24 scripts

mponce0

covid19.analytics:Load and Analyze Live Data from the COVID-19 Pandemic

Load and analyze updated worldwide time series data of reported cases of the Novel Coronavirus Disease (COVID-19) from different sources, including the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) data repository <https://github.com/CSSEGISandData/COVID-19> and "Our World in Data" <https://github.com/owid/>, among several others. The datasets reporting the COVID-19 cases are available in two main modalities: as time series sequences, and as aggregated data for the last day with greater spatial resolution. The package provides analysis, visualization and modelling functions that let the user compute and visualize the total number of cases, total number of changes and growth rate globally or for a specific geographical location, while at the same time generating models using these trends; generate interactive visualizations; and generate a Susceptible-Infected-Recovered (SIR) model for the disease spread.

Maintained by Marcelo Ponce. Last updated 3 years ago.

covid19 covid19-data ncov2019 sars-cov-2

33 stars 5.52 score 20 scripts

bioc

methylclock:Methylclock - DNA methylation-based clocks

This package estimates chronological and gestational DNA methylation (DNAm) age as well as biological age using different methylation clocks. Chronological DNAm age (in years): Horvath's clock, Hannum's clock, BNN, Horvath's skin+blood clock, PedBE clock and Wu's clock. Gestational DNAm age: Knight's clock, Bohlin's clock, Mayne's clock and Lee's clocks. Biological DNAm clocks: Levine's clock and Telomere Length's clock.

Maintained by Dolors Pelegri-Siso. Last updated 5 months ago.

dnamethylation biologicalquestion preprocessing statisticalmethod normalization cpp

39 stars 5.52 score 28 scripts

bioc

GeoDiff:Count model based differential expression and normalization on GeoMx RNA data

A series of statistical models using count generating distributions for background modelling, feature and sample QC, normalization and differential expression analysis on GeoMx RNA data. The application of these methods is demonstrated in an example data analysis vignette.

Maintained by Nicole Ortogero. Last updated 5 months ago.

geneexpression differentialexpression normalization openblas cpp openmp

8 stars 5.51 score 9 scripts

jepusto

scdhlm:Estimating Hierarchical Linear Models for Single-Case Designs

Provides a set of tools for estimating hierarchical linear models and effect sizes based on data from single-case designs. Functions are provided for calculating standardized mean difference effect sizes that are directly comparable to standardized mean differences estimated from between-subjects randomized experiments, as described in Hedges, Pustejovsky, and Shadish (2012) <DOI:10.1002/jrsm.1052>; Hedges, Pustejovsky, and Shadish (2013) <DOI:10.1002/jrsm.1086>; Pustejovsky, Hedges, and Shadish (2014) <DOI:10.3102/1076998614547577>; and Chen, Pustejovsky, Klingbeil, and Van Norman (2023) <DOI:10.1016/j.jsp.2023.02.002>. Includes an interactive web interface.

Maintained by James Pustejovsky. Last updated 1 year ago.

4 stars 5.49 score 52 scripts

mcanouil

insane:INsulin Secretion ANalysEr

A user-friendly interface, using Shiny, to analyse glucose-stimulated insulin secretion (GSIS) assays in pancreatic beta cells or islets. The package allows the user to import several sets of experiments from different spreadsheets and to perform subsequent steps: summarise in a tidy format, visualise data quality and compare experimental conditions while accounting for technical confounders such as the date of the experiment or the technician. Together, insane is a comprehensive method that optimises pre-processing and analyses of GSIS experiments in a user-friendly interface. The Shiny App was initially designed for the EndoC-betaH1 cell line following the method described in Ndiaye et al., 2017 (<doi:10.1016/j.molmet.2017.03.011>).

Maintained by Mickaël Canouil. Last updated 3 months ago.

beta-cells endoc-betah1 insulin-secretion pancreas shiny statistics stats

3 stars 5.48 score 4 scripts

ropensci

EndoMineR:Functions to mine endoscopic and associated pathology datasets

This package comprises the functions used to clean up endoscopic reports and pathology reports, as well as many of the scripts used for analysis. The scripts assume the endoscopy and histopathology data sets are already merged, but they can also be used with the unmerged datasets.

Maintained by Sebastian Zeki. Last updated 7 months ago.

endoscopy gastroenterology peer-reviewed semi-structured-data text-mining

13 stars 5.47 score 30 scripts

molinlab

Holomics:A User-Friendly R 'shiny' Application for Multi-Omics Data Integration and Analysis

A 'shiny' application that allows you to perform single- and multi-omics analyses using your own omics datasets. After the upload of the omics datasets and a metadata file, single-omics analysis is performed for feature selection and dataset reduction. These datasets are used for pairwise- and multi-omics analyses, where automatic tuning is done to identify correlations between the datasets, which is the end goal of the recommended 'Holomics' workflow. Methods used in the package were implemented in the package 'mixomics' by Florian Rohart, Benoît Gautier, Amrit Singh, Kim-Anh Lê Cao (2017) <doi:10.1371/journal.pcbi.1005752> and are described there in further detail.

Maintained by Katharina Munk. Last updated 10 months ago.

7 stars 5.45 score 7 scripts

pik-piam

mrindustry:input data generation for the REMIND industry module

The mrindustry package contains data preprocessing for the REMIND model.

Maintained by Falk Benke. Last updated 8 days ago.

5.43 score 2 dependents