R-universe search: contributions

Currently serving 26318 packages, 22487 articles, and 64222 datasets by 1261 organizations, 13659 maintainers and 22065 contributors.

Not sure what to search for? Why not try: maps, bayesian, ecology, climate, genome, gam, spatial, database, pdf, shiny, rstudio, machine learning, prediction, birds, fish, sports, ... (more popular topics)

Organizations

vimc

lcbc-uio

stan-dev

pharmaverse

r-spatial

tidyverse

ropengov

rstudio

r-lib

ropensci

bioc

r-forge

kwb-r

pik-piam

hypertidy

poissonconsulting

mrc-ide

tidymodels

pecanproject

insightsengineering

thinkr-open

inbo

mlr-org

ggseg

ohdsi

modeloriented

predictiveecology

paws-r

flr

ropenspain

sciviews

bnosac

openvolley

rmi-pacta

mrcieu

repboxr

nlmixr2

epiverse-trace

yulab-smu

frbcesab

ices-tools-prod

statnet

appsilon

azure

riatelab

bips-hb

mlverse

cloudyr

rjdverse

epiforecasts

tmsalab

openpharma

hubverse-org

usaid-oha-si

usepa

bupaverse

dreamrs

certe-medical-epidemiology

darwin-eu

easystats

ambiorix-web

business-science

merck

coatless-rpkg

hugheylab

rikenbit

r-dbi

uscbiostats

spatstat

bluegreen-labs

nutriverse

rsquaredacademy

ctu-bern

biometris

epicentre-msf

nflverse

ipeagit

ocbe-uio

ifpri

humaniverse

rspatial

apache

terminological

cogdisreslab

data-cleaning

reconhub

gesistsa

quanteda

cynkra

piecepackr

statisticsnorway

kharchenkolab

oxfordihtm

tlverse

idslme

decisionpatterns

Want to learn more about r-universe? Have a look at ropensci.org/r-universe or updates from the rOpenSci blog:

Better documentation for R-universe! (February 28, 2025)
R-Universe Named an R Consortium Top-Level Project (December 3, 2024)
Capturing Screenshots Programmatically With R (September 10, 2024)
Navigating the R ecosystem using R-universe (September 24, 2024)
A fresh new look for R-universe! (June 12, 2024)
R-Universe Documentation Gets a Boost from Google Season of Docs (April 12, 2024)
R-universe now builds MacOS ARM64 binaries for use on Apple Silicon (aka M1/M2/M3) systems (January 14, 2024)
R-universe now builds WASM binaries for all R packages (November 17, 2023)
The rOpenSci Multiverse (November 6, 2023)
CRAN-ial Expansion: Taking Your R Package Development to New Frontiers with R-Universe (September 19, 2023)
Meeting the Stars of the R-Universe: The R-Universe Against Diseases. (September 15, 2023)
My Life with the R-universe (August 1, 2023)
New cran.dev shortlinks to package information and documentation (July 26, 2023)
Meeting the Stars of the R-Universe: PEcAn, an Open Source Project to Take Care of the Planet (June 6, 2023)
Downloading snapshots and creating stable R packages repositories using r-universe (May 31, 2023)
How r-universe searches for packages on CRAN / Bioconductor (April 3, 2023)
Meeting the Stars of the R-Universe: Researching Our Brain with the Magic of the R-Universe (March 30, 2023)
Meeting the Stars of the R-universe: ThinkR's Approach to Contributing to a Growing and Friendly R Community (February 28, 2023)
Discovering and learning everything there is to know about R packages using r-universe (February 27, 2023)
New preferred repo name for r-universe registries (February 7, 2023)
Improved permanent URL schema for r-universe.dev (January 30, 2023)
postdoc 1.0: minimal and uncluttered HTML package manuals (November 29, 2022)
Meeting the stars of the R-universe: R Community, Exchange and Learn (November 23, 2022)
Searching and browsing the R universe (March 23, 2022)
A Blend of Package Build Failures (January 31, 2022)
How renv restores packages from r-universe for reproducibility or production (January 6, 2022)
RSS feeds of package updates in r-universe (November 24, 2021)
How I Test cffr on (about) 2,000 Packages using GitHub Actions and R-universe (November 23, 2021)
Generating and customizing badges in r-universe (October 14, 2021)
rOpenSci docs are now built on r-universe (September 3, 2021)
How to create your personal CRAN-like repository on R-universe (June 22, 2021)
Publishing and browsing articles on R-universe (April 9, 2021)
rOpenSci's R-universe Project (May 25, 2021)
A first look at the R-universe build infrastructure (March 4, 2021)
Moving away from Travis CI (November 19, 2020)
How to precompute package vignettes or pkgdown articles (December 8, 2019)

Showing 200 of 643 results

openbiox

contribution: A Tiny Contribution Table Generator Based on 'ggplot2'

Contribution table for credit assignment based on 'ggplot2'. This can improve the author contribution information in academic journals and personal CV.

Maintained by Shixiang Wang. Last updated 2 years ago.

contribution credit ggplot2 research

74.4 match 11 stars 5.20 score 29 scripts

kjhealy

gssrdoc: Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.

Maintained by Kieran Healy. Last updated 11 months ago.

120.7 match 2.28 score 38 scripts

braverock

PerformanceAnalytics: Econometric Tools for Performance and Risk Analysis

Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.

Maintained by Brian G. Peterson. Last updated 3 months ago.

14.9 match 222 stars 15.93 score 4.8k scripts 20 dependents

arnaudgallou

plume: A Simple Author Handler for Scientific Writing

Handles and formats author information in scientific writing in 'R Markdown' and 'Quarto'. 'plume' provides easy-to-use and flexible tools for injecting author metadata in 'YAML' headers as well as generating author and contribution lists (among others) as strings from tabular data.

Maintained by Arnaud Gallou. Last updated 30 days ago.

authors contribution contributions list lists markdown paper preprint quarto role roles

22.5 match 21 stars 6.84 score 15 scripts

nicholasjclark

mvgam: Multivariate (Dynamic) Generalized Additive Models

Fit Bayesian Dynamic Generalized Additive Models to multivariate observations. Users can build nonlinear State-Space models that can incorporate semiparametric effects in observation and process components, using a wide range of observation families. Estimation is performed using Markov Chain Monte Carlo with Hamiltonian Monte Carlo in the software 'Stan'. References: Clark & Wells (2023) <doi:10.1111/2041-210X.13974>.

Maintained by Nicholas J Clark. Last updated 1 day ago.

bayesian-statistics dynamic-factor-models ecological-modelling forecasting gaussian-process generalised-additive-models generalized-additive-models joint-species-distribution-modelling multilevel-models multivariate-timeseries stan time-series-analysis timeseries vector-autoregression vectorautoregression cpp

12.3 match 139 stars 9.85 score 117 scripts

bioc

SPIAT: Spatial Image Analysis of Tissues

SPIAT (**Sp**atial **I**mage **A**nalysis of **T**issues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.

Maintained by Yuzhou Feng. Last updated 2 days ago.

biomedicalinformatics cellbiology spatial clustering dataimport immunooncology qualitycontrol singlecell software visualization

12.0 match 22 stars 8.59 score 69 scripts

marberts

piar: Price Index Aggregation

Most price indexes are made with a two-step procedure, where period-over-period elemental indexes are first calculated for a collection of elemental aggregates at each point in time, and then aggregated according to a price index aggregation structure. These indexes can then be chained together to form a time series that gives the evolution of prices with respect to a fixed base period. This package contains a collection of functions that revolve around this work flow, making it easy to build standard price indexes, and implement the methods described by Balk (2008, <doi:10.1017/CBO9780511720758>), von der Lippe (2007, <doi:10.3726/978-3-653-01120-3>), and the CPI manual (2020, <doi:10.5089/9781484354841.069>) for bilateral price indexes.

Maintained by Steve Martin. Last updated 14 days ago.

economics inflation official-statistics statistics

11.4 match 4 stars 7.32 score 25 scripts

ropensci

git2r: Provides Access to Git Repositories

Interface to the 'libgit2' library, which is a pure C implementation of the 'Git' core methods. Provides access to 'Git' repositories to extract data and run some basic 'Git' commands.

Maintained by Stefan Widgren. Last updated 11 days ago.

git git-client libgit2 libgit2-library

6.0 match 218 stars 13.86 score 836 scripts 49 dependents

gbradburd

conStruct: Models Spatially Continuous and Discrete Population Genetic Structure

A method for modeling genetic data as a combination of discrete layers, within each of which relatedness may decay continuously with geographic distance. This package contains code for running analyses (which are implemented in the modeling language 'rstan') and visualizing and interpreting output. See the paper for more details on the model and its utility.

Maintained by Gideon Bradburd. Last updated 1 year ago.

cpp

8.0 match 35 stars 8.39 score 70 scripts

bioc

MutationalPatterns: Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

9.1 match 7.27 score 251 scripts 1 dependent

cran

BAT: Biodiversity Assessment Tools

Includes algorithms to assess alpha and beta diversity in all their dimensions (taxonomic, phylogenetic and functional). It allows performing a number of analyses based on species identities/abundances, phylogenetic/functional distances, trees, convex-hulls or kernel density n-dimensional hypervolumes depicting species relationships. Cardoso et al. (2015) <doi:10.1111/2041-210X.12310>.

Maintained by Pedro Cardoso. Last updated 1 year ago.

20.5 match 3.17 score 3 dependents

vegandevs

vegan: Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 16 days ago.

ecological-modelling ecology ordination fortran openblas

3.3 match 472 stars 19.41 score 15k scripts 440 dependents

trivialfis

xgboost: Extreme Gradient Boosting

Extreme Gradient Boosting, which is an efficient implementation of the gradient boosting framework from Chen & Guestrin (2016) <doi:10.1145/2939672.2939785>. This package is its R interface. The package includes an efficient linear model solver and tree learning algorithms. The package can automatically do parallel computation on a single machine which could be more than 10 times faster than existing gradient boosting packages. It supports various objective functions, including regression, classification and ranking. The package is made to be extensible, so that users are also allowed to define their own objectives easily.

Maintained by Jiaming Yuan. Last updated 8 months ago.

cpp openmp

5.4 match 6 stars 11.70 score 13k scripts 112 dependents

baumer-lab

fec16: Data Package for the 2016 United States Federal Elections

Easily analyze relational data from the United States 2016 federal election cycle as reported by the Federal Election Commission. This package contains data about candidates, committees, and a variety of different financial expenditures. Data is from <https://www.fec.gov/data/browse-data/?tab=bulk-data>.

Maintained by Marium Tapal. Last updated 2 years ago.

11.6 match 2 stars 5.15 score 47 scripts

tomasfryda

h2o: R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

7.3 match 3 stars 8.20 score 7.8k scripts 11 dependents

nicolas-robette

GDAtools: Geometric Data Analysis

Many tools for Geometric Data Analysis (Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0>), such as MCA variants (Specific Multiple Correspondence Analysis, Class Specific Analysis), many graphical and statistical aids to interpretation (structuring factors, concentration ellipses, inductive tests, bootstrap validation, etc.) and multiple-table analysis (Multiple Factor Analysis, between- and inter-class analysis, Principal Component Analysis and Correspondence Analysis with Instrumental Variables, etc.).

Maintained by Nicolas Robette. Last updated 10 months ago.

10.1 match 10 stars 5.93 score 94 scripts 2 dependents

ropensci

charlatan: Make Fake Data

Make fake data that looks realistic, supporting addresses, person names, dates, times, colors, coordinates, currencies, digital object identifiers ('DOIs'), jobs, phone numbers, 'DNA' sequences, doubles and integers from distributions and within a range.

Maintained by Roel M. Hogervorst. Last updated 1 month ago.

data dataset fake-data faker peer-reviewed

5.8 match 296 stars 10.06 score 180 scripts 1 dependent

palaeoverse

palaeoverse: Prepare and Explore Data for Palaeobiological Analyses

Provides functionality to support data preparation and exploration for palaeobiological analyses, improving code reproducibility and accessibility. The wider aim of 'palaeoverse' is to bring the palaeobiological community together to establish agreed standards. The package currently includes functionality for data cleaning, binning (time and space), exploration, summarisation and visualisation. Reference datasets (i.e. Geological Time Scales <https://stratigraphy.org/chart>) and auxiliary functions are also provided. Details can be found in: Jones et al. (2023) <doi:10.1111/2041-210X.14099>.

Maintained by Lewis A. Jones. Last updated 5 months ago.

biodiversity fossil palaeobiology paleobiology

6.6 match 21 stars 8.57 score 44 scripts 1 dependent

bioc

DropletUtils: Utilities for Handling Single-Cell Droplet Data

Provides a number of utility functions for handling single-cell (RNA-seq) data from droplet technologies such as 10X Genomics. This includes data loading from count matrices or molecule information files, identification of cells from empty droplets, removal of barcode-swapped pseudo-cells, and downsampling of the count matrix.

Maintained by Jonathan Griffiths. Last updated 3 months ago.

immunooncology singlecell sequencing rnaseq geneexpression transcriptomics dataimport coverage zlib cpp

5.6 match 10.08 score 2.7k scripts 9 dependents

rsquaredacademy

olsrr: Tools for Building OLS Regression Models

Tools designed to make it easier for users, particularly beginner/intermediate R users, to build ordinary least squares regression models. Includes comprehensive regression output, heteroskedasticity tests, collinearity diagnostics, residual diagnostics, measures of influence, model fit assessment and variable selection procedures.

Maintained by Aravind Hebbali. Last updated 4 months ago.

collinearity-diagnostics linear-models regression stepwise-regression

4.4 match 103 stars 12.19 score 1.4k scripts 4 dependents

best-practice-and-impact

aftables: Create Spreadsheet Publications Following Best Practice

Generate spreadsheet publications that follow best practice guidance from the UK government's Analysis Function, available at <https://analysisfunction.civilservice.gov.uk/policy-store/releasing-statistics-in-spreadsheets/>, with a focus on accessibility. See also the 'Python' package 'gptables'.

Maintained by Olivia Box Power. Last updated 24 days ago.

accessibility openxlsx reproducible-analytical-pipeline spreadsheet uk-gov-data-science

8.0 match 44 stars 6.72 score 4 scripts

rsoc

soc.ca: Specific Correspondence Analysis for the Social Sciences

Specific and class specific multiple correspondence analysis on survey-like data. Soc.ca is optimized to the needs of the social scientist and presents easily interpretable results in near publication ready quality.

Maintained by Anton Grau Larsen. Last updated 1 year ago.

12.8 match 14 stars 4.15 score 50 scripts

cran

optiSel: Optimum Contribution Selection and Population Genetics

A framework for the optimization of breeding programs via optimum contribution selection and mate allocation. An easy-to-use set of functions for computing optimum contributions of selection candidates, and the population genetic parameters to be optimized. These parameters can be estimated using pedigree or genotype information, and include kinships, kinships at native haplotype segments, and breed composition of crossbred individuals. They are suitable for managing genetic diversity, removing introgressed genetic material, and accelerating genetic gain. Additionally, functions are provided for computing genetic contributions from ancestors, inbreeding coefficients, the native effective size, the native genome equivalent, pedigree completeness, and for preparing and plotting pedigrees. The methods are described in: Wellmann, R., and Pfeiffer, I. (2009) <doi:10.1017/S0016672309000202>; Wellmann, R., and Bennewitz, J. (2011) <doi:10.2527/jas.2010-3709>; Wellmann, R., Hartwig, S., Bennewitz, J. (2012) <doi:10.1186/1297-9686-44-34>; de Cara, M. A. R., Villanueva, B., Toro, M. A., Fernandez, J. (2013) <doi:10.1111/mec.12560>; Wellmann, R., Bennewitz, J., Meuwissen, T.H.E. (2014) <doi:10.1017/S0016672314000196>; Wellmann, R. (2019) <doi:10.1186/s12859-018-2450-5>.

Maintained by Robin Wellmann. Last updated 8 months ago.

cpp

14.6 match 1 star 3.62 score

nschiett

fishualize: Color Palettes Based on Fish Species

Implementation of color palettes based on fish species.

Maintained by Nina M. D. Schiettekatte. Last updated 11 months ago.

6.2 match 155 stars 8.54 score 370 scripts

bioc

tidytof: Analyze High-dimensional Cytometry Data Using Tidy Data Principles

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

Maintained by Timothy Keyes. Last updated 5 months ago.

singlecell flowcytometry bioinformatics cytometry data-science single-cell tidyverse cpp

6.9 match 19 stars 7.26 score 35 scripts

robustport

facmodCS: Cross-Section Factor Models

Linear cross-section factor model fitting with least-squares and robust fitting using the 'lmrobdetMM()' function from 'RobStatTM'; related volatility, Value at Risk and Expected Shortfall risk and performance attribution (factor-contributed vs idiosyncratic returns); tabular displays of risk and performance reports; factor model Monte Carlo. The package authors would like to thank Chicago Research on Security Prices, LLC for the cross-section of about 300 CRSP stocks data (in the data.table object 'stocksCRSP'), and S&P GLOBAL MARKET INTELLIGENCE for contributing 14 factor scores (a.k.a. "alpha factors" and "factor exposures") fundamental data on the 300 companies in the data.table object 'factorSPGMI'. The 'stocksCRSP' and 'factorsSPGMI' data are not covered by the GPL-2 license, are not provided as open source of any kind, and they are not to be redistributed in any form.

Maintained by Mido Shammaa. Last updated 1 year ago.

14.8 match 3.18 score 2 scripts

epiverse-trace

epiparameter: Classes and Helper Functions for Working with Epidemiological Parameters

Classes and helper functions for loading, extracting, converting, manipulating, plotting and aggregating epidemiological parameters for infectious diseases. Epidemiological parameters extracted from the literature are loaded from the 'epiparameterDB' R package.

Maintained by Joshua W. Lambert. Last updated 2 months ago.

data-access data-package epidemiology epiverse probability-distribution

4.8 match 33 stars 9.84 score 102 scripts 1 dependent

klmr

box: Write Reusable, Composable and Modular R Code

A modern module system for R. Organise code into hierarchical, composable, reusable modules, and use it effortlessly across projects via a flexible, declarative dependency loading syntax.

Maintained by Konrad Rudolph. Last updated 12 days ago.

modules packages

3.8 match 888 stars 12.39 score 47 scripts 4 dependents

lleisong

itsdm: Isolation Forest-Based Presence-Only Species Distribution Modeling

Collection of R functions to do purely presence-only species distribution modeling with isolation forest (iForest) and its variations such as Extended isolation forest and SCiForest. See the details of these methods in references: Liu, F.T., Ting, K.M. and Zhou, Z.H. (2008) <doi:10.1109/ICDM.2008.17>, Hariri, S., Kind, M.C. and Brunner, R.J. (2019) <doi:10.1109/TKDE.2019.2947676>, Liu, F.T., Ting, K.M. and Zhou, Z.H. (2010) <doi:10.1007/978-3-642-15883-4_18>, Guha, S., Mishra, N., Roy, G. and Schrijvers, O. (2016) <https://proceedings.mlr.press/v48/guha16.html>, Cortes, D. (2021) <arXiv:2110.13402>. Additionally, Shapley values are used to explain model inputs and outputs. See details in references: Shapley, L.S. (1953) <doi:10.1515/9781400881970-018>, Lundberg, S.M. and Lee, S.I. (2017) <https://dl.acm.org/doi/abs/10.5555/3295222.3295230>, Molnar, C. (2020) <ISBN:978-0-244-76852-2>, Štrumbelj, E. and Kononenko, I. (2014) <doi:10.1007/s10115-013-0679-x>. itsdm also provides functions to diagnose variable response, analyze variable importance, draw spatial dependence of variables and examine variable contribution. As utilities, the package includes a few functions to download bioclimatic variables including 'WorldClim' version 2.0 (see Fick, S.E. and Hijmans, R.J. (2017) <doi:10.1002/joc.5086>) and 'CMCC-BioClimInd' (see Noce, S., Caporaso, L. and Santini, M. (2020) <doi:10.1038/s41597-020-00726-5>).

Maintained by Lei Song. Last updated 2 years ago.

isolation-forest outlier-detection presence-onlymodel shapley-value species-distribution-modelling

8.0 match 4 stars 5.59 score 65 scripts

vsousa

poolHelper: Simulates Pooled Sequencing Genetic Data

Simulates pooled sequencing data under a variety of conditions. Also allows for the evaluation of the average absolute difference between allele frequencies computed from genotypes and those computed from pooled data. Carvalho et al., (2022) <doi:10.1101/2023.01.20.524733>.

Maintained by João Carvalho. Last updated 2 years ago.

10.7 match 4.18 score 3 scripts 1 dependent

rkoenker

quantreg: Quantile Regression

Estimation and inference methods for models for conditional quantile functions: Linear and nonlinear parametric and non-parametric (total variation penalized) models for conditional quantiles of a univariate response and several methods for handling censored survival data. Portfolio selection methods based on expected shortfall risk are also now included. See Koenker, R. (2005) Quantile Regression, Cambridge U. Press, <doi:10.1017/CBO9780511754098> and Koenker, R. et al. (2017) Handbook of Quantile Regression, CRC Press, <doi:10.1201/9781315120256>.

Maintained by Roger Koenker. Last updated 6 days ago.

fortran openblas

3.2 match 18 stars 13.93 score 2.6k scripts 1.5k dependents

centerforassessment

SGP: Student Growth Percentiles & Percentile Growth Trajectories

An analytic framework for the calculation of norm- and criterion-referenced academic growth estimates using large scale, longitudinal education assessment data as developed in Betebenner (2009) <doi:10.1111/j.1745-3992.2009.00161.x>.

Maintained by Damian W. Betebenner. Last updated 2 months ago.

percentile-growth-projections quantile-regression sgp sgp-analyses student-growth-percentiles student-growth-projections

4.5 match 20 stars 9.69 score 88 scripts

blasbenito

distantia: Advanced Toolset for Efficient Time Series Dissimilarity Analysis

Fast C++ implementation of Dynamic Time Warping for time series dissimilarity analysis, with applications in environmental monitoring and sensor data analysis, climate science, signal processing and pattern recognition, and financial data analysis. Built upon the ideas presented in Benito and Birks (2020) <doi:10.1111/ecog.04895>, provides tools for analyzing time series of varying lengths and structures, including irregular multivariate time series. Key features include individual variable contribution analysis, restricted permutation tests for statistical significance, and imputation of missing data via GAMs. Additionally, the package provides an ample set of tools to prepare and manage time series data.

Maintained by Blas M. Benito. Last updated 25 days ago.

7.2 match 23 stars 5.76 score 11 scripts

giscience

ohsome: An 'ohsome API' Client

A client that grants access to the power of the 'ohsome API' from R. It lets you analyze the rich data source of the 'OpenStreetMap (OSM)' history. You can retrieve the geometry of 'OSM' data at specific points in time, and you can get aggregated statistics on the evolution of 'OSM' elements and specify your own temporal, spatial and/or thematic filters.

Maintained by Oliver Fritz. Last updated 2 years ago.

heigit ohsome openstreetmap openstreetmap-data openstreetmap-history osm osm-data

8.2 match 11 stars 5.04 score 9 scripts

pharmaverse

datacutr: SDTM Datacut

Supports the process of applying a cut to Standard Data Tabulation Model (SDTM), as part of the analysis of specific points in time of the data, normally as part of investigation into clinical trials. The functions support different approaches of cutting to the different domains of SDTM normally observed.

Maintained by Tim Barnett. Last updated 1 month ago.

5.3 match 14 stars 7.48 score 11 scripts

gavinmdouglas

FuncDiv: Compute Contributional Diversity Metrics

Compute alpha and beta contributional diversity metrics, which are intended for linking taxonomic and functional microbiome data. See 'GitHub' repository for the tutorial: <https://github.com/gavinmdouglas/FuncDiv/wiki>. Citation: Gavin M. Douglas, Sunu Kim, Morgan G. I. Langille, B. Jesse Shapiro (2023) <doi:10.1093/bioinformatics/btac809>.

Maintained by Gavin Douglas. Last updated 2 years ago.

cpp

10.5 match 10 stars 3.70 score 1 script

inlabru-org

inlabru: Bayesian Latent Gaussian Modelling using INLA and Extensions

Facilitates spatial and general latent Gaussian modeling using integrated nested Laplace approximation via the INLA package (<https://www.r-inla.org>). Additionally, extends the GAM-like model class to more general nonlinear predictor expressions, and implements a log Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data. Model components are specified with general inputs and mapping methods to the latent variables, and the predictors are specified via general R expressions, with separate expressions for each observation likelihood model in multi-likelihood models. A prediction method based on fast Monte Carlo sampling allows posterior prediction of general expressions of the latent variables. Ecology-focused introduction in Bachl, Lindgren, Borchers, and Illian (2019) <doi:10.1111/2041-210X.13168>.

Maintained by Finn Lindgren. Last updated 3 days ago.

3.0 match 96 stars 12.62 score 832 scripts 6 dependents

ecospat

ecospat: Spatial Ecology Miscellaneous Methods

Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.

Maintained by Olivier Broennimann. Last updated 1 month ago.

4.0 match 32 stars 9.35 score 418 scripts 1 dependent

biodiverse

unmarked: Models for Data from Unmarked Animals

Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.

Maintained by Ken Kellner. Last updated 12 hours ago.

openblas cpp openmp

2.9 match 4 stars 13.03 score 652 scripts 12 dependents

rudeboybert

fivethirtyeight: Data and Code Behind the Stories and Interactives at 'FiveThirtyEight'

Datasets and code published by the data journalism website 'FiveThirtyEight' available at <https://github.com/fivethirtyeight/data>. Note that while we received guidance from editors at 'FiveThirtyEight', this package is not officially published by 'FiveThirtyEight'.

Maintained by Albert Y. Kim. Last updated 2 years ago.

data-science datajournalism fivethirtyeight statistics

3.3 match 453 stars 10.98 score 1.7k scripts

bioc

ChromSCape: Analysis of single-cell epigenomics datasets with a Shiny App

ChromSCape - Chromatin landscape profiling for Single Cells - is a ready-to-launch user-friendly Shiny Application for the analysis of single-cell epigenomics datasets (scChIP-seq, scATAC-seq, scCUT&Tag, ...) from aligned data to differential analysis & gene set enrichment analysis. It is highly interactive, enables users to save their analysis and covers a wide range of analytical steps: QC, preprocessing, filtering, batch correction, dimensionality reduction, visualization, clustering, differential analysis and gene set analysis.

Maintained by Pacome Prompsy. Last updated 5 months ago.

shinyapps software singlecell chipseq atacseq methylseq classification clustering epigenetics principalcomponent annotation batcheffect multiplecomparison normalization pathways preprocessing qualitycontrol reportwriting visualization genesetenrichment differentialpeakcalling epigenomics shiny single-cell cpp

6.2 match 14 stars 5.83 score 16 scripts

jcrodriguez1989

rco:The R Code Optimizer

Automatically apply different strategies to optimize R code. 'rco' functions take R code as input and return R code as output.

Maintained by Juan Cruz Rodriguez. Last updated 4 months ago.

compiler fast gcc hpc optimization optimizer

5.3 match 82 stars 6.73 score

dgerlanc

portfolio:Analysing Equity Portfolios

Classes for analysing and implementing equity portfolios, including routines for generating tradelists and calculating exposures to user-specified risk factors.

Maintained by Daniel Gerlanc. Last updated 7 months ago.

finance portfolio-construction risk-modelling

5.3 match 15 stars 6.68 score 106 scripts

bioc

limpca:An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods

This package provides a method for fitting linear models to high-dimensional designed data. limpca applies a GLM (General Linear Model) version of ASCA and APCA to analyse multivariate sample profiles generated by an experimental design. ASCA/APCA provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design and, contrary to MANOVA, can deal with multivariate datasets having more variables than observations. This method can handle unbalanced designs.

Maintained by Manon Martin. Last updated 5 months ago.

statisticalmethod principalcomponent regression visualization experimentaldesign multiplecomparison geneexpression metabolomics

6.0 match 2 stars 5.73 score 2 scripts

moderndive

moderndive:Tidyverse-Friendly Introductory Linear Regression

Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.

Maintained by Albert Y. Kim. Last updated 3 months ago.

3.0 match 88 stars 11.35 score 1.8k scripts

r-lib

lintr:A 'Linter' for R Code

Checks adherence to a given style, syntax errors and possible semantic issues. Supports on-the-fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.

Maintained by Michael Chirico. Last updated 8 days ago.

linter

2.0 match 1.2k stars 17.00 score 916 scripts 33 dependents

r-lidar

lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications

Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.

Maintained by Jean-Romain Roussel. Last updated 1 month ago.

als forestry las laz lidar point-cloud remote-sensing openblas cpp openmp

2.3 match 623 stars 14.47 score 844 scripts 8 dependents

marberts

gpindex:Generalized Price and Quantity Indexes

Tools to build and work with bilateral generalized-mean price indexes (and by extension quantity indexes), and indexes composed of generalized-mean indexes (e.g., superlative quadratic-mean indexes, GEKS). Covers the core mathematical machinery for making bilateral price indexes, computing price relatives, detecting outliers, and decomposing indexes, with wrappers for all common (and many uncommon) index-number formulas. Implements and extends many of the methods in Balk (2008, <doi:10.1017/CBO9780511720758>), von der Lippe (2007, <doi:10.3726/978-3-653-01120-3>), and the CPI manual (2020, <doi:10.5089/9781484354841.069>).
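As a rough illustration of what a bilateral generalized-mean price index is, the following Python sketch computes the weighted generalized mean of order r of a set of price relatives. It is not the 'gpindex' API: the function name `generalized_mean`, the prices and the quantities are all hypothetical, and base-period expenditure weights are assumed.

```python
import math

def generalized_mean(relatives, weights, r):
    """Weighted generalized mean of order r (hypothetical helper, not gpindex)."""
    total = sum(weights)
    shares = [w / total for w in weights]  # normalize weights to sum to 1
    if r == 0:
        # The order-0 case is the limit r -> 0: the weighted geometric mean.
        return math.exp(sum(s * math.log(x) for s, x in zip(shares, relatives)))
    return sum(s * x ** r for s, x in zip(shares, relatives)) ** (1 / r)

# Made-up data for three goods.
p0 = [1.0, 2.0, 4.0]   # base-period prices
p1 = [1.1, 2.0, 5.0]   # current-period prices
q0 = [10, 5, 2]        # base-period quantities

relatives = [b / a for a, b in zip(p0, p1)]   # price relatives p1 / p0
weights = [p * q for p, q in zip(p0, q0)]     # base-period expenditures

arithmetic = generalized_mean(relatives, weights, 1)   # Laspeyres-type index
geometric = generalized_mean(relatives, weights, 0)
harmonic = generalized_mean(relatives, weights, -1)
```

With base-period expenditure shares as weights, the order-1 case reduces to the familiar Laspeyres ratio of current to base cost of the base-period basket, and lower orders give smaller or equal index values.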

Maintained by Steve Martin. Last updated 2 days ago.

economics inflation official-statistics statistics

5.0 match 7 stars 6.63 score 29 scripts 1 dependent

mayer79

flashlight:Shed Light on Black Box Machine Learning Models

Shed light on black box machine learning models with the help of model performance, variable importance, global surrogate models, ICE profiles, partial dependence (Friedman J. H. (2001) <doi:10.1214/aos/1013203451>), accumulated local effects (Apley D. W. (2016) <arXiv:1612.08468>), further effects plots, interaction strength, and variable contribution breakdown (Gosiewska and Biecek (2019) <arXiv:1903.11420>). All tools are implemented to work with case weights and allow for stratified analysis. Furthermore, multiple flashlights can be combined and analyzed together.

Maintained by Michael Mayer. Last updated 8 months ago.

interpretability interpretable-machine-learning machine-learning xai

5.3 match 22 stars 6.25 score 54 scripts 1 dependent

adrientaudiere

MiscMetabar:Miscellaneous Functions for Metabarcoding Analysis

Facilitate the description, transformation, exploration, and reproducibility of metabarcoding analyses. 'MiscMetabar' is mainly built on top of the 'phyloseq', 'dada2' and 'targets' R packages. It helps to build reproducible and robust bioinformatics pipelines in R. 'MiscMetabar' makes ecological analysis of alpha and beta-diversity easier, more reproducible and more powerful by integrating a large number of tools. Important features are described in Taudière A. (2023) <doi:10.21105/joss.06038>.

Maintained by Adrien Taudière. Last updated 25 days ago.

sequencing microbiome metagenomics clustering classification visualization amplicon amplicon-sequencing biodiversity-informatics ecology illumina metabarcoding ngs-analysis

5.1 match 17 stars 6.44 score 23 scripts

dietrichson

ProPublicaR:Access Functions for ProPublica's APIs

Provides wrapper functions to access ProPublica's Congress and Campaign Finance APIs. The Congress API provides near real-time access to legislative data from the House of Representatives, the Senate and the Library of Congress. The Campaign Finance API provides data from United States Federal Election Commission filings and other sources. The API covers summary information for candidates and committees, as well as certain types of itemized data. For more information about these APIs go to: <https://www.propublica.org/datastore/apis>.

Maintained by Aleksander Dietrichson. Last updated 2 years ago.

7.4 match 12 stars 4.38 score 1 script

nsaph-software

CRE:Interpretable Discovery and Inference of Heterogeneous Treatment Effects

Provides a new method for interpretable heterogeneous treatment effects characterization in terms of decision rules via an extensive exploration of heterogeneity patterns by an ensemble-of-trees approach, enforcing high stability in the discovery. It relies on a two-stage pseudo-outcome regression, and it is supported by theoretical convergence guarantees. Bargagli-Stoffi, F. J., Cadei, R., Lee, K., & Dominici, F. (2023) Causal rule ensemble: Interpretable Discovery and Inference of Heterogeneous Treatment Effects. arXiv preprint <doi:10.48550/arXiv.2009.09036>.

Maintained by Falco Joannes Bargagli Stoffi. Last updated 5 months ago.

5.0 match 13 stars 6.41 score 11 scripts

ropensci

rix:Reproducible Data Science Environments with 'Nix'

Simplifies the creation of reproducible data science environments using the 'Nix' package manager, as described in Dolstra (2006) <ISBN 90-393-4130-3>. The included `rix()` function generates a complete description of the environment as a `default.nix` file, which can then be built using 'Nix'. This results in project specific software environments with pinned versions of R, packages, linked system dependencies, and other tools. Additional helpers make it easy to run R code in 'Nix' software environments for testing and production.

Maintained by Bruno Rodrigues. Last updated 4 days ago.

nix peer-reviewed reproducibility reproducible-research

3.0 match 235 stars 10.54 score 67 scripts

gabrielodom

mvMonitoring:Multi-State Adaptive Dynamic Principal Component Analysis for Multivariate Process Monitoring

Use multi-state splitting to apply Adaptive-Dynamic PCA (ADPCA) to data generated from a continuous-time multivariate industrial or natural process. Employ PCA-based dimension reduction to extract linear combinations of relevant features, reducing computational burdens. For a description of ADPCA, see <doi:10.1007/s00477-016-1246-2>, the 2016 paper from Kazor et al. The multi-state application of ADPCA is from a manuscript under current revision entitled "Multi-State Multivariate Statistical Process Control" by Odom, Newhart, Cath, and Hering, and is expected to appear in Q1 of 2018.

Maintained by Gabriel Odom. Last updated 1 year ago.

5.9 match 4 stars 5.24 score 29 scripts

rstudio

tfprobability:Interface to 'TensorFlow Probability'

Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

3.5 match 54 stars 8.63 score 221 scripts 3 dependents

jameslamb

lightgbm:Light Gradient Boosting Machine

Tree based algorithms can be improved by introducing boosting frameworks. 'LightGBM' is one such framework, based on Ke, Guolin et al. (2017) <https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision>. This package offers an R interface to work with it. It is designed to be distributed and efficient with the following advantages: 1. Faster training speed and higher efficiency. 2. Lower memory usage. 3. Better accuracy. 4. Parallel learning supported. 5. Capable of handling large-scale data. In recognition of these advantages, 'LightGBM' has been widely-used in many winning solutions of machine learning competitions. Comparison experiments on public datasets suggest that 'LightGBM' can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. In addition, parallel experiments suggest that in certain circumstances, 'LightGBM' can achieve a linear speed-up in training time by using multiple machines.

Maintained by James Lamb. Last updated 30 days ago.

cpp openmp

3.7 match 1 star 8.30 score 1.6k scripts 6 dependents

r-lib

vctrs:Vector Helpers

Defines new notions of prototype and size that are used to provide tools for consistent and well-founded type-coercion and size-recycling, and are in turn connected to ideas of type- and size-stability useful for analysing function interfaces.

Maintained by Davis Vaughan. Last updated 5 months ago.

s3-vectors

1.6 match 290 stars 18.97 score 1.1k scripts 13k dependents

bioc

GSVA:Gene Set Variation Analysis for Microarray and RNA-Seq Data

Gene Set Variation Analysis (GSVA) is a non-parametric, unsupervised method for estimating variation of gene set enrichment through the samples of an expression data set. GSVA performs a change in coordinate systems, transforming the data from a gene by sample matrix to a gene-set by sample matrix, thereby allowing the evaluation of pathway enrichment for each sample. This new matrix of GSVA enrichment scores facilitates applying standard analytical methods like functional enrichment, survival analysis, clustering, CNV-pathway analysis or cross-tissue pathway analysis, in a pathway-centric manner.

Maintained by Robert Castelo. Last updated 4 days ago.

functionalgenomics microarray rnaseq pathways genesetenrichment gene-set-enrichment genomics pathway-enrichment-analysis

2.0 match 210 stars 14.72 score 1.6k scripts 19 dependents

bioc

iSEEhub:iSEE for the Bioconductor ExperimentHub

This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

dataimport immunooncology infrastructure shinyapps singlecell software bioconductor bioconductor-package hacktoberfest isee

5.3 match 3 stars 5.56 score 4 scripts

samhforbes

eyetrackingR:Eye-Tracking Data Analysis

Addresses tasks along the pipeline from raw data to analysis and visualization for eye-tracking data. Offers several popular types of analyses, including linear and growth curve time analyses, onset-contingent reaction time analyses, as well as several non-parametric bootstrapping approaches. For references to the approach see Mirman, Dixon & Magnuson (2008) <doi:10.1016/j.jml.2007.11.006>, and Barr (2008) <doi:10.1016/j.jml.2007.09.002>.

Maintained by Samuel Forbes. Last updated 1 year ago.

3.6 match 22 stars 7.84 score 60 scripts

svkucheryavski

mdatools:Multivariate Data Analysis for Chemometrics

Projection-based methods for preprocessing, exploration and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.

Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.

3.9 match 35 stars 7.37 score 220 scripts 1 dependent

quantsulting

ghyp:Generalized Hyperbolic Distribution and Its Special Cases

Detailed functionality for working with the univariate and multivariate Generalized Hyperbolic distribution and its special cases (Hyperbolic (hyp), Normal Inverse Gaussian (NIG), Variance Gamma (VG), skewed Student-t and Gaussian distribution). In particular, it contains fitting procedures, an AIC-based model selection routine, and functions for the computation of density, quantile, probability, random variates, expected shortfall and some portfolio optimization and plotting routines as well as the likelihood ratio test. In addition, it contains the Generalized Inverse Gaussian distribution. See Chapter 3 of A. J. McNeil, R. Frey, and P. Embrechts. Quantitative risk management: Concepts, techniques and tools. Princeton University Press, Princeton (2005).

Maintained by Marc Weibel. Last updated 7 months ago.

5.0 match 5.58 score 90 scripts 8 dependents

harrelfe

Hmisc:Harrell Miscellaneous

Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.

Maintained by Frank E Harrell Jr. Last updated 2 days ago.

fortran

1.6 match 210 stars 17.61 score 17k scripts 750 dependents

kof-ch

tstools:A Time Series Toolbox for Official Statistics

Plot official statistics' time series conveniently: automatic legends, highlight windows, stacked bar charts with positive and negative contributions, sum-as-line option, two y-axes with automatic horizontal grids that fit both axes and other popular chart types. 'tstools' comes with a plethora of defaults to let you plot without setting an abundance of parameters first, but gives you the flexibility to tweak the defaults. In addition to charts, 'tstools' provides a super fast, 'data.table' backed time series I/O that allows the user to export / import long format, wide format and transposed wide format data to various file types.

Maintained by Stéphane Bisinger. Last updated 1 year ago.

4.3 match 11 stars 6.47 score 177 scripts

guido-s

netmeta:Network Meta-Analysis using Frequentist Methods

A comprehensive set of functions providing frequentist methods for network meta-analysis (Balduzzi et al., 2023) <doi:10.18637/jss.v106.i02> and supporting Schwarzer et al. (2015) <doi:10.1007/978-3-319-21416-0>, Chapter 8 "Network Meta-Analysis":
- frequentist network meta-analysis following Rücker (2012) <doi:10.1002/jrsm.1058>;
- additive network meta-analysis for combinations of treatments (Rücker et al., 2020) <doi:10.1002/bimj.201800167>;
- network meta-analysis of binary data using the Mantel-Haenszel or non-central hypergeometric distribution method (Efthimiou et al., 2019) <doi:10.1002/sim.8158>, or penalised logistic regression (Evrenoglou et al., 2022) <doi:10.1002/sim.9562>;
- rankograms and ranking of treatments by the Surface under the cumulative ranking curve (SUCRA) (Salanti et al., 2013) <doi:10.1016/j.jclinepi.2010.03.016>;
- ranking of treatments using P-scores (frequentist analogue of SUCRAs without resampling) according to Rücker & Schwarzer (2015) <doi:10.1186/s12874-015-0060-8>;
- split direct and indirect evidence to check consistency (Dias et al., 2010) <doi:10.1002/sim.3767>, (Efthimiou et al., 2019) <doi:10.1002/sim.8158>;
- league table with network meta-analysis results;
- 'comparison-adjusted' funnel plot (Chaimani & Salanti, 2012) <doi:10.1002/jrsm.57>;
- net heat plot and design-based decomposition of Cochran's Q according to Krahn et al. (2013) <doi:10.1186/1471-2288-13-35>;
- measures characterizing the flow of evidence between two treatments by König et al. (2013) <doi:10.1002/sim.6001>;
- automated drawing of network graphs described in Rücker & Schwarzer (2016) <doi:10.1002/jrsm.1143>;
- partial order of treatment rankings ('poset') and Hasse diagram for 'poset' (Carlsen & Bruggemann, 2014) <doi:10.1002/cem.2569>; (Rücker & Schwarzer, 2017) <doi:10.1002/jrsm.1270>;
- contribution matrix as described in Papakonstantinou et al. (2018) <doi:10.12688/f1000research.14770.3> and Davies et al. (2022) <doi:10.1002/sim.9346>;
- subgroup network meta-analysis.

Maintained by Guido Schwarzer. Last updated 1 day ago.

meta-analysis network-meta-analysis rstudio

2.3 match 33 stars 11.82 score 199 scripts 10 dependents

frbcesab

rcompendium:Create a Package or Research Compendium Structure

Makes it easier to create an R package or research compendium (i.e. a predefined files/folders structure) so that users can focus on the code/analysis instead of wasting time organizing files. A full ready-to-work structure is set up with some additional features: version control, remote repository creation, CI/CD configuration (check package integrity under several OS, test code with 'testthat', and build and deploy website using 'pkgdown'). This package heavily relies on the R packages 'devtools' and 'usethis' and follows recommendations made by Wickham H. (2015) <ISBN:9781491910597> and Marwick B. et al. (2018) <doi:10.7287/peerj.preprints.3192v2>.

Maintained by Nicolas Casajus. Last updated 1 month ago.

reproducible-research research-compendium

4.0 match 40 stars 6.72 score 22 scripts

safetygraphics

safetyGraphics:Interactive Graphics for Monitoring Clinical Trial Safety

A framework for evaluation of clinical trial safety. Users can interactively explore their data using the included 'Shiny' application.

Maintained by Jeremy Wildfire. Last updated 2 years ago.

3.2 match 98 stars 8.18 score 111 scripts

adeverse

ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences

Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.

Maintained by Aurélie Siberchicot. Last updated 12 days ago.

openblas cpp

1.8 match 39 stars 14.96 score 2.2k scripts 256 dependents

epiverse-trace

cleanepi:Clean and Standardize Epidemiological Data

A package for cleaning and standardizing tabular data, tailored specifically for curating epidemiological data. It streamlines various data cleaning tasks that are typically expected when working with datasets in epidemiology. It returns the processed data in the same format, and generates a comprehensive report detailing the outcomes of each cleaning task.

Maintained by Karim Mané. Last updated 2 days ago.

data-cleaning epidemiology epiverse

3.5 match 9 stars 7.44 score 19 scripts

harrison4192

autostats:Auto Stats

Automatically do statistical exploration. Create formulas using 'tidyselect' syntax, and then determine cross-validated model accuracy and variable contributions using 'glm' and 'xgboost'. Contains additional helper functions to create and modify formulas. Has a flagship function to quickly determine relationships between categorical and continuous variables in the data set.

Maintained by Harrison Tietze. Last updated 11 days ago.

3.8 match 6 stars 6.76 score 5 scripts 2 dependents

svmiller

stevemisc:Steve's Miscellaneous Functions

These are miscellaneous functions that I find useful for my research and teaching. The contents include themes for plots, functions for simulating quantities of interest from regression models, functions for simulating various forms of fake data for instructional/research purposes, and many more. All told, the functions provided here are broadly useful for data organization, data presentation, data recoding, and data simulation.

Maintained by Steve Miller. Last updated 5 days ago.

dplyr mixed-effects-models multivariate-normal-distribution tidyverse

3.8 match 10 stars 6.85 score 392 scripts 2 dependents

kassambara

factoextra:Extract and Visualize the Results of Multivariate Data Analyses

Provides some easy-to-use functions to extract and visualize the output of multivariate data analyses, including 'PCA' (Principal Component Analysis), 'CA' (Correspondence Analysis), 'MCA' (Multiple Correspondence Analysis), 'FAMD' (Factor Analysis of Mixed Data), 'MFA' (Multiple Factor Analysis) and 'HMFA' (Hierarchical Multiple Factor Analysis) functions from different R packages. It also contains functions for simplifying some clustering analysis steps and provides 'ggplot2'-based elegant data visualization.

Maintained by Alboukadel Kassambara. Last updated 5 years ago.

1.8 match 363 stars 14.13 score 15k scripts 52 dependents

mahshaaban

pcr:Analyzing Real-Time Quantitative PCR Data

Calculates the amplification efficiency and curves from real-time quantitative PCR (Polymerase Chain Reaction) data. Estimates the relative expression from PCR data using the double delta CT and the standard curve methods (Livak & Schmittgen (2001) <doi:10.1006/meth.2001.1262>). Tests for statistical significance using two-group tests and linear regression (Yuan et al. (2006) <doi:10.1186/1471-2105-7-85>).
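For readers unfamiliar with the double delta CT method mentioned above, here is a minimal Python sketch of the underlying arithmetic. It is not the 'pcr' package interface: the function name and CT values are hypothetical, and it assumes roughly 100% amplification efficiency, so that the product doubles with each cycle.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the double delta CT method (hypothetical helper)."""
    # Normalize the target gene to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated condition against the control (calibrator).
    delta_delta_ct = delta_treated - delta_control
    # Assumes perfect doubling per cycle, hence base 2.
    return 2 ** (-delta_delta_ct)

# Made-up CT values: the target crosses the threshold two cycles earlier,
# relative to the reference gene, in the treated sample than in the control.
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)
```

A lower normalized CT in the treated sample means more starting template, so the fold change comes out above 1; identical normalized CT values give a fold change of exactly 1.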

Maintained by Mahmoud Ahmed. Last updated 8 months ago.

data-analyses molecular-biology qpcr

3.5 match 28 stars 7.25 score 63 scripts

data-cleaning

validate:Data Validation Infrastructure

Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.

Maintained by Mark van der Loo. Last updated 11 days ago.

data-cleaning validation

2.0 match 418 stars 12.50 score 448 scripts 9 dependents

pavel-fibich

gawdis:Multi-Trait Dissimilarity with more Uniform Contributions

R function gawdis() produces multi-trait dissimilarity with more uniform contributions of different traits. de Bello et al. (2021) <doi:10.1111/2041-210X.13537> presented the approach based on minimizing the differences in the correlation between the dissimilarity of each trait, or groups of traits, and the multi-trait dissimilarity. This is done using either an analytic or a numerical solution, both available in the function.

Maintained by Pavel Fibich. Last updated 2 years ago.

4.8 match 5 stars 5.20 score 21 scripts 1 dependent

rishvish

DImodelsVis:Visualising and Interpreting Statistical Models Fit to Compositional Data

Statistical models fit to compositional data are often difficult to interpret due to the sum-to-1 constraint on data variables. 'DImodelsVis' provides novel visualisation tools to aid with the interpretation of models fit to compositional data. All visualisations in the package are created using the 'ggplot2' plotting framework and can be extended like every other 'ggplot' object.

Maintained by Rishabh Vishwakarma. Last updated 6 months ago.

6.7 match 3.70 score 7 scripts

bioc

maftools:Summarize, Analyze and Visualize MAF Files

Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich customizable visualizations with minimal effort.

Maintained by Anand Mayakonda. Last updated 5 months ago.

datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib

1.7 match 459 stars 14.63 score 948 scripts 18 dependents

ropensci

stplanr:Sustainable Transport Planning

Tools for transport planning with an emphasis on spatial transport data and non-motorized modes. The package was originally developed to support the 'Propensity to Cycle Tool', a publicly available strategic cycle network planning tool (Lovelace et al. 2017) <doi:10.5198/jtlu.2016.862>, but has since been extended to support public transport routing and accessibility analysis (Moreno-Monroy et al. 2017) <doi:10.1016/j.jtrangeo.2017.08.012> and routing with locally hosted routing engines such as 'OSRM' (Lowans et al. 2023) <doi:10.1016/j.enconman.2023.117337>. The main functions are for creating and manipulating geographic "desire lines" from origin-destination (OD) data (building on the 'od' package); calculating routes on the transport network locally and via interfaces to routing services such as <https://cyclestreets.net/> (Desjardins et al. 2021) <doi:10.1007/s11116-021-10197-1>; and calculating route segment attributes such as bearing. The package implements the 'travel flow aggregation' method described in Morgan and Lovelace (2020) <doi:10.1177/2399808320942779> and the 'OD jittering' method described in Lovelace et al. (2022) <doi:10.32866/001c.33873>. Further information on the package's aim and scope can be found in the vignettes and in a paper in the R Journal (Lovelace and Ellison 2018) <doi:10.32614/RJ-2018-053>, and in a paper outlining the landscape of open source software for geographic methods in transport planning (Lovelace, 2021) <doi:10.1007/s10109-020-00342-2>.

Maintained by Robin Lovelace. Last updated 7 months ago.

cycle cycling desire-lines origin-destination peer-reviewed pubic-transport route-network routes routing spatial transport transport-planning transportation walking

2.0 match 427 stars 12.31 score 684 scripts 3 dependents

zabore

riskclustr:Functions to Study Etiologic Heterogeneity

A collection of functions related to the study of etiologic heterogeneity both across disease subtypes and across individual disease markers. The included functions allow one to quantify the extent of etiologic heterogeneity in the context of a case-control study, and provide p-values to test for etiologic heterogeneity across individual risk factors. Begg CB, Zabor EC, Bernstein JL, Bernstein L, Press MF, Seshan VE (2013) <doi:10.1002/sim.5902>.

Maintained by Emily C. Zabor. Last updated 1 year ago.

5.1 match 1 star 4.81 score 26 scripts

davidfirth

relimp:Relative Contribution of Effects in a Regression Model

Functions to facilitate inference on the relative importance of predictors in a linear or generalized linear model, and a couple of useful Tcl/Tk widgets.

Maintained by David Firth. Last updated 9 years ago.

4.7 match 1 star 5.11 score 42 scripts 61 dependents

causalinference

gfoRmula:Parametric G-Formula

Implements the non-iterative conditional expectation (NICE) algorithm of the g-formula algorithm (Robins (1986) <doi:10.1016/0270-0255(86)90088-6>, Hernán and Robins (2024, ISBN:9781420076165)). The g-formula can estimate an outcome's counterfactual mean or risk under hypothetical treatment strategies (interventions) when there is sufficient information on time-varying treatments and confounders. This package can be used for discrete or continuous time-varying treatments and for failure time outcomes or continuous/binary end of follow-up outcomes. The package can handle a random measurement/visit process and a priori knowledge of the data structure, as well as censoring (e.g., by loss to follow-up) and two options for handling competing events for failure time outcomes. Interventions can be flexibly specified, both as interventions on a single treatment or as joint interventions on multiple treatments. See McGrath et al. (2020) <doi:10.1016/j.patter.2020.100008> for a guide on how to use the package.

Maintained by Sean McGrath. Last updated 26 days ago.

2.9 match 165 stars 8.18 score 132 scripts

statisticsnorway

GaussSuppression:Tabular Data Suppression using Gaussian Elimination

A statistical disclosure control tool to protect tables by suppression using the Gaussian elimination secondary suppression algorithm (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6>. A suggestion is to start by working with functions SuppressSmallCounts() and SuppressDominantCells(). These functions use primary suppression functions for the minimum frequency rule and the dominance rule, respectively. Novel functionality for suppression of disclosive cells is also included. General primary suppression functions can be supplied as input to the general workhorse function, GaussSuppressionFromData(). Suppressed frequencies can be replaced by synthetic decimal numbers as described in Langsrud (2019) <doi:10.1007/s11222-018-9848-9>.

Maintained by Øyvind Langsrud. Last updated 2 days ago.

3.6 match 2 stars 6.61 score 50 scripts

sbgraves237

sos:Search Contributed R Packages, Sort by Package

Search contributed R packages, sort by package.

Maintained by Spencer Graves. Last updated 9 months ago.

3.5 match 2 stars 6.82 score 241 scripts 3 dependents

hughparsonage

grattan:Australian Tax Policy Analysis

Utilities to cost and evaluate Australian tax policy, including fast projections of personal income tax collections, high-performance tax and transfer calculators, and an interface to common indices from the Australian Bureau of Statistics. Written to support Grattan Institute's Australian Perspectives program, and related projects. Access to the Australian Taxation Office's sample files of personal income tax returns is assumed.

Maintained by Hugh Parsonage. Last updated 12 months ago.

australian-analysts tax openmp

3.8 match 25 stars 6.34 score 124 scripts

kjakobse

EpiForsk:Code Sharing at the Department of Epidemiological Research at Statens Serum Institut

This is a collection of assorted functions and examples collected from various projects. Currently we have functionalities for simplifying overlapping time intervals, Charlson comorbidity score constructors for Danish data, getting frequency for multiple variables, getting standardized output from logistic and log-linear regressions, sibling design linear regression functionalities a method for calculating the confidence intervals for functions of parameters from a GLM, Bayes equivalent for hypothesis testing with asymptotic Bayes factor, and several help functions for generalized random forest analysis using 'grf'.

Maintained by Kim Daniel Jakobsen. Last updated 1 year ago.

5.3 match 4.48 score 8 scripts

datalorax

equatiomatic:Transform Models into 'LaTeX' Equations

The goal of 'equatiomatic' is to reduce the pain associated with writing 'LaTeX' formulas from fitted models. The primary function of the package, extract_eq(), takes a fitted model object as its input and returns the corresponding 'LaTeX' code for the model.

Maintained by Philippe Grosjean. Last updated 6 days ago.

2.0 match 619 stars 11.75 score 424 scripts 5 dependents

biorgeo

bioregion:Comparison of Bioregionalisation Methods

The main purpose of this package is to propose a transparent methodological framework to compare bioregionalisation methods based on hierarchical and non-hierarchical clustering algorithms (Kreft & Jetz (2010) <doi:10.1111/j.1365-2699.2010.02375.x>) and network algorithms (Lenormand et al. (2019) <doi:10.1002/ece3.4718> and Leroy et al. (2019) <doi:10.1111/jbi.13674>).

Maintained by Maxime Lenormand. Last updated 10 days ago.

biogeography bioregion bioregionalization cpp

3.7 match 7 stars 6.27 score 11 scripts

hugaped

MBNMAdose:Dose-Response MBNMA Models

Fits Bayesian dose-response model-based network meta-analysis (MBNMA) that incorporates multiple doses within an agent by modelling different dose-response functions, as described by Mawdsley et al. (2016) <doi:10.1002/psp4.12091>. By modelling dose-response relationships this can connect networks of evidence that might otherwise be disconnected, and can improve precision on treatment estimates. Several common dose-response functions are provided; others may be added by the user. Various characteristics and assumptions can be flexibly added to the models, such as shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting at the treatment level.

Maintained by Hugo Pedder. Last updated 1 month ago.

jags cpp

3.5 match 10 stars 6.60 score

loukiaspin

rnmamod:Bayesian Network Meta-Analysis with Missing Participants

A comprehensive suite of functions to perform and visualise pairwise and network meta-analysis with aggregate binary or continuous missing participant outcome data. The package covers core Bayesian one-stage models implemented in a systematic review with multiple interventions, including fixed-effect and random-effects network meta-analysis, meta-regression, evaluation of the consistency assumption via the node-splitting approach and the unrelated mean effects model (original and revised model proposed by Spineli, (2022) <doi:10.1177/0272989X211068005>), and sensitivity analysis (see Spineli et al., (2021) <doi:10.1186/s12916-021-02195-y>). Missing participant outcome data are addressed in all models of the package (see Spineli, (2019) <doi:10.1186/s12874-019-0731-y>, Spineli et al., (2019) <doi:10.1002/sim.8207>, Spineli, (2019) <doi:10.1016/j.jclinepi.2018.09.002>, and Spineli et al., (2021) <doi:10.1002/jrsm.1478>). The robustness to primary analysis results can also be investigated using a novel intuitive index (see Spineli et al., (2021) <doi:10.1177/0962280220983544>). Methods to evaluate the transitivity assumption quantitatively are provided (see Spineli, (2024) <doi:10.1186/s12874-024-02436-7>). A novel index to facilitate interpretation of local inconsistency is also available (see Spineli, (2024) <doi:10.1186/s13643-024-02680-4>). The package also offers a rich, user-friendly visualisation toolkit that aids in appraising and interpreting the results thoroughly and preparing the manuscript for journal submission. The visualisation tools comprise the network plot, forest plots, panel of diagnostic plots, heatmaps on the extent of missing participant outcome data in the network, league heatmaps on estimation and prediction, rankograms, Bland-Altman plot, leverage plot, deviance scatterplot, heatmap of robustness, barplot of Kullback-Leibler divergence, heatmap of comparison dissimilarities and dendrogram of comparison clustering. The package also allows the user to export the results to an Excel file at the working directory.

Maintained by Loukia Spineli. Last updated 9 days ago.

jags cpp

3.5 match 5 stars 6.64 score 12 scripts

chjackson

flexsurv:Flexible Parametric Survival and Multi-State Models

Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.

Maintained by Christopher Jackson. Last updated 2 months ago.

cpp

1.7 match 57 stars 13.31 score 632 scripts 43 dependents

ropensci

weatherOz:An API Client for Australian Weather and Climate Data Resources

Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). Also retrieves precis and coastal forecasts from the Australian Government Bureau of Meteorology ('BOM'), and downloads and imports radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology ('BOM') and accessed through 'SILO' (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.

Maintained by Rodrigo Pires. Last updated 19 days ago.

dpird bom meteorological-data weather-forecast australia weather weather-data meteorology western-australia australia-bureau-of-meteorology western-australia-agriculture australia-agriculture australia-climate australia-weather api-client climate data rainfall weather-api

2.7 match 32 stars 8.54 score 40 scripts

ropensci

nasapower:NASA POWER API Client

An API client for the NASA POWER global meteorology, surface solar energy and climatology data API. POWER (Prediction Of Worldwide Energy Resources) data are freely available for download with varying spatial resolutions dependent on the original data and with several temporal resolutions depending on the POWER parameter and community. This work is funded through the NASA Earth Science Directorate Applied Science Program. For more on the data themselves, the methodologies used in creating them, a web-based data viewer and web access, please see <https://power.larc.nasa.gov/>.

Maintained by Adam H. Sparks. Last updated 10 days ago.

nasa meteorological-data weather global weather-data meteorology nasa-power agroclimatology earth-science data-access climate-data agroclimatology-data weather-variables

2.3 match 101 stars 9.98 score 137 scripts 3 dependents

tidyverse

duckplyr:A 'DuckDB'-Backed Version of 'dplyr'

A drop-in replacement for 'dplyr', powered by 'DuckDB' for performance. Offers convenient utilities for working with in-memory and larger-than-memory data while retaining full 'dplyr' compatibility.

Maintained by Kirill Müller. Last updated 4 days ago.

analytics dataframe dplyr duckdb performance

2.0 match 309 stars 11.33 score 220 scripts

bioc

ensembldb:Utilities to create and use Ensembl-based annotation databases

The package provides functions to create and use transcript-centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries, such as genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.

Maintained by Johannes Rainer. Last updated 5 months ago.

genetics annotationdata sequencing coverage annotation bioconductor bioconductor-packages ensembl

1.6 match 35 stars 14.08 score 892 scripts 108 dependents

jonathan-g

kayadata:Kaya Identity Data for Nations and Regions

Provides data for Kaya identity variables (population, gross domestic product, primary energy consumption, and energy-related CO2 emissions) for the world and for individual nations, and utility functions for looking up data, plotting trends of Kaya variables, and plotting the fuel mix for a given country or region. The Kaya identity (Yoichi Kaya and Keiichi Yokobori, "Environment, Energy, and Economy: Strategies for Sustainability" (United Nations University Press, 1998) and <https://en.wikipedia.org/wiki/Kaya_identity>) expresses a nation's or region's greenhouse gas emissions in terms of its population, per-capita Gross Domestic Product, the energy intensity of its economy, and the carbon-intensity of its energy supply.

Maintained by Jonathan Gilligan. Last updated 8 months ago.

4.5 match 4.98 score 32 scripts

haghish

shapley:Weighted Mean SHAP and CI for Robust Feature Selection in ML Grid

This R package introduces Weighted Mean SHapley Additive exPlanations (WMSHAP), an innovative method for calculating SHAP values for a grid of fine-tuned base-learner machine learning models as well as stacked ensembles, a method not previously available due to the common reliance on single best-performing models. By integrating the weighted mean SHAP values from individual base-learners comprising the ensemble or individual base-learners in a tuning grid search, the package weights SHAP contributions according to each model's performance, assessed by R squared (for both regression and classification models). Alternatively, this software also offers weighting SHAP values based on the area under the precision-recall curve (AUCPR), the area under the curve (AUC), and F2 measures for binary classifiers. It further extends this framework to implement weighted confidence intervals for weighted mean SHAP values, offering a more comprehensive and robust feature importance evaluation over a grid of machine learning models, instead of solely computing SHAP values for the best model. This methodology is particularly beneficial for addressing the severe class imbalance (class rarity) problem by providing a transparent, generalized measure of feature importance that mitigates the risk of reporting SHAP values for an overfitted or biased model and maintains robustness under severe class imbalance, where there is no universal criterion for identifying the absolute best model. Furthermore, the package implements hypothesis testing to ascertain the statistical significance of SHAP values for individual features, as well as comparative significance testing of SHAP contributions between features. Additionally, it tackles a critical gap in feature selection literature by presenting criteria for the automatic feature selection of the most important features across a grid of models or stacked ensembles, eliminating the need for arbitrary determination of the number of top features to be extracted. This utility is invaluable for researchers analyzing feature significance, particularly within severely imbalanced outcomes where conventional methods fall short. Moreover, it is also expected to report democratic feature importance across a grid of models, resulting in a more comprehensive and generalizable feature selection. The package further implements a novel method for visualizing SHAP values both at subject level and feature level as well as a plot for feature selection based on the weighted mean SHAP ratios.

Maintained by E. F. Haghish. Last updated 2 days ago.

class-imbalance class-imbalance-problem feature-extraction feature-importance feature-selection machine-learning machine-learning-algorithms shap shap-analysis shap-values shapely shapley-additive-explanations shapley-decomposition shapley-value shapley-values shapleyvalue weighted-shap weighted-shap-confidence-interval weighted-shapley weighted-shapley-ci

4.3 match 14 stars 5.19 score 17 scripts

flyaflya

causact:Fast, Easy, and Visual Bayesian Inference

Accelerate Bayesian analytics workflows in 'R' through interactive modelling, visualization, and inference. Define probabilistic graphical models using directed acyclic graphs (DAGs) as a unifying language for business stakeholders, statisticians, and programmers. This package relies on interfacing with the 'numpyro' Python package.

Maintained by Adam Fleischhacker. Last updated 2 months ago.

bayesian-inference dags posterior-probability probabilistic-graphical-models probabilistic-programming

3.1 match 45 stars 7.15 score 52 scripts

bioc

phyloseq:Handling and analysis of high-throughput microbiome census data

phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.

Maintained by Paul J. McMurdie. Last updated 5 months ago.

immunooncology sequencing microbiome metagenomics clustering classification multiplecomparison geneticvariability

1.6 match 597 stars 13.90 score 8.4k scripts 37 dependents

sachsmc

eventglm:Regression Models for Event History Outcomes

A user friendly, easy to understand way of doing event history regression for marginal estimands of interest, including the cumulative incidence and the restricted mean survival, using the pseudo observation framework for estimation. For a review of the methodology, see Andersen and Pohar Perme (2010) <doi:10.1177/0962280209105020> or Sachs and Gabriel (2022) <doi:10.18637/jss.v102.i09>. The interface uses the well known formulation of a generalized linear model and allows for features including plotting of residuals, the use of sampling weights, and corrected variance estimation.

Maintained by Michael C Sachs. Last updated 15 days ago.

3.4 match 5 stars 6.33 score 24 scripts 1 dependent

neuropsychology

psycho:Efficient and Publishing-Oriented Workflow for Psychological Science

The main goal of the psycho package is to provide tools for psychologists, neuropsychologists and neuroscientists that facilitate data analysis and reduce the time spent on it. It aims at supporting best practices and provides tools to format the output of statistical methods so that it can be pasted directly into a manuscript, ensuring statistical reporting standardization and conformity.

Maintained by Dominique Makowski. Last updated 4 years ago.

apa apa6 bayesian correlation format interpretation mixed-models neuroscience psycho psychology rstanarm statistics

2.0 match 149 stars 10.86 score 628 scripts 5 dependents

larmarange

broom.helpers:Helpers for Model Coefficients Tibbles

Provides a suite of functions to work with regression model 'broom::tidy()' tibbles. The suite includes functions to group regression model terms by variable, insert reference and header rows for categorical variables, add variable labels, and more.

Maintained by Joseph Larmarange. Last updated 9 days ago.

1.9 match 22 stars 11.45 score 165 scripts 2 dependents

hugaped

MBNMAtime:Run Time-Course Model-Based Network Meta-Analysis (MBNMA) Models

Fits Bayesian time-course models for model-based network meta-analysis (MBNMA) that allow inclusion of multiple time-points from studies. Repeated measures over time are accounted for within studies by applying different time-course functions, following the method of Pedder et al. (2019) <doi:10.1002/jrsm.1351>. The method allows synthesis of studies with multiple follow-up measurements that can account for time-course for a single or multiple treatment comparisons. Several general time-course functions are provided; others may be added by the user. Various characteristics can be flexibly added to the models, such as correlation between time points and shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting.

Maintained by Hugo Pedder. Last updated 1 month ago.

jags cpp

3.5 match 7 stars 6.10 score

ndphillips

yarrr:A Companion to the e-Book "YaRrr!: The Pirate's Guide to R"

Contains a mixture of functions and data sets referred to in the introductory e-book "YaRrr!: The Pirate's Guide to R". The latest version of the e-book is available for free at <https://www.thepiratesguidetor.com>.

Maintained by Nathaniel Phillips. Last updated 11 months ago.

2.0 match 78 stars 10.67 score 1.2k scripts 2 dependents

asl

Rssa:A Collection of Methods for Singular Spectrum Analysis

Methods and tools for Singular Spectrum Analysis including decomposition, forecasting and gap-filling for univariate and multivariate time series. A general description of the methods with many examples can be found in the book Golyandina (2018, <doi:10.1007/978-3-662-57380-8>). See 'citation("Rssa")' for details.

Maintained by Anton Korobeynikov. Last updated 6 months ago.

fftw3

3.0 match 58 stars 7.10 score 182 scripts 4 dependents

qsbase

qs:Quick Serialization of R Objects

Provides functions for quickly writing and reading any R object to and from disk.

Maintained by Travers Ching. Last updated 9 days ago.

compression data-storage encoding serialization libzstd lz4 cpp

1.5 match 414 stars 13.91 score 2.5k scripts 51 dependents

bioc

MsCoreUtils:Core Utils for Mass Spectrometry Data

MsCoreUtils defines low-level functions for mass spectrometry data and is independent of any high-level data structures. These functions include mass spectra processing functions (noise estimation, smoothing, binning, baseline estimation), quantitative aggregation functions (median polish, robust summarisation, ...), missing data imputation, data normalisation (quantiles, vsn, ...), and miscellaneous helper functions that are used across high-level data structures within the R for Mass Spectrometry packages.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 4 days ago.

infrastructure proteomics massspectrometry metabolomics bioconductor mass-spectrometry utils

2.0 match 16 stars 10.52 score 41 scripts 71 dependents

r-forge

distr:Object Oriented Implementation of Distributions

S4-classes and methods for distributions.

Maintained by Peter Ruckdeschel. Last updated 2 months ago.

2.4 match 8.84 score 327 scripts 32 dependents

uclouvain-cbio

scpdata:Single-Cell Proteomics Data Package

The package disseminates mass spectrometry (MS)-based single-cell proteomics (SCP) datasets. The data were collected from published work and formatted using the `scp` data structure. The data sets contain quantitative information at spectrum, peptide and/or protein level for single cells or minute sample amounts.

Maintained by Christophe Vanderaa. Last updated 9 days ago.

experimentdata expressiondata experimenthub reproducibleresearch massspectrometrydata proteome singlecelldata packagetypedata

3.8 match 6 stars 5.58 score 16 scripts

eurostat

hicp:Harmonised Index of Consumer Prices

The Harmonised Index of Consumer Prices (HICP) is the key economic figure to measure inflation in the euro area. The methodology underlying the HICP is documented in the HICP Methodological Manual (<https://ec.europa.eu/eurostat/web/products-manuals-and-guidelines/w/ks-gq-24-003>). Based on the manual, this package provides functions to access and work with HICP data from Eurostat's public database (<https://ec.europa.eu/eurostat/data/database>).

Maintained by Sebastian Weinand. Last updated 8 months ago.

consumer-price-index inflation prices statistics

4.5 match 2 stars 4.60 score 6 scripts

slwu89

MicroMoB:Discrete Time Simulation of Mosquito-Borne Pathogen Transmission

Provides a framework based on S3 dispatch for constructing models of mosquito-borne pathogen transmission which are constructed from submodels of various components (i.e. immature and adult mosquitoes, human populations). A consistent mathematical expression for the distribution of bites on hosts means that different models (stochastic, deterministic, etc.) can be coherently incorporated and updated over a discrete time step.

Maintained by Sean L. Wu. Last updated 2 years ago.

5.0 match 4.16 score 32 scripts

braverock

PortfolioAnalytics:Portfolio Analysis, Including Numerical Methods for Optimization of Portfolios

Portfolio optimization and analysis routines and graphics.

Maintained by Brian G. Peterson. Last updated 3 months ago.

1.8 match 81 stars 11.49 score 626 scripts 2 dependents

bioc

EBImage:Image processing and analysis toolbox for R

EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.

Maintained by Andrzej Oleś. Last updated 5 months ago.

visualization bioinformatics image-analysis image-processing cpp

1.6 match 71 stars 12.89 score 1.5k scripts 33 dependents

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 28 days ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

1.8 match 55 stars 11.77 score 1.2k scripts 2 dependents

dots26

MaOEA:Many Objective Evolutionary Algorithm

A set of evolutionary algorithms to solve many-objective optimization. Hybridization between the algorithms is also facilitated. Available algorithms are: 'SMS-EMOA' <doi:10.1016/j.ejor.2006.08.008>, 'NSGA-III' <doi:10.1109/TEVC.2013.2281535>, and 'MO-CMA-ES' <doi:10.1145/1830483.1830573>. The following many-objective benchmark problems are also provided: 'DTLZ1'-'DTLZ4' from Deb, et al. (2001) <doi:10.1007/1-84628-137-7_6> and 'WFG4'-'WFG9' from Huband, et al. (2005) <doi:10.1109/TEVC.2005.861417>.

Maintained by Dani Irawan. Last updated 2 years ago.

5.4 match 6 stars 3.78 score 2 scripts

hofnerb

papeR:A Toolbox for Writing Pretty Papers and Reports

A toolbox for writing 'knitr', 'Sweave' or other 'LaTeX'- or 'markdown'-based reports and to prettify the output of various estimated models.

Maintained by Benjamin Hofner. Last updated 4 years ago.

knitr latex r-language reporting reproducible reproducible-research sweave

2.8 match 30 stars 7.30 score 223 scripts 1 dependent

kviswana

ezEDA:Task Oriented Interface for Exploratory Data Analysis

Enables users to create visualizations using functions based on the data analysis task rather than on plotting mechanics. It hides the details of the individual 'ggplot2' function calls and allows the user to focus on the end goal. Useful for quick preliminary explorations. Provides functions for common exploration patterns. Some of the ideas in this package are motivated by Fox (2015, ISBN:1938377052).

Maintained by Viswa Viswanathan. Last updated 4 years ago.

5.5 match 3.70 score 4 scripts

janmarvin

openxlsx2:Read, Write and Edit 'xlsx' Files

Simplifies the creation of 'xlsx' files by providing a high level interface to writing, styling and editing worksheets.

Maintained by Jan Marvin Garbuszus. Last updated 16 hours ago.

xlsx cpp

1.5 match 138 stars 13.67 score 194 scripts 11 dependents

r4ss

r4ss:R Code for Stock Synthesis

A collection of R functions for use with Stock Synthesis, a fisheries stock assessment modeling platform written in ADMB by Dr. Richard D. Methot at the NOAA Northwest Fisheries Science Center. The functions include tools for summarizing and plotting results, manipulating files, visualizing model parameterizations, and various other common stock assessment tasks. This version of '{r4ss}' is compatible with Stock Synthesis versions 3.24 through 3.30 (specifically version 3.30.23.1, from December 2024). Support for 3.24 models is only through the core functions for reading output and plotting.

Maintained by Ian G. Taylor. Last updated 4 days ago.

fisheries fisheries-stock-assessment stock-synthesis

1.8 match 43 stars 11.38 score 1.0k scripts 2 dependents

epiforecasts

scoringutils:Utilities for Scoring and Assessing Predictions

Facilitates the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.

Maintained by Nikos Bosse. Last updated 12 days ago.

forecast-evaluation forecasting

1.8 match 52 stars 11.37 score 326 scripts 7 dependents

bioc

COCOA:Coordinate Covariation Analysis

COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.

Maintained by John Lawson. Last updated 5 months ago.

epigenetics dnamethylation atacseq dnaseseq methylseq methylationarray principalcomponent genomicvariation generegulation genomeannotation systemsbiology functionalgenomics chipseq sequencing immunooncology dna-methylation pca

2.9 match 10 stars 7.02 score 21 scripts

stemangiola

tidyHeatmap:A Tidy Implementation of Heatmap

This is a tidy implementation for heatmap. At the moment it is based on the (great) package 'ComplexHeatmap'. The goal of this package is to interface a tidy data frame with this powerful tool. Some of the advantages are: Row and/or column colour annotations are easy to integrate by specifying just one parameter (column names). Custom grouping of rows is easy to specify by providing a grouped tbl. For example: df %>% group_by(...). Label sizes are adjusted by the total number of rows and columns. Default use of Brewer and Viridis palettes.

Maintained by Stefano Mangiola. Last updated 1 month ago.

assaydomain infrastructure brewer complexheatmap custom-palette dplyr graphviz heatmap mtcars plotting rstudio scale tibble tidy tidy-data-frame tidybulk tidyverse viridis

2.0 match 335 stars 10.23 score 197 scripts 1 dependent

gianmarcoalberti

CAinterprTools:Graphical Aid in Correspondence Analysis Interpretation and Significance Testings

Plots a range of information related to the interpretation of Correspondence Analysis results. It provides the facility to plot the contribution of row and column categories to the principal dimensions, the quality of the display of points on selected dimensions, the correlation of row and column categories to selected dimensions, etc. It also allows assessing which dimension(s) are important for interpreting the data structure by means of different statistics and tests. The package also offers the facility to plot the permuted distribution of the table total inertia as well as of the inertia accounted for by pairs of selected dimensions. Different facilities are also provided that aim to produce interpretation-oriented scatterplots. Reference: Alberti 2015 <doi:10.1016/j.softx.2015.07.001>.

Maintained by Gianmarco Alberti. Last updated 5 years ago.

8.1 match 2.52 score 33 scripts

clbustos

dominanceanalysis:Dominance Analysis

Dominance analysis is a method for comparing the relative importance of predictors in multiple regression models: ordinary least squares, generalized linear models, hierarchical linear models, beta regression and dynamic linear models. The main principles and methods of dominance analysis are described in Budescu, D. V. (1993) <doi:10.1037/0033-2909.114.3.542> and Azen, R., & Budescu, D. V. (2003) <doi:10.1037/1082-989X.8.2.129> for ordinary least squares regression. Subsequently, the extensions for multivariate regression, logistic regression and hierarchical linear models were described in Azen, R., & Budescu, D. V. (2006) <doi:10.3102/10769986031002157>, Azen, R., & Traxel, N. (2009) <doi:10.3102/1076998609332754> and Luo, W., & Azen, R. (2013) <doi:10.3102/1076998612458319>, respectively.

Maintained by Claudio Bustos Navarrete. Last updated 1 year ago.

3.5 match 25 stars 5.75 score 45 scripts

welch-lab

rliger:Linked Inference of Genomic Experimental Relationships

Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.

Maintained by Yichen Wang. Last updated 2 months ago.

nonnegative-matrix-factorization single-cell openblas cpp

1.9 match 402 stars 10.80 score 334 scripts 1 dependent

bioc

mistyR:Multiview Intercellular SpaTial modeling framework

mistyR is an implementation of the Multiview Intercellular SpaTial modeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation. Views can also capture cell-type specific relationships, capture relations between functional footprints, or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.

Maintained by Jovan Tanevski. Last updated 5 months ago.

software biomedicalinformatics cellbiology systemsbiology regression decisiontree singlecell spatial bioconductor biology intercellular machine-learning modular molecular-biology multiview spatial-transcriptomics

2.6 match 51 stars 7.87 score 160 scripts

dipterix

dipsaus:A Dipping Sauce for Data Analysis and Visualizations

Works as an "add-on" to packages like 'shiny', 'future', as well as 'rlang', and provides utility functions. Just like dipping sauce adding flavors to potato chips or pita bread, 'dipsaus' for data analysis and visualizations adds handy functions and enhancements to popular packages. The goal is to provide simple solutions that are frequently asked for online, such as how to synchronize 'shiny' inputs without freezing the app, or how to get memory size on 'Linux' or 'MacOS' system. The enhancements roughly fall into these four categories: 1. 'shiny' input widgets; 2. high-performance computing using the 'future' package; 3. modify R calls and convert among numbers, strings, and other objects; 4. utility functions to get system information such as CPU chip-set, memory limit, etc.

Maintained by Zhengjia Wang. Last updated 4 days ago.

cpp

2.5 match 13 stars 7.90 score 85 scripts 3 dependents

bioc

MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics

MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.

Maintained by Laurent Gatto. Last updated 1 day ago.

immunooncology infrastructure proteomics massspectrometry qualitycontrol dataimport bioconductor bioinformatics mass-spectrometry proteomics-data visualisation cpp

1.5 match 130 stars 12.81 score 772 scripts 36 dependents

boost-r

FDboost:Boosting Functional Regression Models

Regression models for functional data, i.e., scalar-on-function, function-on-scalar and function-on-function regression models, are fitted by a component-wise gradient boosting algorithm. For a manual on how to use 'FDboost', see Brockhaus, Ruegamer, Greven (2017) <doi:10.18637/jss.v094.i10>.

Maintained by David Ruegamer. Last updated 3 months ago.

boosting boosting-algorithms function-on-function-regression function-on-scalar-regression machine-learning scalar-on-function-regression variable-selection

2.5 match 17 stars 8.00 score 98 scripts

hturner

gnm:Generalized Nonlinear Models

Functions to specify and fit generalized nonlinear models, including models with multiplicative interaction terms such as the UNIDIFF model from sociology and the AMMI model from crop science, and many others. Over-parameterized representations of models are used throughout; functions are provided for inference on estimable parameter combinations, as well as standard methods for diagnostics etc.

Maintained by Heather Turner. Last updated 1 year ago.

generalized-linear-models generalized-nonlinear-models statistical-models openblas

1.9 match 16 stars 10.51 score 290 scripts 21 dependents

nimble-dev

nimble:MCMC, Particle Filtering, and Programmable Hierarchical Modeling

A system for writing hierarchical statistical models largely compatible with 'BUGS' and 'JAGS', writing nimbleFunctions to operate models and do basic R-style math, and compiling both models and nimbleFunctions via custom-generated C++. 'NIMBLE' includes default methods for MCMC, Laplace Approximation, Monte Carlo Expectation Maximization, and some other tools. The nimbleFunction system makes it easy to do things like implement new MCMC samplers from R, customize the assignment of samplers to different parts of a model from R, and compile the new samplers automatically via C++ alongside the samplers 'NIMBLE' provides. 'NIMBLE' extends the 'BUGS'/'JAGS' language by making it extensible: New distributions and functions can be added, including as calls to external compiled code. Although most people think of MCMC as the main goal of the 'BUGS'/'JAGS' language for writing models, one can use 'NIMBLE' for writing arbitrary other kinds of model-generic algorithms as well. A full User Manual is available at <https://r-nimble.org>.

Maintained by Christopher Paciorek. Last updated 4 days ago.

bayesian-inference bayesian-methods hierarchical-models mcmc probabilistic-programming openblas cpp

1.5 match 169 stars 12.97 score 2.6k scripts 19 dependents

bayesiandemography

bage:Bayesian Estimation and Forecasting of Age-Specific Rates

Fast Bayesian estimation and forecasting of age-specific rates, probabilities, and means, based on 'Template Model Builder'.

Maintained by John Bryant. Last updated 2 months ago.

cpp

2.7 match 3 stars 7.30 score 39 scripts

tvganesh

cricketr:Analyze Cricketers and Cricket Teams Based on ESPN Cricinfo Statsguru

Tools for analyzing performances of cricketers based on stats in ESPN Cricinfo Statsguru. The toolset can be used for analysis of Tests, ODIs and Twenty20 matches of both batsmen and bowlers. The package can also be used to analyze team performances.

Maintained by Tinniam V Ganesh. Last updated 4 years ago.

3.5 match 62 stars 5.55 score 115 scripts

bioc

MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework

MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).

Maintained by Shuangbin Xu. Last updated 5 months ago.

visualization microbiome software multiplecomparison featureextraction microbiome-analysis microbiome-data

2.0 match 183 stars 9.70 score 126 scripts 1 dependents

jranke

mkin:Kinetic Evaluation of Chemical Degradation Data

Calculation routines based on the FOCUS Kinetics Report (2006, 2014). Includes a function for conveniently defining differential equation models, model solution based on eigenvalues if possible or using numerical solvers. If a C compiler (on Windows: 'Rtools') is installed, differential equation models are solved using automatically generated C functions. Non-constant errors can be taken into account using variance by variable or two-component error models <doi:10.3390/environments6120124>. Hierarchical degradation models can be fitted using nonlinear mixed-effects model packages as a back end <doi:10.3390/environments8080071>. Please note that no warranty is implied for correctness of results or fitness for a particular purpose.

Maintained by Johannes Ranke. Last updated 30 days ago.

degradation focus-kinetics kinetic-models kinetics ode ode-model

2.4 match 11 stars 8.06 score 78 scripts 1 dependents

bioc

cytomapper:Visualization of highly multiplexed imaging data in R

Highly multiplexed imaging acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualised across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualised on segmented cell areas. This package contains functions for the visualisation of multiplexed read-outs and cell-level information obtained by multiplexed imaging technologies. The main functions of this package allow 1. the visualisation of pixel-level information across multiple channels, 2. the display of cell-level information (expression and/or metadata) on segmentation masks and 3. gating and visualisation of single cells.

Maintained by Lasse Meyer. Last updated 5 months ago.

immunooncology software singlecell onechannel twochannel multiplecomparison normalization dataimport bioimaging imaging-mass-cytometry single-cell spatial-analysis

2.0 match 32 stars 9.61 score 354 scripts 5 dependents

ironholds

WikipediR:A MediaWiki API Wrapper

A wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia. It can be used to retrieve page text, information about users or the history of pages, and elements of the category tree.

Maintained by Os Keyes. Last updated 11 months ago.

api-client api-wrapper mediawiki

2.0 match 70 stars 9.56 score 81 scripts 32 dependents

inventionate

TimeSpaceAnalysis:Statistical tools for time-space analysis

Use Geometric Data Analysis approaches (e.g. MCA or MFA), time pattern analysis (see "time sequence clustering") and places chronologies (see "time geography") analysis.

Maintained by Fabian Mundt. Last updated 6 days ago.

7.7 match 2.48 score 2 scripts

jangraffelman

HardyWeinberg:Statistical Tests and Graphics for Hardy-Weinberg Equilibrium

Contains tools for exploring Hardy-Weinberg equilibrium (Hardy, 1908; Weinberg, 1908) for bi- and multi-allelic genetic marker data. All classical tests (chi-square, exact, likelihood-ratio and permutation tests) with bi-allelic variants are included in the package, as well as functions for power computation and for the simulation of marker data under equilibrium and disequilibrium. Routines for dealing with markers on the X-chromosome are included (Graffelman & Weir, 2016) <doi:10.1038/hdy.2016.20>, including Bayesian procedures. Some exact and permutation procedures also work with multi-allelic variants. Special test procedures that jointly address Hardy-Weinberg equilibrium and equality of allele frequencies in both sexes are supplied, for the bi- and multi-allelic case. Functions for testing equilibrium in the presence of missing data by using multiple imputation are also provided. Implements several graphics for exploring the equilibrium status of a large set of bi-allelic markers: ternary plots with acceptance regions, log-ratio plots and Q-Q plots. The functionality of the package is explained in detail in a related JSS paper <doi:10.18637/jss.v064.i03>.

Maintained by Jan Graffelman. Last updated 11 months ago.

cpp

3.0 match 6.30 score 167 scripts 4 dependents
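
The classical chi-square test mentioned in the HardyWeinberg description can be illustrated with a short sketch. This is not the package's code or API: the function name `hwe_chisq` and its arguments are hypothetical, the example is written in Python rather than R, and it covers only the bi-allelic case without continuity correction.

```python
def hwe_chisq(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg equilibrium (bi-allelic marker).

    Hypothetical helper for illustration; no continuity correction is applied.
    """
    n = n_aa + n_ab + n_bb
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Expected genotype counts under equilibrium: n*p^2, 2*n*p*q, n*q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Counts that match the equilibrium proportions exactly give a statistic of 0.
    print(hwe_chisq(25, 50, 25))
```

The statistic is compared against a chi-square distribution with one degree of freedom; the exact, likelihood-ratio and permutation tests listed in the description are not sketched here.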

bioc

MetaboCoreUtils:Core Utils for Metabolomics Data

MetaboCoreUtils defines metabolomics-related core functionality provided as low-level functions to allow a data structure-independent usage across various R packages. This includes functions to convert between ion (adduct) and compound mass-to-charge ratios and masses or functions to work with chemical formulas. The package also provides a set of adduct definitions and information on some commercially available internal standard mixes commonly used in MS experiments.

Maintained by Johannes Rainer. Last updated 5 months ago.

infrastructure metabolomics massspectrometry mass-spectrometry

2.0 match 9 stars 9.40 score 58 scripts 36 dependents

jrowen

rhandsontable:Interface to the 'Handsontable.js' Library

An R interface to the 'Handsontable' JavaScript library, which is a minimalist Excel-like data grid editor. See <https://handsontable.com/> for details.

Maintained by Jonathan Owen. Last updated 3 years ago.

handsontable htmlwidgets javascript shiny sparkline

1.5 match 389 stars 12.31 score 1.0k scripts 46 dependents

bpfaff

FRAPO:Financial Risk Modelling and Portfolio Optimisation with R

Accompanying package of the book 'Financial Risk Modelling and Portfolio Optimisation with R', second edition. The data sets used in the book are contained in this package.

Maintained by Bernhard Pfaff. Last updated 8 years ago.

3.9 match 11 stars 4.71 score 94 scripts

bioc

MOFA2:Multi-Omics Factor Analysis v2

The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, visualisation, imputation etc. are available.

Maintained by Ricard Argelaguet. Last updated 5 months ago.

dimensionreduction bayesian visualization factor-analysis mofa multi-omics

1.8 match 319 stars 10.02 score 502 scripts

biometry

bipartite:Visualising Bipartite Networks and Calculating Some (Ecological) Indices

Functions to visualise webs and calculate a series of indices commonly used to describe pattern in (ecological) webs. It focuses on webs consisting of only two levels (bipartite), e.g. pollination webs or predator-prey-webs. Visualisation is important to get an idea of what we are actually looking at, while the indices summarise different aspects of the web's topology.

Maintained by Carsten F. Dormann. Last updated 6 days ago.

cpp

1.7 match 37 stars 10.93 score 592 scripts 15 dependents

thongphamthe

PAFit:Generative Mechanism Estimation in Temporal Complex Networks

Statistical methods for estimating preferential attachment and node fitness generative mechanisms in temporal complex networks are provided. Thong Pham et al. (2015) <doi:10.1371/journal.pone.0137796>. Thong Pham et al. (2016) <doi:10.1038/srep32558>. Thong Pham et al. (2020) <doi:10.18637/jss.v092.i03>. Thong Pham et al. (2021) <doi:10.1093/comnet/cnab024>.

Maintained by Thong Pham. Last updated 12 months ago.

complex-networks fit-get-richer general-preferential-attachment minorize-maximization preferential-attachment rich-get-richer scale-free temporal-networks cpp openmp

2.8 match 17 stars 6.47 score 70 scripts

bioc

podkat:Position-Dependent Kernel Association Test

This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.

Maintained by Ulrich Bodenhofer. Last updated 5 months ago.

genetics wholegenome annotation variantannotation sequencing dataimport curl bzip2 xz-utils zlib cpp

3.5 match 5.02 score 6 scripts

shixiangwang

sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations

Genomic alterations including single nucleotide substitution, copy number alteration, etc. are the major force for cancer initialization and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called 'signature' (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.

Maintained by Shixiang Wang. Last updated 5 months ago.

bayesian-nmf bioinformatics cancer-research cnv copynumber-signatures cosmic-signatures dbs easy-to-use indel mutational-signatures nmf nmf-extraction sbs signature-extraction somatic-mutations somatic-variants visualization cpp

1.9 match 150 stars 9.48 score 123 scripts 2 dependents

bioc

assorthead:Assorted Header-Only C++ Libraries

Vendors an assortment of useful header-only C++ libraries. Bioconductor packages can use these libraries in their own C++ code by LinkingTo this package without introducing any additional dependencies. The use of a central repository avoids duplicate vendoring of libraries across multiple R packages, and enables better coordination of version updates across cohorts of interdependent C++ libraries.

Maintained by Aaron Lun. Last updated 12 days ago.

singlecell qualitycontrol normalization datarepresentation dataimport differentialexpression alignment

2.0 match 8.89 score 167 dependents

statmanrobin

Stat2Data:Datasets for Stat2

Datasets for the textbook Stat2: Modeling with Regression and ANOVA (second edition). The package also includes data for the first edition, Stat2: Building Models for a World of Data and a few functions for plotting diagnostics.

Maintained by Robin Lock. Last updated 6 years ago.

3.6 match 5 stars 4.94 score 544 scripts

laresbernardo

lares:Analytics & Machine Learning Sidekick

Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.

Maintained by Bernardo Lares. Last updated 23 days ago.

analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization

1.8 match 233 stars 9.84 score 185 scripts 1 dependents

rudjer

SparseM:Sparse Linear Algebra

Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.

Maintained by Roger Koenker. Last updated 8 months ago.

fortran

1.5 match 3 stars 11.47 score 306 scripts 1.5k dependents

kdevkdev

rineq:Concentration Index and Decomposition for Health Inequalities

Relative, generalized, and Erreygers corrected concentration index; plot Lorenz curves; and decompose health inequalities into contributing factors. The package currently works with (generalized) linear models, survival models, complex survey models, and marginal effects probit models. Originally forked by Brecht Devleesschauwer from the 'decomp' package (no longer on CRAN), 'rineq' is now maintained by Kaspar Walter Meili. Compared to the earlier 'rineq' version on 'github' by Brecht Devleesschauwer (<https://github.com/brechtdv/rineq>), the regression tree functionality has been removed. Improvements compared to earlier versions include improved plotting of decomposition and concentration, added functionality to calculate the concentration index with different methods, calculation of robust standard errors, and support for the decomposition analysis using marginal effects probit regression models. The development version is available at <https://github.com/kdevkdev/rineq>.

Maintained by Kaspar Meili. Last updated 2 months ago.

5.1 match 1 stars 3.48 score 2 scripts
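
As a rough illustration of the relative concentration index named in the rineq description, the sketch below uses the common covariance form C = 2 * cov(h, r) / mean(h), where r is the fractional rank in the socioeconomic ordering. It is not taken from the package: the function name `concentration_index` and its arguments are hypothetical, it is written in Python rather than R, and it ignores ties, survey weights and the generalized and Erreygers corrections.

```python
def concentration_index(health, wealth):
    """Relative concentration index via the covariance formula.

    Hypothetical helper for illustration; assumes no ties in `wealth`
    and equal weights for all individuals.
    """
    n = len(health)
    # Order individuals from poorest to richest.
    order = sorted(range(n), key=lambda i: wealth[i])
    h = [health[i] for i in order]
    # Fractional rank of the i-th poorest individual: (i + 0.5) / n.
    r = [(i + 0.5) / n for i in range(n)]
    mean_h = sum(h) / n
    mean_r = sum(r) / n
    cov = sum((hi - mean_h) * (ri - mean_r) for hi, ri in zip(h, r)) / n
    return 2.0 * cov / mean_h


if __name__ == "__main__":
    # All health concentrated in the richest of four people.
    print(concentration_index([0, 0, 0, 1], [10, 20, 30, 40]))
```

A value of 0 indicates no socioeconomic gradient in the health variable; positive values indicate concentration among the better-off.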

bioc

BiocBaseUtils:General utility functions for developing Bioconductor packages

The package provides utility functions related to package development. These include functions that replace slots, and selectors for show methods. It aims to coalesce the various helper functions often re-used throughout the Bioconductor ecosystem.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure bioconductor-package core-package

2.0 match 4 stars 8.78 score 3 scripts 158 dependents

agalecki

nlmeU:Datasets and Utility Functions Enhancing Functionality of 'nlme' Package

Datasets and utility functions enhancing the functionality of the 'nlme' package. Datasets, functions and scripts are described in the book 'Linear Mixed-Effects Models: A Step-by-Step Approach' by Galecki and Burzykowski (2013). The package is under development.

Maintained by Andrzej Galecki. Last updated 3 years ago.

3.4 match 5.08 score 135 scripts 6 dependents

adeverse

adegraphics:An S4 Lattice-Based Package for the Representation of Multivariate Data

Graphical functionalities for the representation of multivariate data. It is a complete re-implementation of the functions available in the 'ade4' package.

Maintained by Aurélie Siberchicot. Last updated 8 months ago.

1.7 match 9 stars 10.37 score 386 scripts 6 dependents

aphalo

photobiologyFilters:Spectral Transmittance and Spectral Reflectance Data

Spectral 'transmittance' data for frequently used filters and similar materials. Plastic sheets and films; photography filters; theatrical gels; machine-vision filters; various types of window glass; optical glass and some laboratory plastics and glassware. Spectral reflectance data for frequently encountered materials. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 5 days ago.

3.4 match 5.08 score 40 scripts

inlabru-org

fmesher:Triangle Meshes and Related Geometry Tools

Generate planar and spherical triangle meshes, compute finite element calculations for 1- and 2-dimensional flat and curved manifolds with associated basis function spaces, methods for lines and polygons, and transparent handling of coordinate reference systems and coordinate transformation, including 'sf' and 'sp' geometries. The core 'fmesher' library code was originally part of the 'INLA' package, and implements parts of "Triangulations and Applications" by Hjelle and Daehlen (2006) <doi:10.1007/3-540-33261-8>.

Maintained by Finn Lindgren. Last updated 2 days ago.

cpp

1.5 match 16 stars 11.18 score 261 scripts 26 dependents

bioc

affy:Methods for Affymetrix Oligonucleotide Arrays

The package contains functions for exploratory oligonucleotide array analysis. The dependence on tkWidgets concerns only a few convenience functions. 'affy' is fully functional without it.

Maintained by Robert D. Shear. Last updated 2 months ago.

microarray onechannel preprocessing

1.5 match 11.12 score 2.5k scripts 98 dependents

elies-ramon

kerntools:Kernel Functions and Tools for Machine Learning Applications

Kernel functions for diverse types of data (including, but not restricted to: nonnegative and real vectors, real matrices, categorical and ordinal variables, sets, strings), plus other utilities like kernel similarity, kernel Principal Components Analysis (PCA) and features' importance for Support Vector Machines (SVMs), which expand other 'R' packages like 'kernlab'.

Maintained by Elies Ramon. Last updated 25 days ago.

kernel-methods pca

3.5 match 1 stars 4.73 score 12 scripts

bioc

BASiCS:Bayesian Analysis of Single-Cell Sequencing data

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.

Maintained by Catalina Vallejos. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell differentialexpression bayesian cellbiology bioconductor-package gene-expression rcpp rcpparmadillo scrna-seq single-cell openblas cpp openmp

1.6 match 83 stars 10.26 score 368 scripts 1 dependents

tvganesh

yorkr:Analyze Cricket Performances Based on Data from Cricsheet

Analyzing performances of cricketers and cricket teams based on 'yaml' match data from Cricsheet <https://cricsheet.org/>.

Maintained by Tinniam V Ganesh. Last updated 2 years ago.

3.3 match 17 stars 5.00 score 118 scripts

dami82

mutSignatures:Decipher Mutational Signatures from Somatic Mutational Catalogs

Cancer cells accumulate DNA mutations as result of DNA damage and DNA repair processes. This computational framework is aimed at deciphering DNA mutational signatures operating in cancer. The framework includes modules that support raw data import and processing, mutational signature extraction, and results interpretation and visualization. The framework accepts widely used file formats storing information about DNA variants, such as Variant Call Format files. The framework performs Non-Negative Matrix Factorization to extract mutational signatures explaining the observed set of DNA mutations. Bootstrapping is performed as part of the analysis. The framework supports parallelization and is optimized for use on multi-core systems. The software was described by Fantini D et al (2020) <doi:10.1038/s41598-020-75062-0> and is based on a custom R-based implementation of the original MATLAB WTSI framework by Alexandrov LB et al (2013) <doi:10.1016/j.celrep.2012.12.008>.

Maintained by Damiano Fantini. Last updated 2 years ago.

2.8 match 14 stars 5.83 score 48 scripts

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data is stored in csv format. The package also allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 1 day ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

1.3 match 109 stars 13.20 score 342 scripts 3 dependents

damianobaldan

riverconn:Fragmentation and Connectivity Indices for Riverscapes

Indices for assessing riverscape fragmentation, including the Dendritic Connectivity Index, the Population Connectivity Index, the River Fragmentation Index, the Probability of Connectivity, and the Integral Index of Connectivity. For a review, see Jumani et al. (2020) <doi:10.1088/1748-9326/abcb37> and Baldan et al. (2022) <doi:10.1016/j.envsoft.2022.105470>. Functions to calculate the temporal improvement of the indices when fragmentation due to barriers is reduced are also included.

Maintained by Damiano Baldan. Last updated 12 months ago.

3.4 match 9 stars 4.77 score 13 scripts

rekyt

fdcoexist:Multi-Species Trait-Based Coexistence Model in Discrete time

A modified Beverton-Holt model used in the Denelle, Grenié et al. manuscript that expresses environmental filtering, limiting similarity and hierarchical competition explicitly as a function of species traits. This package provides all the code necessary to rerun the analyses of the manuscript.

Maintained by Matthias Grenié. Last updated 2 years ago.

6.0 match 2.70 score 1 script
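
For orientation, the classic single-species Beverton-Holt recursion in discrete time can be sketched as follows. This is the textbook form only, not the trait-based modification used by fdcoexist; the function name `beverton_holt` and the parameters `growth` and `capacity` are hypothetical, and the example is in Python rather than R.

```python
def beverton_holt(n0, growth, capacity, steps):
    """Iterate the classic Beverton-Holt model.

    N[t+1] = growth * N[t] / (1 + (growth - 1) / capacity * N[t])

    Hypothetical helper for illustration; returns the whole trajectory,
    starting with the initial abundance n0.
    """
    trajectory = [float(n0)]
    for _ in range(steps):
        n = trajectory[-1]
        trajectory.append(growth * n / (1.0 + (growth - 1.0) / capacity * n))
    return trajectory


if __name__ == "__main__":
    # With growth > 1 the population approaches the carrying capacity.
    traj = beverton_holt(n0=10, growth=2.0, capacity=100.0, steps=50)
    print(traj[1], traj[-1])
```

In this parameterisation the carrying capacity is a fixed point of the recursion, so a population that starts there stays there.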

luisagi

enmpa:Ecological Niche Modeling using Presence-Absence Data

A set of tools to perform Ecological Niche Modeling with presence-absence data. It includes algorithms for data partitioning, model fitting, calibration, evaluation, selection, and prediction. Other functions help to explore signals of ecological niche using univariate and multivariate analyses, and model features such as variable response curves and variable importance. Unique characteristics of this package are the ability to exclude models with concave quadratic responses, and the option to clamp model predictions to specific variables. These tools are implemented following principles proposed in Cobos et al., (2022) <doi:10.17161/bi.v17i.15985>, Cobos et al., (2019) <doi:10.7717/peerj.6281>, and Peterson et al., (2008) <doi:10.1016/j.ecolmodel.2007.11.008>.

Maintained by Luis F. Arias-Giraldo. Last updated 3 months ago.

cpp

3.8 match 5 stars 4.35 score 5 scripts

bioc

decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting

Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.

Maintained by Rosario M. Piro. Last updated 5 months ago.

software snp sequencing dnaseq genomicvariation somaticmutation biomedicalinformatics genetics biologicalquestion statisticalmethod

3.4 match 1 stars 4.78 score 10 scripts 1 dependents

hypertidy

PROJ:Generic Coordinate System Transformations Using 'PROJ'

A wrapper around the generic coordinate transformation software 'PROJ' that transforms coordinates from one coordinate reference system ('CRS') to another. This includes cartographic projections as well as geodetic transformations. The intention is for this package to be used by user-packages such as 'reproj', and that the older 'PROJ.4' and version 5 pathways be provided by the 'proj4' package.

Maintained by Michael D. Sumner. Last updated 9 months ago.

proj

1.5 match 16 stars 10.53 score 82 scripts 27 dependents

robustport

facmodTS:Time Series Models for Asset Returns

Supports teaching methods of estimating and testing time series models for use in robust portfolio construction and analysis. Unique in providing not only classical least squares, but also modern robust model fitting methods which are not much influenced by outliers. Includes returns and risk decompositions, with user choice of standard deviation, value-at-risk, and expected shortfall risk measures. "Robust Statistics Theory and Methods (with R)", R. A. Maronna, R. D. Martin, V. J. Yohai, M. Salibian-Barrera (2019) <doi:10.1002/9781119214656>.

Maintained by Doug Martin. Last updated 8 days ago.

5.3 match 1 stars 3.00 score

leelabsg

SKAT:SNP-Set (Sequence) Kernel Association Test

Functions for kernel-regression-based association tests including Burden test, SKAT and SKAT-O. These methods aggregate individual SNP score statistics in a SNP set and efficiently compute SNP-set level p-values.

Maintained by Seunggeun (Shawn) Lee. Last updated 1 month ago.

sequence cpp

1.7 match 45 stars 9.70 score 268 scripts 16 dependents

nanne-aben

TANDEM:A Two-Stage Approach to Maximize Interpretability of Drug Response Models Based on Multiple Molecular Data Types

A two-stage regression method that can be used when various input data types are correlated, for example gene expression and methylation in drug response prediction. In the first stage it uses the upstream features (such as methylation) to predict the response variable (such as drug response), and in the second stage it uses the downstream features (such as gene expression) to predict the residuals of the first stage. In our manuscript (Aben et al., 2016, <doi:10.1093/bioinformatics/btw449>), we show that using TANDEM prevents the model from being dominated by gene expression and that the features selected by TANDEM are more interpretable.

Maintained by Nanne Aben. Last updated 5 years ago.

4.0 match 2 stars 4.00 score 9 scripts

berwinturlach

quadprog:Functions to Solve Quadratic Programming Problems

This package contains routines and documentation for solving quadratic programming problems.

Maintained by Berwin A. Turlach. Last updated 5 years ago.

fortran openblas

1.6 match 3 stars 10.27 score 972 scripts 1.2k dependents

venelin

POUMM:The Phylogenetic Ornstein-Uhlenbeck Mixed Model

The Phylogenetic Ornstein-Uhlenbeck Mixed Model (POUMM) allows one to estimate the phylogenetic heritability of continuous traits, to test hypotheses of neutral evolution versus stabilizing selection, to quantify the strength of stabilizing selection, to estimate measurement error and to make predictions about the evolution of a phenotype and phenotypic variation in a population. The package implements combined maximum likelihood and Bayesian inference of the univariate Phylogenetic Ornstein-Uhlenbeck Mixed Model, fast parallel likelihood calculation, maximum likelihood inference of the genotypic values at the tips, functions for summarizing and plotting traces and posterior samples, and functions for simulation of a univariate continuous trait evolution model along a phylogenetic tree. So far, the package has been used for estimating the heritability of quantitative traits in macroevolutionary and epidemiological studies, see e.g. Bertels et al. (2017) <doi:10.1093/molbev/msx246> and Mitov and Stadler (2018) <doi:10.1093/molbev/msx328>. The algorithm for parallel POUMM likelihood calculation has been published in Mitov and Stadler (2019) <doi:10.1111/2041-210X.13136>.

Maintained by Venelin Mitov. Last updated 4 years ago.

cpp

3.2 match 4 stars 4.94 score 22 scripts

predictiveecology

SpaDES.core:Core Utilities for Developing and Running Spatially Explicit Discrete Event Models

Provides the core framework for a discrete event system to implement a complete data-to-decisions, reproducible workflow. The core components facilitate the development of modular pieces, and enable the user to include additional functionality by running user-built modules. Includes conditional scheduling, restart after interruption, packaging of reusable modules, tools for developing arbitrary automated workflows, automated interweaving of modules of different temporal resolution, and tools for visualizing and understanding the within-project dependencies. The suggested package 'NLMR' can be installed from the repository (<https://PredictiveEcology.r-universe.dev>).

Maintained by Eliot J B McIntire. Last updated 18 days ago.

discrete-events-simulations simulation-framework simulation-modeling

1.5 match 10 stars 10.61 score 142 scripts 6 dependents

epiverse-trace

simulist:Simulate Disease Outbreak Line List and Contacts Data

Tools to simulate realistic raw case data for an epidemic in the form of line lists and contacts using a branching process. Simulated outbreaks are parameterised with epidemiological parameters and can have age-structured populations, age-stratified hospitalisation and death risk and time-varying case fatality risk.

Maintained by Joshua W. Lambert. Last updated 2 days ago.

epidemiology epiverse linelist outbreaks

2.0 match 9 stars 7.86 score 27 scripts

xiangzhou09

rbw:Residual Balancing Weights for Marginal Structural Models

Residual balancing is a robust method of constructing weights for marginal structural models, which can be used to estimate (a) the average treatment effect in a cross-sectional observational study, (b) controlled direct/mediator effects in causal mediation analysis, and (c) the effects of time-varying treatments in panel data (Zhou and Wodtke 2020 <doi:10.1017/pan.2020.2>). This package provides three functions, rbwPoint(), rbwMed(), and rbwPanel(), that produce residual balancing weights for estimating (a), (b), (c), respectively.

Maintained by Xiang Zhou. Last updated 3 years ago.

3.4 match 9 stars 4.65 score 5 scripts

vanduttran

ConsensusOPLS:Consensus OPLS for Multi-Block Data Fusion

Merging data from multiple sources is a relevant approach for comprehensively evaluating complex systems. However, the inherent problems encountered when analyzing single tables are amplified with the generation of multi-block datasets, and finding the relationships between data layers of increasing complexity constitutes a challenging task. For that purpose, a generic methodology is proposed by combining the strength of established data analysis strategies, i.e. multi-block approaches and the Orthogonal Partial Least Squares (OPLS) framework to provide an efficient tool for the fusion of data obtained from multiple sources. The package enables quick and efficient implementation of the consensus OPLS model for any horizontal multi-block data structures (observation-based matching). Moreover, it offers an interesting range of metrics and graphics to help to determine the optimal number of components and check the validity of the model through permutation tests. Interpretation tools include score and loading plots, Variable Importance in Projection (VIP), functionality predict for SHAP computing, and performance coefficients such as R2, Q2, and DQ2 coefficients. J. Boccard and D.N. Rutledge (2013) <doi:10.1016/j.aca.2013.01.022>.

Maintained by Van Du T. Tran. Last updated 17 days ago.

6.8 match 2.30 score 9 scripts

ikosmidis

brglm2:Bias Reduction in Generalized Linear Models

Estimation and inference from generalized linear models based on various methods for bias reduction and maximum penalized likelihood with powers of the Jeffreys prior as penalty. The 'brglmFit' fitting method can achieve reduction of estimation bias by solving either the mean bias-reducing adjusted score equations in Firth (1993) <doi:10.1093/biomet/80.1.27> and Kosmidis and Firth (2009) <doi:10.1093/biomet/asp055>, or the median bias-reduction adjusted score equations in Kenne et al. (2017) <doi:10.1093/biomet/asx046>, or through the direct subtraction of an estimate of the bias of the maximum likelihood estimator from the maximum likelihood estimates as in Cordeiro and McCullagh (1991) <https://www.jstor.org/stable/2345592>. See Kosmidis et al (2020) <doi:10.1007/s11222-019-09860-6> for more details. Estimation in all cases takes place via a quasi Fisher scoring algorithm, and S3 methods for the construction of confidence intervals for the reduced-bias estimates are provided. In the special case of generalized linear models for binomial and multinomial responses (both ordinal and nominal), the adjusted score approaches to mean and median bias reduction have been found to return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation; see Kosmidis and Firth, 2020 <doi:10.1093/biomet/asaa052>, for a proof for mean bias reduction in logistic regression).

Maintained by Ioannis Kosmidis. Last updated 6 months ago.

adjusted-score-equations algorithms bias-reducing-adjustments bias-reduction estimation glm logistic-regression nominal-responses ordinal-responses regression regression-algorithms statistics

1.5 match 32 stars 10.41 score 106 scripts 10 dependents

bioc

PhyloProfile:PhyloProfile

PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.

Maintained by Vinh Tran. Last updated 6 days ago.

software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny

2.0 match 33 stars 7.77 score 10 scripts

jclavel

mvMORPH:Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data

Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous traits evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.

Maintained by Julien Clavel. Last updated 1 month ago.

openblas

1.6 match 17 stars 9.46 score 189 scripts 3 dependents

fabsig

gpboost:Combining Tree-Boosting with Gaussian Process and Mixed Effects Models

An R package that allows for combining tree-boosting with Gaussian process and mixed effects models. It also allows for independently doing tree-boosting as well as inference and prediction for Gaussian process and mixed effects models. See <https://github.com/fabsig/GPBoost> for more information on the software and Sigrist (2022, JMLR) <https://www.jmlr.org/papers/v23/20-322.html> and Sigrist (2023, TPAMI) <doi:10.1109/TPAMI.2022.3168152> for more information on the methodology.

Maintained by Fabio Sigrist. Last updated 25 days ago.

cpp openmp

3.7 match 4.20 score 212 scripts

paulnorthrop

lax:Loglikelihood Adjustment for Extreme Value Models

Performs adjusted inferences based on model objects fitted, using maximum likelihood estimation, by the extreme value analysis packages 'eva' <https://cran.r-project.org/package=eva>, 'evd' <https://cran.r-project.org/package=evd>, 'evir' <https://cran.r-project.org/package=evir>, 'extRemes' <https://cran.r-project.org/package=extRemes>, 'fExtremes' <https://cran.r-project.org/package=fExtremes>, 'ismev' <https://cran.r-project.org/package=ismev>, 'mev' <https://cran.r-project.org/package=mev>, 'POT' <https://cran.r-project.org/package=POT> and 'texmex' <https://cran.r-project.org/package=texmex>. Adjusted standard errors and an adjusted loglikelihood are provided, using the 'chandwich' package <https://cran.r-project.org/package=chandwich> and the object-oriented features of the 'sandwich' package <https://cran.r-project.org/package=sandwich>. The adjustment is based on a robust sandwich estimator of the parameter covariance matrix, based on the methodology in Chandler and Bate (2007) <doi:10.1093/biomet/asm015>. This can be used for cluster correlated data when interest lies in the parameters of the marginal distributions, or for performing inferences that are robust to certain types of model misspecification. Univariate extreme value models, including regression models, are supported.

Maintained by Paul J. Northrop. Last updated 1 year ago.

clustered-data clusters composite-likelihood evd extreme-value-analysis extreme-value-statistics extremes independence-loglikelihood loglikelihood-adjustment mle pot regression regression-modelling robust sandwich sandwich-estimator

3.6 match 3 stars 4.29 score 13 scripts

bblonder

hypervolume:High Dimensional Geometry, Set Operations, Projection, and Inference Using Kernel Density Estimation, Support Vector Machines, and Convex Hulls

Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses a stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.

Maintained by Benjamin Blonder. Last updated 2 months ago.

openblas cpp

1.6 match 23 stars 9.75 score 211 scripts 7 dependents

pharmaverse

sdtmchecks:Data Quality Checks for Study Data Tabulation Model (SDTM) Datasets

A series of checks to identify common issues in Study Data Tabulation Model (SDTM) datasets. These checks are intended to be generalizable, actionable, and meaningful for analysis.

Maintained by Will Harris. Last updated 3 months ago.

2.0 match 21 stars 7.66 score 15 scripts

r-gregmisc

gmodels:Various R Programming Tools for Model Fitting

Various R programming tools for model fitting.

Maintained by Gregory R. Warnes. Last updated 3 months ago.

1.5 match 1 stars 10.01 score 3.5k scripts 30 dependents

bnaras

multiview:Cooperative Learning for Multi-View Analysis

Cooperative learning combines the usual squared error loss of predictions with an agreement penalty to encourage the predictions from different data views to agree. By varying the weight of the agreement penalty, we get a continuum of solutions that include the well-known early and late fusion approaches. Cooperative learning chooses the degree of agreement (or fusion) in an adaptive manner, using a validation set or cross-validation to estimate test set prediction error. In the setting of cooperative regularized linear regression, the method combines the lasso penalty with the agreement penalty (Ding, D., Li, S., Narasimhan, B., Tibshirani, R. (2021) <doi:10.1073/pnas.2202113119>).

Maintained by Balasubramanian Narasimhan. Last updated 2 years ago.

cpp

5.2 match 2.95 score 18 scripts

bioc

Rsubread:Mapping, quantification and variant analysis of sequencing data

Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES etc). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.

Maintained by Wei Shi. Last updated 1 day ago.

sequencing alignment sequencematching rnaseq chipseq singlecell geneexpression generegulation genetics immunooncology snp geneticvariability preprocessing qualitycontrol genomeannotation genefusiondetection indeldetection variantannotation variantdetection multiplesequencealignment zlib

1.7 match 9.24 score 892 scripts 10 dependents

bczernecki

climate:Interface to Download Meteorological (and Hydrological) Datasets

Automates downloading of meteorological and hydrological data from publicly available repositories: OGIMET (<http://ogimet.com/index.phtml.en>), University of Wyoming - atmospheric vertical profiling data (<http://weather.uwyo.edu/upperair/>), Polish Institute of Meteorology and Water Management - National Research Institute (<https://danepubliczne.imgw.pl>), and National Oceanic & Atmospheric Administration (NOAA). This package also allows for searching geographical coordinates for each observation and calculating distances to the nearest stations.

Maintained by Bartosz Czernecki. Last updated 9 days ago.

climate climate-data imgw meteorological-data meteorology noaa-data ogimet sounding

2.0 match 88 stars 7.61 score 38 scripts

covaruber

evola:Evolutionary Algorithm

Runs a genetic algorithm using the 'AlphaSimR' machinery <doi:10.1093/g3journal/jkaa017> and the coalescent simulator 'MaCS' <doi:10.1101/gr.083634.108>.

Maintained by Giovanny Covarrubias-Pazaran. Last updated 5 days ago.

3.6 match 1 stars 4.23 score 3 scripts

bioc

imcRtools:Methods for imaging mass cytometry data analysis

This R package supports the handling and analysis of imaging mass cytometry and other highly multiplexed imaging data. The main functionality includes reading in single-cell data after image segmentation and measurement, data formatting to perform channel spillover correction and a number of spatial analysis approaches. First, cell-cell interactions are detected via spatial graph construction; these graphs can be visualized with cells representing nodes and interactions representing edges. Furthermore, per cell, its direct neighbours are summarized to allow spatial clustering. Per image/grouping level, interactions between types of cells are counted, averaged and compared against random permutations. In that way, types of cells that interact more (attraction) or less (avoidance) frequently than expected by chance are detected.

Maintained by Daniel Schulz. Last updated 5 months ago.

immunooncology singlecell spatial dataimport clustering imc single-cell

2.0 match 24 stars 7.58 score 126 scripts

bioc

tradeSeq:trajectory-based differential expression analysis for sequencing data

tradeSeq provides a flexible method for fitting regression models that can be used to find genes that are differentially expressed along one or multiple lineages in a trajectory. Based on the fitted models, it uses a variety of tests suited to answer different questions of interest, e.g. the discovery of genes for which expression is associated with pseudotime, or which are differentially expressed (in a specific region) along the trajectory. It fits a negative binomial generalized additive model (GAM) for each gene, and performs inference on the parameters of the GAM.

Maintained by Hector Roux de Bezieux. Last updated 5 months ago.

clustering regression timecourse differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics multiplecomparison visualization

1.5 match 247 stars 10.06 score 440 scripts

adibender

pammtools:Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis

The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi:10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated, competing risks and recurrent events data. pammtools provides a tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing as well as visualization.

Maintained by Andreas Bender. Last updated 2 months ago.

additive-models pamm pammtools piece-wise-exponential survival-analysis

1.7 match 48 stars 8.78 score 310 scripts 8 dependents

epiverse-trace

epidemics:Composable Epidemic Scenario Modelling

A library of compartmental epidemic models taken from the published literature, and classes to represent affected populations, public health response measures including non-pharmaceutical interventions on social contacts, non-pharmaceutical and pharmaceutical interventions that affect disease transmissibility, vaccination regimes, and disease seasonality, which can be combined to compose epidemic scenario models.

Maintained by Rosalind Eggo. Last updated 9 months ago.

decision-support epidemic-modelling epidemic-simulations epidemiology epiverse infectious-disease-dynamics model-library non-pharmaceutical-interventions rcpp rcppeigen scenario-analysis vaccination cpp

2.0 match 9 stars 7.48 score 59 scripts

r-multiverse

multitools:Tools for Contributing Packages to R-multiverse

'R-multiverse' is a community-curated collection of R package releases, powered by 'R-universe'. The 'multitools' package has tools for maintainers of packages in 'R-multiverse'.

Maintained by William Michael Landau. Last updated 10 months ago.

4.8 match 3 stars 3.13 score

momx

Momocs:Morphometrics using R

The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses, and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.

Maintained by Vincent Bonhomme. Last updated 1 year ago.

morphometrics

2.0 match 51 stars 7.42 score 346 scripts

ms609

TreeSearch:Phylogenetic Analysis with Discrete Character Data

Reconstruct phylogenetic trees from discrete data. Inapplicable character states are handled using the algorithm of Brazeau, Guillerme and Smith (2019) <doi:10.1093/sysbio/syy083> with the "Morphy" library, under equal or implied step weights. Contains a "shiny" user interface for interactive tree search and exploration of results, including character visualization, rogue taxon detection, tree space mapping, and cluster consensus trees (Smith 2022a, b) <doi:10.1093/sysbio/syab099>, <doi:10.1093/sysbio/syab100>. Profile Parsimony (Faith and Trueman, 2001) <doi:10.1080/10635150118627>, Successive Approximations (Farris, 1969) <doi:10.2307/2412182> and custom optimality criteria are implemented.

Maintained by Martin R. Smith. Last updated 3 days ago.

bioinformatics morphological-analysis phylogenetics research-tool tree-search cpp

1.9 match 7 stars 7.89 score 51 scripts

tommyjones

tidylda:Latent Dirichlet Allocation Using 'tidyverse' Conventions

Implements an algorithm for Latent Dirichlet Allocation (LDA), Blei et al. (2003) <https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf>, using style conventions from the 'tidyverse', Wickham et al. (2019) <doi:10.21105/joss.01686>, and 'tidymodels', Kuhn et al. <https://tidymodels.github.io/model-implementation-principles/>. Fitting is done via collapsed Gibbs sampling. Also implements several novel features for LDA such as guided models and transfer learning.

Maintained by Tommy Jones. Last updated 2 months ago.

cpp openmp

2.0 match 41 stars 7.36 score 53 scripts

bioc

variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments

Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.

Maintained by Gabriel E. Hoffman. Last updated 2 months ago.

rnaseq geneexpression genesetenrichment differentialexpression batcheffect qualitycontrol regression epigenetics functionalgenomics transcriptomics normalization preprocessing microarray immunooncology software

1.3 match 7 stars 11.69 score 1.1k scripts 3 dependents

ropensci

osmextract:Download and Import Open Street Map Data Extracts

Match, download, convert and import Open Street Map data extracts obtained from several providers.

Maintained by Andrea Gilardi. Last updated 2 months ago.

geo geofabrik-zone open-data osm osm-pbf

1.5 match 173 stars 9.73 score 342 scripts

waldronlab

SingleCellMultiModal:Integrating Multi-modal Single Cell Experiment datasets

SingleCellMultiModal is an ExperimentHub package that serves multiple datasets obtained from GEO and other sources and represents them as MultiAssayExperiment objects. We provide several multi-modal datasets including scNMT, 10X Multiome, seqFISH, CITEseq, SCoPE2, and others. The scope of the package is to provide data for benchmarking and analysis. To cite, use the 'citation' function and see <https://doi.org/10.1371/journal.pcbi.1011324>.

Maintained by Marcel Ramos. Last updated 4 months ago.

experimentdata singlecelldata reproducibleresearch experimenthub geo bioconductor-package u24ca289073

2.0 match 17 stars 7.29 score 60 scripts

bioc

parglms:support for parallelized estimation of GLMs/GEEs

This package provides support for parallelized estimation of GLMs/GEEs, catering for dispersed data.

Maintained by VJ Carey. Last updated 5 months ago.

4.4 match 3.30 score 3 scripts