Showing 200 of 643 results
openbiox
contribution:A Tiny Contribution Table Generator Based on 'ggplot2'
Contribution table for credit assignment based on 'ggplot2'. This can improve the author contribution information in academic journals and personal CV.
Maintained by Shixiang Wang. Last updated 2 years ago.
contribution credit ggplot2 research
74.4 match 11 stars 5.20 score 29 scripts
kjhealy
gssrdoc:Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.
Maintained by Kieran Healy. Last updated 11 months ago.
120.7 match 2.28 score 38 scripts
braverock
PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis
Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.
Maintained by Brian G. Peterson. Last updated 3 months ago.
14.9 match 222 stars 15.93 score 4.8k scripts 20 dependents
arnaudgallou
plume:A Simple Author Handler for Scientific Writing
Handles and formats author information in scientific writing in 'R Markdown' and 'Quarto'. 'plume' provides easy-to-use and flexible tools for injecting author metadata in 'YAML' headers as well as generating author and contribution lists (among others) as strings from tabular data.
Maintained by Arnaud Gallou. Last updated 30 days ago.
authorscontributioncontributionslistlistsmarkdownpaperpreprintquartoroleroles
22.5 match 21 stars 6.84 score 15 scripts
nicholasjclark
mvgam:Multivariate (Dynamic) Generalized Additive Models
Fit Bayesian Dynamic Generalized Additive Models to multivariate observations. Users can build nonlinear State-Space models that can incorporate semiparametric effects in observation and process components, using a wide range of observation families. Estimation is performed using Markov Chain Monte Carlo with Hamiltonian Monte Carlo in the software 'Stan'. References: Clark & Wells (2023) <doi:10.1111/2041-210X.13974>.
Maintained by Nicholas J Clark. Last updated 1 day ago.
bayesian-statisticsdynamic-factor-modelsecological-modellingforecastinggaussian-processgeneralised-additive-modelsgeneralized-additive-modelsjoint-species-distribution-modellingmultilevel-modelsmultivariate-timeseriesstantime-series-analysistimeseriesvector-autoregressionvectorautoregressioncpp
12.3 match 139 stars 9.85 score 117 scripts
bioc
SPIAT:Spatial Image Analysis of Tissues
SPIAT (**Sp**atial **I**mage **A**nalysis of **T**issues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.
Maintained by Yuzhou Feng. Last updated 2 days ago.
biomedicalinformaticscellbiologyspatialclusteringdataimportimmunooncologyqualitycontrolsinglecellsoftwarevisualization
12.0 match 22 stars 8.59 score 69 scripts
marberts
piar:Price Index Aggregation
Most price indexes are made with a two-step procedure, where period-over-period elemental indexes are first calculated for a collection of elemental aggregates at each point in time, and then aggregated according to a price index aggregation structure. These indexes can then be chained together to form a time series that gives the evolution of prices with respect to a fixed base period. This package contains a collection of functions that revolve around this work flow, making it easy to build standard price indexes, and implement the methods described by Balk (2008, <doi:10.1017/CBO9780511720758>), von der Lippe (2007, <doi:10.3726/978-3-653-01120-3>), and the CPI manual (2020, <doi:10.5089/9781484354841.069>) for bilateral price indexes.
Maintained by Steve Martin. Last updated 14 days ago.
economics inflation official-statistics statistics
11.4 match 4 stars 7.32 score 25 scripts
ropensci
git2r:Provides Access to Git Repositories
Interface to the 'libgit2' library, which is a pure C implementation of the 'Git' core methods. Provides access to 'Git' repositories to extract data and run some basic 'Git' commands.
Maintained by Stefan Widgren. Last updated 11 days ago.
git git-client libgit2 libgit2-library
6.0 match 218 stars 13.86 score 836 scripts 49 dependents
gbradburd
conStruct:Models Spatially Continuous and Discrete Population Genetic Structure
A method for modeling genetic data as a combination of discrete layers, within each of which relatedness may decay continuously with geographic distance. This package contains code for running analyses (which are implemented in the modeling language 'rstan') and visualizing and interpreting output. See the paper for more details on the model and its utility.
Maintained by Gideon Bradburd. Last updated 1 year ago.
8.0 match 35 stars 8.39 score 70 scripts
cran
BAT:Biodiversity Assessment Tools
Includes algorithms to assess alpha and beta diversity in all their dimensions (taxonomic, phylogenetic and functional). It allows performing a number of analyses based on species identities/abundances, phylogenetic/functional distances, trees, convex-hulls or kernel density n-dimensional hypervolumes depicting species relationships. Cardoso et al. (2015) <doi:10.1111/2041-210X.12310>.
Maintained by Pedro Cardoso. Last updated 1 year ago.
20.5 match 3.17 score 3 dependents
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 16 days ago.
ecological-modelling ecology ordination fortran openblas
3.3 match 472 stars 19.41 score 15k scripts 440 dependents
trivialfis
xgboost:Extreme Gradient Boosting
Extreme Gradient Boosting, which is an efficient implementation of the gradient boosting framework from Chen & Guestrin (2016) <doi:10.1145/2939672.2939785>. This package is its R interface. The package includes an efficient linear model solver and tree learning algorithms. The package can automatically do parallel computation on a single machine, which could be more than 10 times faster than existing gradient boosting packages. It supports various objective functions, including regression, classification and ranking. The package is made to be extensible, so that users can also easily define their own objectives.
Maintained by Jiaming Yuan. Last updated 8 months ago.
5.4 match 6 stars 11.70 score 13k scripts 112 dependents
baumer-lab
fec16:Data Package for the 2016 United States Federal Elections
Easily analyze relational data from the United States 2016 federal election cycle as reported by the Federal Election Commission. This package contains data about candidates, committees, and a variety of different financial expenditures. Data is from <https://www.fec.gov/data/browse-data/?tab=bulk-data>.
Maintained by Marium Tapal. Last updated 2 years ago.
11.6 match 2 stars 5.15 score 47 scripts
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
7.3 match 3 stars 8.20 score 7.8k scripts 11 dependents
nicolas-robette
GDAtools:Geometric Data Analysis
Many tools for Geometric Data Analysis (Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0>), such as MCA variants (Specific Multiple Correspondence Analysis, Class Specific Analysis), many graphical and statistical aids to interpretation (structuring factors, concentration ellipses, inductive tests, bootstrap validation, etc.) and multiple-table analysis (Multiple Factor Analysis, between- and inter-class analysis, Principal Component Analysis and Correspondence Analysis with Instrumental Variables, etc.).
Maintained by Nicolas Robette. Last updated 10 months ago.
10.1 match 10 stars 5.93 score 94 scripts 2 dependents
ropensci
charlatan:Make Fake Data
Make fake data that looks realistic, supporting addresses, person names, dates, times, colors, coordinates, currencies, digital object identifiers ('DOIs'), jobs, phone numbers, 'DNA' sequences, doubles and integers from distributions and within a range.
Maintained by Roel M. Hogervorst. Last updated 1 month ago.
data dataset fake-data faker peer-reviewed
5.8 match 296 stars 10.06 score 180 scripts 1 dependent
palaeoverse
palaeoverse:Prepare and Explore Data for Palaeobiological Analyses
Provides functionality to support data preparation and exploration for palaeobiological analyses, improving code reproducibility and accessibility. The wider aim of 'palaeoverse' is to bring the palaeobiological community together to establish agreed standards. The package currently includes functionality for data cleaning, binning (time and space), exploration, summarisation and visualisation. Reference datasets (e.g. Geological Time Scales <https://stratigraphy.org/chart>) and auxiliary functions are also provided. Details can be found in: Jones et al. (2023) <doi:10.1111/2041-210X.14099>.
Maintained by Lewis A. Jones. Last updated 5 months ago.
biodiversity fossil palaeobiology paleobiology
6.6 match 21 stars 8.57 score 44 scripts 1 dependent
bioc
DropletUtils:Utilities for Handling Single-Cell Droplet Data
Provides a number of utility functions for handling single-cell (RNA-seq) data from droplet technologies such as 10X Genomics. This includes data loading from count matrices or molecule information files, identification of cells from empty droplets, removal of barcode-swapped pseudo-cells, and downsampling of the count matrix.
Maintained by Jonathan Griffiths. Last updated 3 months ago.
immunooncologysinglecellsequencingrnaseqgeneexpressiontranscriptomicsdataimportcoveragezlibcpp
5.6 match 10.08 score 2.7k scripts 9 dependents
rsquaredacademy
olsrr:Tools for Building OLS Regression Models
Tools designed to make it easier for users, particularly beginner/intermediate R users to build ordinary least squares regression models. Includes comprehensive regression output, heteroskedasticity tests, collinearity diagnostics, residual diagnostics, measures of influence, model fit assessment and variable selection procedures.
Maintained by Aravind Hebbali. Last updated 4 months ago.
collinearity-diagnostics linear-models regression stepwise-regression
4.4 match 103 stars 12.19 score 1.4k scripts 4 dependents
best-practice-and-impact
aftables:Create Spreadsheet Publications Following Best Practice
Generate spreadsheet publications that follow best practice guidance from the UK government's Analysis Function, available at <https://analysisfunction.civilservice.gov.uk/policy-store/releasing-statistics-in-spreadsheets/>, with a focus on accessibility. See also the 'Python' package 'gptables'.
Maintained by Olivia Box Power. Last updated 24 days ago.
accessibility openxlsx reproducible-analytical-pipeline spreadsheet uk-gov-data-science
8.0 match 44 stars 6.72 score 4 scripts
rsoc
soc.ca:Specific Correspondence Analysis for the Social Sciences
Specific and class specific multiple correspondence analysis on survey-like data. Soc.ca is optimized to the needs of the social scientist and presents easily interpretable results in near publication ready quality.
Maintained by Anton Grau Larsen. Last updated 1 year ago.
12.8 match 14 stars 4.15 score 50 scripts
nschiett
fishualize:Color Palettes Based on Fish Species
Implementation of color palettes based on fish species.
Maintained by Nina M. D. Schiettekatte. Last updated 11 months ago.
6.2 match 155 stars 8.54 score 370 scripts
bioc
tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles
This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecellflowcytometrybioinformaticscytometrydata-sciencesingle-celltidyversecpp
6.9 match 19 stars 7.26 score 35 scripts
epiverse-trace
epiparameter:Classes and Helper Functions for Working with Epidemiological Parameters
Classes and helper functions for loading, extracting, converting, manipulating, plotting and aggregating epidemiological parameters for infectious diseases. Epidemiological parameters extracted from the literature are loaded from the 'epiparameterDB' R package.
Maintained by Joshua W. Lambert. Last updated 2 months ago.
data-access data-package epidemiology epiverse probability-distribution
4.8 match 33 stars 9.84 score 102 scripts 1 dependent
klmr
box:Write Reusable, Composable and Modular R Code
A modern module system for R. Organise code into hierarchical, composable, reusable modules, and use it effortlessly across projects via a flexible, declarative dependency loading syntax.
Maintained by Konrad Rudolph. Last updated 12 days ago.
3.8 match 888 stars 12.39 score 47 scripts 4 dependents
lleisong
itsdm:Isolation Forest-Based Presence-Only Species Distribution Modeling
Collection of R functions to do purely presence-only species distribution modeling with isolation forest (iForest) and its variations such as Extended isolation forest and SCiForest. See the details of these methods in references: Liu, F.T., Ting, K.M. and Zhou, Z.H. (2008) <doi:10.1109/ICDM.2008.17>, Hariri, S., Kind, M.C. and Brunner, R.J. (2019) <doi:10.1109/TKDE.2019.2947676>, Liu, F.T., Ting, K.M. and Zhou, Z.H. (2010) <doi:10.1007/978-3-642-15883-4_18>, Guha, S., Mishra, N., Roy, G. and Schrijvers, O. (2016) <https://proceedings.mlr.press/v48/guha16.html>, Cortes, D. (2021) <arXiv:2110.13402>. Additionally, Shapley values are used to explain model inputs and outputs. See details in references: Shapley, L.S. (1953) <doi:10.1515/9781400881970-018>, Lundberg, S.M. and Lee, S.I. (2017) <https://dl.acm.org/doi/abs/10.5555/3295222.3295230>, Molnar, C. (2020) <ISBN:978-0-244-76852-2>, Štrumbelj, E. and Kononenko, I. (2014) <doi:10.1007/s10115-013-0679-x>. itsdm also provides functions to diagnose variable response, analyze variable importance, draw spatial dependence of variables and examine variable contribution. As utilities, the package includes a few functions to download bioclimatic variables including 'WorldClim' version 2.0 (see Fick, S.E. and Hijmans, R.J. (2017) <doi:10.1002/joc.5086>) and 'CMCC-BioClimInd' (see Noce, S., Caporaso, L. and Santini, M. (2020) <doi:10.1038/s41597-020-00726-5>).
Maintained by Lei Song. Last updated 2 years ago.
isolation-forestoutlier-detectionpresence-onlymodelshapley-valuespecies-distribution-modelling
8.0 match 4 stars 5.59 score 65 scripts
vsousa
poolHelper:Simulates Pooled Sequencing Genetic Data
Simulates pooled sequencing data under a variety of conditions. Also allows for the evaluation of the average absolute difference between allele frequencies computed from genotypes and those computed from pooled data. Carvalho et al., (2022) <doi:10.1101/2023.01.20.524733>.
Maintained by João Carvalho. Last updated 2 years ago.
10.7 match 4.18 score 3 scripts 1 dependent
rkoenker
quantreg:Quantile Regression
Estimation and inference methods for models for conditional quantile functions: Linear and nonlinear parametric and non-parametric (total variation penalized) models for conditional quantiles of a univariate response and several methods for handling censored survival data. Portfolio selection methods based on expected shortfall risk are also now included. See Koenker, R. (2005) Quantile Regression, Cambridge U. Press, <doi:10.1017/CBO9780511754098> and Koenker, R. et al. (2017) Handbook of Quantile Regression, CRC Press, <doi:10.1201/9781315120256>.
Maintained by Roger Koenker. Last updated 6 days ago.
3.2 match 18 stars 13.93 score 2.6k scripts 1.5k dependents
centerforassessment
SGP:Student Growth Percentiles & Percentile Growth Trajectories
An analytic framework for the calculation of norm- and criterion-referenced academic growth estimates using large scale, longitudinal education assessment data as developed in Betebenner (2009) <doi:10.1111/j.1745-3992.2009.00161.x>.
Maintained by Damian W. Betebenner. Last updated 2 months ago.
percentile-growth-projections quantile-regression sgp sgp-analyses student-growth-percentiles student-growth-projections
4.5 match 20 stars 9.69 score 88 scripts
blasbenito
distantia:Advanced Toolset for Efficient Time Series Dissimilarity Analysis
Fast C++ implementation of Dynamic Time Warping for time series dissimilarity analysis, with applications in environmental monitoring and sensor data analysis, climate science, signal processing and pattern recognition, and financial data analysis. Built upon the ideas presented in Benito and Birks (2020) <doi:10.1111/ecog.04895>, provides tools for analyzing time series of varying lengths and structures, including irregular multivariate time series. Key features include individual variable contribution analysis, restricted permutation tests for statistical significance, and imputation of missing data via GAMs. Additionally, the package provides an ample set of tools to prepare and manage time series data.
Maintained by Blas M. Benito. Last updated 25 days ago.
dissimilarity dynamic-time-warping lock-step time-series cpp
7.2 match 23 stars 5.76 score 11 scripts
giscience
ohsome:An 'ohsome API' Client
A client that grants access to the power of the 'ohsome API' from R. It lets you analyze the rich data source of the 'OpenStreetMap (OSM)' history. You can retrieve the geometry of 'OSM' data at specific points in time, and you can get aggregated statistics on the evolution of 'OSM' elements and specify your own temporal, spatial and/or thematic filters.
Maintained by Oliver Fritz. Last updated 2 years ago.
heigit ohsome openstreetmap openstreetmap-data openstreetmap-history osm osm-data
8.2 match 11 stars 5.04 score 9 scripts
pharmaverse
datacutr:SDTM Datacut
Supports the process of applying a cut to Study Data Tabulation Model (SDTM) data, as part of the analysis of the data at specific points in time, normally during the investigation of clinical trials. The functions support the different approaches to cutting normally applied to the different SDTM domains.
Maintained by Tim Barnett. Last updated 1 month ago.
5.3 match 14 stars 7.48 score 11 scripts
gavinmdouglas
FuncDiv:Compute Contributional Diversity Metrics
Compute alpha and beta contributional diversity metrics, which are intended for linking taxonomic and functional microbiome data. See 'GitHub' repository for the tutorial: <https://github.com/gavinmdouglas/FuncDiv/wiki>. Citation: Gavin M. Douglas, Sunu Kim, Morgan G. I. Langille, B. Jesse Shapiro (2023) <doi:10.1093/bioinformatics/btac809>.
Maintained by Gavin Douglas. Last updated 2 years ago.
10.5 match 10 stars 3.70 score 1 script
inlabru-org
inlabru:Bayesian Latent Gaussian Modelling using INLA and Extensions
Facilitates spatial and general latent Gaussian modeling using integrated nested Laplace approximation via the INLA package (<https://www.r-inla.org>). Additionally, extends the GAM-like model class to more general nonlinear predictor expressions, and implements a log Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data. Model components are specified with general inputs and mapping methods to the latent variables, and the predictors are specified via general R expressions, with separate expressions for each observation likelihood model in multi-likelihood models. A prediction method based on fast Monte Carlo sampling allows posterior prediction of general expressions of the latent variables. Ecology-focused introduction in Bachl, Lindgren, Borchers, and Illian (2019) <doi:10.1111/2041-210X.13168>.
Maintained by Finn Lindgren. Last updated 3 days ago.
3.0 match 96 stars 12.62 score 832 scripts 6 dependents
ecospat
ecospat:Spatial Ecology Miscellaneous Methods
Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.
Maintained by Olivier Broennimann. Last updated 1 month ago.
4.0 match 32 stars 9.35 score 418 scripts 1 dependent
biodiverse
unmarked:Models for Data from Unmarked Animals
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 12 hours ago.
2.9 match 4 stars 13.03 score 652 scripts 12 dependents
rudeboybert
fivethirtyeight:Data and Code Behind the Stories and Interactives at 'FiveThirtyEight'
Datasets and code published by the data journalism website 'FiveThirtyEight' available at <https://github.com/fivethirtyeight/data>. Note that while we received guidance from editors at 'FiveThirtyEight', this package is not officially published by 'FiveThirtyEight'.
Maintained by Albert Y. Kim. Last updated 2 years ago.
data-science datajournalism fivethirtyeight statistics
3.3 match 453 stars 10.98 score 1.7k scripts
bioc
ChromSCape:Analysis of single-cell epigenomics datasets with a Shiny App
ChromSCape - Chromatin landscape profiling for Single Cells - is a ready-to-launch, user-friendly Shiny Application for the analysis of single-cell epigenomics datasets (scChIP-seq, scATAC-seq, scCUT&Tag, ...) from aligned data to differential analysis & gene set enrichment analysis. It is highly interactive, enables users to save their analysis and covers a wide range of analytical steps: QC, preprocessing, filtering, batch correction, dimensionality reduction, visualization, clustering, differential analysis and gene set analysis.
Maintained by Pacome Prompsy. Last updated 5 months ago.
shinyappssoftwaresinglecellchipseqatacseqmethylseqclassificationclusteringepigeneticsprincipalcomponentannotationbatcheffectmultiplecomparisonnormalizationpathwayspreprocessingqualitycontrolreportwritingvisualizationgenesetenrichmentdifferentialpeakcallingepigenomicsshinysingle-cellcpp
6.2 match 14 stars 5.83 score 16 scripts
jcrodriguez1989
rco:The R Code Optimizer
Automatically apply different strategies to optimize R code. 'rco' functions take R code as input and return R code as output.
Maintained by Juan Cruz Rodriguez. Last updated 4 months ago.
compiler fast gcc hpc optimization optimizer
5.3 match 82 stars 6.73 score
dgerlanc
portfolio:Analysing Equity Portfolios
Classes for analysing and implementing equity portfolios, including routines for generating tradelists and calculating exposures to user-specified risk factors.
Maintained by Daniel Gerlanc. Last updated 7 months ago.
finance portfolio-construction risk-modelling
5.3 match 15 stars 6.68 score 106 scripts
bioc
limpca:An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods
This package aims to provide a method for building linear models for high-dimensional designed data. limpca applies a GLM (General Linear Model) version of ASCA and APCA to analyse multivariate sample profiles generated by an experimental design. ASCA/APCA provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design and, contrary to MANOVA, they can deal with multivariate datasets having more variables than observations. The method can handle unbalanced designs.
Maintained by Manon Martin. Last updated 5 months ago.
statisticalmethodprincipalcomponentregressionvisualizationexperimentaldesignmultiplecomparisongeneexpressionmetabolomics
6.0 match 2 stars 5.73 score 2 scripts
moderndive
moderndive:Tidyverse-Friendly Introductory Linear Regression
Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.
Maintained by Albert Y. Kim. Last updated 3 months ago.
3.0 match 88 stars 11.35 score 1.8k scripts
r-lib
lintr:A 'Linter' for R Code
Checks adherence to a given style, syntax errors and possible semantic issues. Supports on the fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.
Maintained by Michael Chirico. Last updated 8 days ago.
2.0 match 1.2k stars 17.00 score 916 scripts 33 dependents
r-lidar
lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications
Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.
Maintained by Jean-Romain Roussel. Last updated 1 month ago.
als forestry las laz lidar point-cloud remote-sensing openblas cpp openmp
2.3 match 623 stars 14.47 score 844 scripts 8 dependents
marberts
gpindex:Generalized Price and Quantity Indexes
Tools to build and work with bilateral generalized-mean price indexes (and by extension quantity indexes), and indexes composed of generalized-mean indexes (e.g., superlative quadratic-mean indexes, GEKS). Covers the core mathematical machinery for making bilateral price indexes, computing price relatives, detecting outliers, and decomposing indexes, with wrappers for all common (and many uncommon) index-number formulas. Implements and extends many of the methods in Balk (2008, <doi:10.1017/CBO9780511720758>), von der Lippe (2007, <doi:10.3726/978-3-653-01120-3>), and the CPI manual (2020, <doi:10.5089/9781484354841.069>).
Maintained by Steve Martin. Last updated 2 days ago.
economics inflation official-statistics statistics
5.0 match 7 stars 6.63 score 29 scripts 1 dependent
mayer79
flashlight:Shed Light on Black Box Machine Learning Models
Shed light on black box machine learning models with the help of model performance, variable importance, global surrogate models, ICE profiles, partial dependence (Friedman J. H. (2001) <doi:10.1214/aos/1013203451>), accumulated local effects (Apley D. W. (2016) <arXiv:1612.08468>), further effects plots, interaction strength, and variable contribution breakdown (Gosiewska and Biecek (2019) <arXiv:1903.11420>). All tools are implemented to work with case weights and allow for stratified analysis. Furthermore, multiple flashlights can be combined and analyzed together.
Maintained by Michael Mayer. Last updated 8 months ago.
interpretability interpretable-machine-learning machine-learning xai
5.3 match 22 stars 6.25 score 54 scripts 1 dependent
adrientaudiere
MiscMetabar:Miscellaneous Functions for Metabarcoding Analysis
Facilitate the description, transformation, exploration, and reproducibility of metabarcoding analyses. 'MiscMetabar' is mainly built on top of the 'phyloseq', 'dada2' and 'targets' R packages. It helps to build reproducible and robust bioinformatics pipelines in R. 'MiscMetabar' makes ecological analysis of alpha and beta-diversity easier, more reproducible and more powerful by integrating a large number of tools. Important features are described in Taudière A. (2023) <doi:10.21105/joss.06038>.
Maintained by Adrien Taudière. Last updated 25 days ago.
sequencingmicrobiomemetagenomicsclusteringclassificationvisualizationampliconamplicon-sequencingbiodiversity-informaticsecologyilluminametabarcodingngs-analysis
5.1 match 17 stars 6.44 score 23 scripts
dietrichson
ProPublicaR:Access Functions for ProPublica's APIs
Provides wrapper functions to access ProPublica's Congress and Campaign Finance APIs. The Congress API provides near real-time access to legislative data from the House of Representatives, the Senate and the Library of Congress. The Campaign Finance API provides data from United States Federal Election Commission filings and other sources. The API covers summary information for candidates and committees, as well as certain types of itemized data. For more information about these APIs go to: <https://www.propublica.org/datastore/apis>.
Maintained by Aleksander Dietrichson. Last updated 2 years ago.
7.4 match 12 stars 4.38 score 1 script
nsaph-software
CRE:Interpretable Discovery and Inference of Heterogeneous Treatment Effects
Provides a new method for interpretable heterogeneous treatment effects characterization in terms of decision rules via an extensive exploration of heterogeneity patterns by an ensemble-of-trees approach, enforcing high stability in the discovery. It relies on a two-stage pseudo-outcome regression, and it is supported by theoretical convergence guarantees. Bargagli-Stoffi, F. J., Cadei, R., Lee, K., & Dominici, F. (2023) Causal rule ensemble: Interpretable Discovery and Inference of Heterogeneous Treatment Effects. arXiv preprint <doi:10.48550/arXiv.2009.09036>.
Maintained by Falco Joannes Bargagli Stoffi. Last updated 5 months ago.
5.0 match 13 stars 6.41 score 11 scripts
ropensci
rix:Reproducible Data Science Environments with 'Nix'
Simplifies the creation of reproducible data science environments using the 'Nix' package manager, as described in Dolstra (2006) <ISBN 90-393-4130-3>. The included `rix()` function generates a complete description of the environment as a `default.nix` file, which can then be built using 'Nix'. This results in project specific software environments with pinned versions of R, packages, linked system dependencies, and other tools. Additional helpers make it easy to run R code in 'Nix' software environments for testing and production.
Maintained by Bruno Rodrigues. Last updated 4 days ago.
nix peer-reviewed reproducibility reproducible-research
3.0 match 235 stars 10.54 score 67 scripts
gabrielodom
mvMonitoring:Multi-State Adaptive Dynamic Principal Component Analysis for Multivariate Process Monitoring
Use multi-state splitting to apply Adaptive-Dynamic PCA (ADPCA) to data generated from a continuous-time multivariate industrial or natural process. Employ PCA-based dimension reduction to extract linear combinations of relevant features, reducing computational burdens. For a description of ADPCA, see <doi:10.1007/s00477-016-1246-2>, the 2016 paper from Kazor et al. The multi-state application of ADPCA is from a manuscript under current revision entitled "Multi-State Multivariate Statistical Process Control" by Odom, Newhart, Cath, and Hering, and is expected to appear in Q1 of 2018.
Maintained by Gabriel Odom. Last updated 1 year ago.
5.9 match 4 stars 5.24 score 29 scripts
rstudio
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
3.5 match 54 stars 8.63 score 221 scripts 3 dependents
r-lib
vctrs:Vector Helpers
Defines new notions of prototype and size that are used to provide tools for consistent and well-founded type-coercion and size-recycling, and are in turn connected to ideas of type- and size-stability useful for analysing function interfaces.
Maintained by Davis Vaughan. Last updated 5 months ago.
1.6 match 290 stars 18.97 score 1.1k scripts 13k dependents
bioc
GSVA:Gene Set Variation Analysis for Microarray and RNA-Seq Data
Gene Set Variation Analysis (GSVA) is a non-parametric, unsupervised method for estimating variation of gene set enrichment through the samples of an expression data set. GSVA performs a change in coordinate systems, transforming the data from a gene by sample matrix to a gene-set by sample matrix, thereby allowing the evaluation of pathway enrichment for each sample. This new matrix of GSVA enrichment scores facilitates applying standard analytical methods like functional enrichment, survival analysis, clustering, CNV-pathway analysis or cross-tissue pathway analysis, in a pathway-centric manner.
Maintained by Robert Castelo. Last updated 4 days ago.
functionalgenomicsmicroarrayrnaseqpathwaysgenesetenrichmentgene-set-enrichmentgenomicspathway-enrichment-analysis
2.0 match 210 stars 14.72 score 1.6k scripts 19 dependents
bioc
iSEEhub:iSEE for the Bioconductor ExperimentHub
This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.
Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.
dataimportimmunooncology infrastructureshinyappssinglecellsoftwarebioconductorbioconductor-packagehacktoberfestisee
5.3 match 3 stars 5.56 score 4 scripts
samhforbes
eyetrackingR:Eye-Tracking Data Analysis
Addresses tasks along the pipeline from raw data to analysis and visualization for eye-tracking data. Offers several popular types of analyses, including linear and growth curve time analyses, onset-contingent reaction time analyses, as well as several non-parametric bootstrapping approaches. For references to the approach see Mirman, Dixon & Magnuson (2008) <doi:10.1016/j.jml.2007.11.006>, and Barr (2008) <doi:10.1016/j.jml.2007.09.002>.
Maintained by Samuel Forbes. Last updated 1 year ago.
3.6 match 22 stars 7.84 score 60 scripts
svkucheryavski
mdatools:Multivariate Data Analysis for Chemometrics
Projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.
Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.
3.9 match 35 stars 7.37 score 220 scripts 1 dependents
quantsulting
ghyp:Generalized Hyperbolic Distribution and Its Special Cases
Detailed functionality for working with the univariate and multivariate Generalized Hyperbolic distribution and its special cases (Hyperbolic (hyp), Normal Inverse Gaussian (NIG), Variance Gamma (VG), skewed Student-t and Gaussian distribution). In particular, it contains fitting procedures, an AIC-based model selection routine, and functions for the computation of density, quantile, probability, random variates, expected shortfall and some portfolio optimization and plotting routines as well as the likelihood ratio test. In addition, it contains the Generalized Inverse Gaussian distribution. See Chapter 3 of A. J. McNeil, R. Frey, and P. Embrechts. Quantitative risk management: Concepts, techniques and tools. Princeton University Press, Princeton (2005).
Maintained by Marc Weibel. Last updated 7 months ago.
5.0 match 5.58 score 90 scripts 8 dependents
harrelfe
Hmisc:Harrell Miscellaneous
Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.
Maintained by Frank E Harrell Jr. Last updated 2 days ago.
1.6 match 210 stars 17.61 score 17k scripts 750 dependents
kof-ch
tstools:A Time Series Toolbox for Official Statistics
Plot official statistics' time series conveniently: automatic legends, highlight windows, stacked bar charts with positive and negative contributions, sum-as-line option, two y-axes with automatic horizontal grids that fit both axes and other popular chart types. 'tstools' comes with a plethora of defaults to let you plot without setting an abundance of parameters first, but gives you the flexibility to tweak the defaults. In addition to charts, 'tstools' provides a super fast, 'data.table' backed time series I/O that allows the user to export / import long format, wide format and transposed wide format data to various file types.
Maintained by Stéphane Bisinger. Last updated 1 year ago.
4.3 match 11 stars 6.47 score 177 scripts
guido-s
netmeta:Network Meta-Analysis using Frequentist Methods
A comprehensive set of functions providing frequentist methods for network meta-analysis (Balduzzi et al., 2023) <doi:10.18637/jss.v106.i02> and supporting Schwarzer et al. (2015) <doi:10.1007/978-3-319-21416-0>, Chapter 8 "Network Meta-Analysis": - frequentist network meta-analysis following Rücker (2012) <doi:10.1002/jrsm.1058>; - additive network meta-analysis for combinations of treatments (Rücker et al., 2020) <doi:10.1002/bimj.201800167>; - network meta-analysis of binary data using the Mantel-Haenszel or non-central hypergeometric distribution method (Efthimiou et al., 2019) <doi:10.1002/sim.8158>, or penalised logistic regression (Evrenoglou et al., 2022) <doi:10.1002/sim.9562>; - rankograms and ranking of treatments by the Surface under the cumulative ranking curve (SUCRA) (Salanti et al., 2013) <doi:10.1016/j.jclinepi.2010.03.016>; - ranking of treatments using P-scores (frequentist analogue of SUCRAs without resampling) according to Rücker & Schwarzer (2015) <doi:10.1186/s12874-015-0060-8>; - split direct and indirect evidence to check consistency (Dias et al., 2010) <doi:10.1002/sim.3767>, (Efthimiou et al., 2019) <doi:10.1002/sim.8158>; - league table with network meta-analysis results; - 'comparison-adjusted' funnel plot (Chaimani & Salanti, 2012) <doi:10.1002/jrsm.57>; - net heat plot and design-based decomposition of Cochran's Q according to Krahn et al. (2013) <doi:10.1186/1471-2288-13-35>; - measures characterizing the flow of evidence between two treatments by König et al. (2013) <doi:10.1002/sim.6001>; - automated drawing of network graphs described in Rücker & Schwarzer (2016) <doi:10.1002/jrsm.1143>; - partial order of treatment rankings ('poset') and Hasse diagram for 'poset' (Carlsen & Bruggemann, 2014) <doi:10.1002/cem.2569>; (Rücker & Schwarzer, 2017) <doi:10.1002/jrsm.1270>; - contribution matrix as described in Papakonstantinou et al. (2018) <doi:10.12688/f1000research.14770.3> and Davies et al. (2022) <doi:10.1002/sim.9346>; - subgroup network meta-analysis.
Maintained by Guido Schwarzer. Last updated 1 day ago.
meta-analysisnetwork-meta-analysisrstudio
2.3 match 33 stars 11.82 score 199 scripts 10 dependents
frbcesab
rcompendium:Create a Package or Research Compendium Structure
Makes it easier to create an R package or research compendium (i.e. a predefined files/folders structure) so that users can focus on the code/analysis instead of wasting time organizing files. A full ready-to-work structure is set up with some additional features: version control, remote repository creation, CI/CD configuration (check package integrity under several OS, test code with 'testthat', and build and deploy website using 'pkgdown'). This package heavily relies on the R packages 'devtools' and 'usethis' and follows recommendations made by Wickham H. (2015) <ISBN:9781491910597> and Marwick B. et al. (2018) <doi:10.7287/peerj.preprints.3192v2>.
Maintained by Nicolas Casajus. Last updated 1 month ago.
reproducible-researchresearch-compendium
4.0 match 40 stars 6.72 score 22 scripts
safetygraphics
safetyGraphics:Interactive Graphics for Monitoring Clinical Trial Safety
A framework for evaluation of clinical trial safety. Users can interactively explore their data using the included 'Shiny' application.
Maintained by Jeremy Wildfire. Last updated 2 years ago.
3.2 match 98 stars 8.18 score 111 scripts
adeverse
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 12 days ago.
1.8 match 39 stars 14.96 score 2.2k scripts 256 dependents
epiverse-trace
cleanepi:Clean and Standardize Epidemiological Data
A package for cleaning and standardizing tabular data, tailored specifically for curating epidemiological data. It streamlines various data cleaning tasks that are typically expected when working with datasets in epidemiology. It returns the processed data in the same format, and generates a comprehensive report detailing the outcomes of each cleaning task.
Maintained by Karim Mané. Last updated 2 days ago.
data-cleaningepidemiologyepiverse
3.5 match 9 stars 7.44 score 19 scripts
harrison4192
autostats:Auto Stats
Automatically do statistical exploration. Create formulas using 'tidyselect' syntax, and then determine cross-validated model accuracy and variable contributions using 'glm' and 'xgboost'. Contains additional helper functions to create and modify formulas. Has a flagship function to quickly determine relationships between categorical and continuous variables in the data set.
Maintained by Harrison Tietze. Last updated 11 days ago.
3.8 match 6 stars 6.76 score 5 scripts 2 dependents
svmiller
stevemisc:Steve's Miscellaneous Functions
These are miscellaneous functions that I find useful for my research and teaching. The contents include themes for plots, functions for simulating quantities of interest from regression models, functions for simulating various forms of fake data for instructional/research purposes, and many more. All told, the functions provided here are broadly useful for data organization, data presentation, data recoding, and data simulation.
Maintained by Steve Miller. Last updated 5 days ago.
dplyrmixed-effects-modelsmultivariate-normal-distributiontidyverse
3.8 match 10 stars 6.85 score 392 scripts 2 dependents
kassambara
factoextra:Extract and Visualize the Results of Multivariate Data Analyses
Provides some easy-to-use functions to extract and visualize the output of multivariate data analyses, including 'PCA' (Principal Component Analysis), 'CA' (Correspondence Analysis), 'MCA' (Multiple Correspondence Analysis), 'FAMD' (Factor Analysis of Mixed Data), 'MFA' (Multiple Factor Analysis) and 'HMFA' (Hierarchical Multiple Factor Analysis) functions from different R packages. It also contains functions for simplifying some clustering analysis steps and provides elegant 'ggplot2'-based data visualization.
Maintained by Alboukadel Kassambara. Last updated 5 years ago.
1.8 match 363 stars 14.13 score 15k scripts 52 dependents
mahshaaban
pcr:Analyzing Real-Time Quantitative PCR Data
Calculates the amplification efficiency and curves from real-time quantitative PCR (Polymerase Chain Reaction) data. Estimates the relative expression from PCR data using the double delta CT and the standard curve methods, Livak & Schmittgen (2001) <doi:10.1006/meth.2001.1262>. Tests for statistical significance using two-group tests and linear regression, Yuan et al. (2006) <doi:10.1186/1471-2105-7-85>.
Maintained by Mahmoud Ahmed. Last updated 8 months ago.
data-analysesmolecular-biologyqpcr
3.5 match 28 stars 7.25 score 63 scripts
data-cleaning
validate:Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 11 days ago.
2.0 match 418 stars 12.50 score 448 scripts 9 dependents
pavel-fibich
gawdis:Multi-Trait Dissimilarity with more Uniform Contributions
R function gawdis() produces multi-trait dissimilarity with more uniform contributions of different traits. de Bello et al. (2021) <doi:10.1111/2041-210X.13537> presented the approach based on minimizing the differences in the correlation between the dissimilarity of each trait, or groups of traits, and the multi-trait dissimilarity. This is done using either an analytic or a numerical solution, both available in the function.
Maintained by Pavel Fibich. Last updated 2 years ago.
dissimilarityfdgowdismulti-trait-dissimilaritytrait
4.8 match 5 stars 5.20 score 21 scripts 1 dependents
rishvish
DImodelsVis:Visualising and Interpreting Statistical Models Fit to Compositional Data
Statistical models fit to compositional data are often difficult to interpret due to the sum to 1 constraint on data variables. 'DImodelsVis' provides novel visualisation tools to aid with the interpretation of models fit to compositional data. All visualisations in the package are created using the 'ggplot2' plotting framework and can be extended like every other 'ggplot' object.
Maintained by Rishabh Vishwakarma. Last updated 6 months ago.
6.7 match 3.70 score 7 scripts
bioc
maftools:Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.
Maintained by Anand Mayakonda. Last updated 5 months ago.
datarepresentationdnaseqvisualizationdrivermutationvariantannotationfeatureextractionclassificationsomaticmutationsequencingfunctionalgenomicssurvivalbioinformaticscancer-genome-atlascancer-genomicsgenomicsmaf-filestcgacurlbzip2xz-utilszlib
1.7 match 459 stars 14.63 score 948 scripts 18 dependents
ropensci
stplanr:Sustainable Transport Planning
Tools for transport planning with an emphasis on spatial transport data and non-motorized modes. The package was originally developed to support the 'Propensity to Cycle Tool', a publicly available strategic cycle network planning tool (Lovelace et al. 2017) <doi:10.5198/jtlu.2016.862>, but has since been extended to support public transport routing and accessibility analysis (Moreno-Monroy et al. 2017) <doi:10.1016/j.jtrangeo.2017.08.012> and routing with locally hosted routing engines such as 'OSRM' (Lowans et al. 2023) <doi:10.1016/j.enconman.2023.117337>. The main functions are for creating and manipulating geographic "desire lines" from origin-destination (OD) data (building on the 'od' package); calculating routes on the transport network locally and via interfaces to routing services such as <https://cyclestreets.net/> (Desjardins et al. 2021) <doi:10.1007/s11116-021-10197-1>; and calculating route segment attributes such as bearing. The package implements the 'travel flow aggregation' method described in Morgan and Lovelace (2020) <doi:10.1177/2399808320942779> and the 'OD jittering' method described in Lovelace et al. (2022) <doi:10.32866/001c.33873>. Further information on the package's aim and scope can be found in the vignettes and in a paper in the R Journal (Lovelace and Ellison 2018) <doi:10.32614/RJ-2018-053>, and in a paper outlining the landscape of open source software for geographic methods in transport planning (Lovelace, 2021) <doi:10.1007/s10109-020-00342-2>.
Maintained by Robin Lovelace. Last updated 7 months ago.
cyclecyclingdesire-linesorigin-destinationpeer-reviewedpubic-transportroute-networkroutesroutingspatialtransporttransport-planningtransportationwalking
2.0 match 427 stars 12.31 score 684 scripts 3 dependents
zabore
riskclustr:Functions to Study Etiologic Heterogeneity
A collection of functions related to the study of etiologic heterogeneity both across disease subtypes and across individual disease markers. The included functions allow one to quantify the extent of etiologic heterogeneity in the context of a case-control study, and provide p-values to test for etiologic heterogeneity across individual risk factors. Begg CB, Zabor EC, Bernstein JL, Bernstein L, Press MF, Seshan VE (2013) <doi:10.1002/sim.5902>.
Maintained by Emily C. Zabor. Last updated 1 year ago.
5.1 match 1 stars 4.81 score 26 scripts
davidfirth
relimp:Relative Contribution of Effects in a Regression Model
Functions to facilitate inference on the relative importance of predictors in a linear or generalized linear model, and a couple of useful Tcl/Tk widgets.
Maintained by David Firth. Last updated 9 years ago.
4.7 match 1 stars 5.11 score 42 scripts 61 dependents
statisticsnorway
GaussSuppression:Tabular Data Suppression using Gaussian Elimination
A statistical disclosure control tool to protect tables by suppression using the Gaussian elimination secondary suppression algorithm (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6>. A suggestion is to start by working with functions SuppressSmallCounts() and SuppressDominantCells(). These functions use primary suppression functions for the minimum frequency rule and the dominance rule, respectively. Novel functionality for suppression of disclosive cells is also included. General primary suppression functions can be supplied as input to the general workhorse function, GaussSuppressionFromData(). Suppressed frequencies can be replaced by synthetic decimal numbers as described in Langsrud (2019) <doi:10.1007/s11222-018-9848-9>.
Maintained by Øyvind Langsrud. Last updated 2 days ago.
3.6 match 2 stars 6.61 score 50 scripts
sbgraves237
sos:Search Contributed R Packages, Sort by Package
Search contributed R packages, sort by package.
Maintained by Spencer Graves. Last updated 9 months ago.
3.5 match 2 stars 6.82 score 241 scripts 3 dependents
hughparsonage
grattan:Australian Tax Policy Analysis
Utilities to cost and evaluate Australian tax policy, including fast projections of personal income tax collections, high-performance tax and transfer calculators, and an interface to common indices from the Australian Bureau of Statistics. Written to support Grattan Institute's Australian Perspectives program, and related projects. Access to the Australian Taxation Office's sample files of personal income tax returns is assumed.
Maintained by Hugh Parsonage. Last updated 12 months ago.
3.8 match 25 stars 6.34 score 124 scripts
kjakobse
EpiForsk:Code Sharing at the Department of Epidemiological Research at Statens Serum Institut
This is a collection of assorted functions and examples collected from various projects. Currently we have functionalities for simplifying overlapping time intervals, Charlson comorbidity score constructors for Danish data, getting frequency for multiple variables, getting standardized output from logistic and log-linear regressions, sibling design linear regression functionalities, a method for calculating the confidence intervals for functions of parameters from a GLM, Bayes equivalent for hypothesis testing with asymptotic Bayes factor, and several help functions for generalized random forest analysis using 'grf'.
Maintained by Kim Daniel Jakobsen. Last updated 1 year ago.
5.3 match 4.48 score 8 scripts
datalorax
equatiomatic:Transform Models into 'LaTeX' Equations
The goal of 'equatiomatic' is to reduce the pain associated with writing 'LaTeX' formulas from fitted models. The primary function of the package, extract_eq(), takes a fitted model object as its input and returns the corresponding 'LaTeX' code for the model.
Maintained by Philippe Grosjean. Last updated 6 days ago.
2.0 match 619 stars 11.75 score 424 scripts 5 dependents
biorgeo
bioregion:Comparison of Bioregionalisation Methods
The main purpose of this package is to propose a transparent methodological framework to compare bioregionalisation methods based on hierarchical and non-hierarchical clustering algorithms (Kreft & Jetz (2010) <doi:10.1111/j.1365-2699.2010.02375.x>) and network algorithms (Lenormand et al. (2019) <doi:10.1002/ece3.4718> and Leroy et al. (2019) <doi:10.1111/jbi.13674>).
Maintained by Maxime Lenormand. Last updated 10 days ago.
biogeographybioregionbioregionalizationcpp
3.7 match 7 stars 6.27 score 11 scripts
hugaped
MBNMAdose:Dose-Response MBNMA Models
Fits Bayesian dose-response model-based network meta-analysis (MBNMA) models that incorporate multiple doses within an agent by modelling different dose-response functions, as described by Mawdsley et al. (2016) <doi:10.1002/psp4.12091>. By modelling dose-response relationships this can connect networks of evidence that might otherwise be disconnected, and can improve precision on treatment estimates. Several common dose-response functions are provided; others may be added by the user. Various characteristics and assumptions can be flexibly added to the models, such as shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting at the treatment level.
Maintained by Hugo Pedder. Last updated 1 month ago.
3.5 match 10 stars 6.60 score
chjackson
flexsurv:Flexible Parametric Survival and Multi-State Models
Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.
Maintained by Christopher Jackson. Last updated 2 months ago.
1.7 match 57 stars 13.31 score 632 scripts 43 dependents
ropensci
weatherOz:An API Client for Australian Weather and Climate Data Resources
Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). It also covers precis and coastal forecasts from the Australian government's Bureau of Meteorology ('BOM'), as well as downloading and importing radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology ('BOM') and accessed through 'SILO' (Scientific Information for Land Owners) Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.
Maintained by Rodrigo Pires. Last updated 19 days ago.
dpirdbommeteorological-dataweather-forecastaustraliaweatherweather-datameteorologywestern-australiaaustralia-bureau-of-meteorologywestern-australia-agricultureaustralia-agricultureaustralia-climateaustralia-weatherapi-clientclimatedatarainfallweather-api
2.7 match 32 stars 8.54 score 40 scripts
ropensci
nasapower:NASA POWER API Client
An API client for NASA POWER global meteorology, surface solar energy and climatology data API. POWER (Prediction Of Worldwide Energy Resources) data are freely available for download with varying spatial resolutions dependent on the original data and with several temporal resolutions depending on the POWER parameter and community. This work is funded through the NASA Earth Science Directorate Applied Science Program. For more on the data themselves, the methodologies used in creating them, a web-based data viewer and web access, please see <https://power.larc.nasa.gov/>.
Maintained by Adam H. Sparks. Last updated 10 days ago.
nasameteorological-dataweatherglobalweather-datameteorologynasa-poweragroclimatologyearth-sciencedata-accessclimate-dataagroclimatology-dataweather-variables
2.3 match 101 stars 9.98 score 137 scripts 3 dependents
tidyverse
duckplyr:A 'DuckDB'-Backed Version of 'dplyr'
A drop-in replacement for 'dplyr', powered by 'DuckDB' for performance. Offers convenient utilities for working with in-memory and larger-than-memory data while retaining full 'dplyr' compatibility.
Maintained by Kirill Müller. Last updated 4 days ago.
analyticsdataframedplyrduckdbperformance
2.0 match 309 stars 11.33 score 220 scripts
bioc
ensembldb:Utilities to create and use Ensembl-based annotation databases
The package provides functions to create and use transcript-centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries like genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.
Maintained by Johannes Rainer. Last updated 5 months ago.
geneticsannotationdatasequencingcoverageannotationbioconductorbioconductor-packagesensembl
1.6 match 35 stars 14.08 score 892 scripts 108 dependents
jonathan-g
kayadata:Kaya Identity Data for Nations and Regions
Provides data for Kaya identity variables (population, gross domestic product, primary energy consumption, and energy-related CO2 emissions) for the world and for individual nations, and utility functions for looking up data, plotting trends of Kaya variables, and plotting the fuel mix for a given country or region. The Kaya identity (Yoichi Kaya and Keiichi Yokobori, "Environment, Energy, and Economy: Strategies for Sustainability" (United Nations University Press, 1998) and <https://en.wikipedia.org/wiki/Kaya_identity>) expresses a nation's or region's greenhouse gas emissions in terms of its population, per-capita Gross Domestic Product, the energy intensity of its economy, and the carbon-intensity of its energy supply.
Maintained by Jonathan Gilligan. Last updated 8 months ago.
4.5 match 4.98 score 32 scripts
haghish
shapley:Weighted Mean SHAP and CI for Robust Feature Selection in ML Grid
This R package introduces Weighted Mean SHapley Additive exPlanations (WMSHAP), an innovative method for calculating SHAP values for a grid of fine-tuned base-learner machine learning models as well as stacked ensembles, a method not previously available due to the common reliance on single best-performing models. By integrating the weighted mean SHAP values from individual base-learners comprising the ensemble or individual base-learners in a tuning grid search, the package weights SHAP contributions according to each model's performance, assessed by R squared (for both regression and classification models). Alternatively, this software also offers weighting SHAP values based on the area under the precision-recall curve (AUCPR), the area under the curve (AUC), and F2 measures for binary classifiers. It further extends this framework to implement weighted confidence intervals for weighted mean SHAP values, offering a more comprehensive and robust feature importance evaluation over a grid of machine learning models, instead of solely computing SHAP values for the best model. This methodology is particularly beneficial for addressing the severe class imbalance (class rarity) problem by providing a transparent, generalized measure of feature importance that mitigates the risk of reporting SHAP values for an overfitted or biased model and maintains robustness under severe class imbalance, where there is no universal criterion for identifying the absolute best model. Furthermore, the package implements hypothesis testing to ascertain the statistical significance of SHAP values for individual features, as well as comparative significance testing of SHAP contributions between features.
Additionally, it tackles a critical gap in feature selection literature by presenting criteria for the automatic feature selection of the most important features across a grid of models or stacked ensembles, eliminating the need for arbitrary determination of the number of top features to be extracted. This utility is invaluable for researchers analyzing feature significance, particularly within severely imbalanced outcomes where conventional methods fall short. Moreover, it is also expected to report democratic feature importance across a grid of models, resulting in a more comprehensive and generalizable feature selection. The package further implements a novel method for visualizing SHAP values both at subject level and feature level as well as a plot for feature selection based on the weighted mean SHAP ratios.
Maintained by E. F. Haghish. Last updated 2 days ago.
class-imbalanceclass-imbalance-problemfeature-extractionfeature-importancefeature-selectionmachine-learningmachine-learning-algorithmsshapshap-analysisshap-valuesshapelyshapley-additive-explanationsshapley-decompositionshapley-valueshapley-valuesshapleyvalueweighted-shapweighted-shap-confidence-intervalweighted-shapleyweighted-shapley-ci
4.3 match 14 stars 5.19 score 17 scripts
flyaflya
causact:Fast, Easy, and Visual Bayesian Inference
Accelerate Bayesian analytics workflows in 'R' through interactive modelling, visualization, and inference. Define probabilistic graphical models using directed acyclic graphs (DAGs) as a unifying language for business stakeholders, statisticians, and programmers. This package relies on interfacing with the 'numpyro' Python package.
Maintained by Adam Fleischhacker. Last updated 2 months ago.
bayesian-inferencedagsposterior-probabilityprobabilistic-graphical-modelsprobabilistic-programming
3.1 match 45 stars 7.15 score 52 scripts
bioc
phyloseq:Handling and analysis of high-throughput microbiome census data
phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.
Maintained by Paul J. McMurdie. Last updated 5 months ago.
immunooncologysequencingmicrobiomemetagenomicsclusteringclassificationmultiplecomparisongeneticvariability
1.6 match 597 stars 13.90 score 8.4k scripts 37 dependents
sachsmc
eventglm:Regression Models for Event History Outcomes
A user-friendly, easy-to-understand way of doing event history regression for marginal estimands of interest, including the cumulative incidence and the restricted mean survival, using the pseudo observation framework for estimation. For a review of the methodology, see Andersen and Pohar Perme (2010) <doi:10.1177/0962280209105020> or Sachs and Gabriel (2022) <doi:10.18637/jss.v102.i09>. The interface uses the well-known formulation of a generalized linear model and allows for features including plotting of residuals, the use of sampling weights, and corrected variance estimation.
Maintained by Michael C Sachs. Last updated 15 days ago.
3.4 match 5 stars 6.33 score 24 scripts 1 dependents
neuropsychology
psycho:Efficient and Publishing-Oriented Workflow for Psychological Science
The main goal of the psycho package is to provide tools for psychologists, neuropsychologists and neuroscientists that facilitate data analysis and reduce the time spent on it. It aims at supporting best practices and provides tools to format the output of statistical methods so that it can be pasted directly into a manuscript, ensuring statistical reporting standardization and conformity.
Maintained by Dominique Makowski. Last updated 4 years ago.
apaapa6bayesiancorrelationformatinterpretationmixed-modelsneurosciencepsychopsychologyrstanarmstatistics
2.0 match 149 stars 10.86 score 628 scripts 5 dependents
larmarange
broom.helpers:Helpers for Model Coefficients Tibbles
Provides a suite of functions to work with regression model 'broom::tidy()' tibbles. The suite includes functions to group regression model terms by variable, insert reference and header rows for categorical variables, add variable labels, and more.
Maintained by Joseph Larmarange. Last updated 9 days ago.
1.9 match 22 stars 11.45 score 165 scripts 2 dependents
ndphillips
yarrr:A Companion to the e-Book "YaRrr!: The Pirate's Guide to R"
Contains a mixture of functions and data sets referred to in the introductory e-book "YaRrr!: The Pirate's Guide to R". The latest version of the e-book is available for free at <https://www.thepiratesguidetor.com>.
Maintained by Nathaniel Phillips. Last updated 11 months ago.
2.0 match 78 stars 10.67 score 1.2k scripts 2 dependents
asl
Rssa:A Collection of Methods for Singular Spectrum Analysis
Methods and tools for Singular Spectrum Analysis including decomposition, forecasting and gap-filling for univariate and multivariate time series. A general description of the methods with many examples can be found in the book Golyandina (2018, <doi:10.1007/978-3-662-57380-8>). See 'citation("Rssa")' for details.
Maintained by Anton Korobeynikov. Last updated 6 months ago.
3.0 match 58 stars 7.10 score 182 scripts 4 dependents
qsbase
qs:Quick Serialization of R Objects
Provides functions for quickly writing and reading any R object to and from disk.
Maintained by Travers Ching. Last updated 9 days ago.
compressiondata-storageencodingserializationlibzstdlz4cpp
1.5 match 414 stars 13.91 score 2.5k scripts 51 dependents
bioc
MsCoreUtils:Core Utils for Mass Spectrometry Data
MsCoreUtils defines low-level functions for mass spectrometry data and is independent of any high-level data structures. These functions include mass spectra processing functions (noise estimation, smoothing, binning, baseline estimation), quantitative aggregation functions (median polish, robust summarisation, ...), missing data imputation, data normalisation (quantiles, vsn, ...), and miscellaneous helper functions that are used across high-level data structures within the R for Mass Spectrometry packages.
Maintained by RforMassSpectrometry Package Maintainer. Last updated 4 days ago.
infrastructureproteomicsmassspectrometrymetabolomicsbioconductormass-spectrometryutils
2.0 match 16 stars 10.52 score 41 scripts 71 dependents
r-forge
distr:Object Oriented Implementation of Distributions
S4-classes and methods for distributions.
Maintained by Peter Ruckdeschel. Last updated 2 months ago.
2.4 match 8.84 score 327 scripts 32 dependents
uclouvain-cbio
scpdata:Single-Cell Proteomics Data Package
The package disseminates mass spectrometry (MS)-based single-cell proteomics (SCP) datasets. The data were collected from published work and formatted using the `scp` data structure. The data sets contain quantitative information at spectrum, peptide and/or protein level for single cells or minute sample amounts.
Maintained by Christophe Vanderaa. Last updated 9 days ago.
experimentdataexpressiondataexperimenthubreproducibleresearchmassspectrometrydataproteomesinglecelldatapackagetypedata
3.8 match 6 stars 5.58 score 16 scripts
eurostat
hicp:Harmonised Index of Consumer Prices
The Harmonised Index of Consumer Prices (HICP) is the key economic figure to measure inflation in the euro area. The methodology underlying the HICP is documented in the HICP Methodological Manual (<https://ec.europa.eu/eurostat/web/products-manuals-and-guidelines/w/ks-gq-24-003>). Based on the manual, this package provides functions to access and work with HICP data from Eurostat's public database (<https://ec.europa.eu/eurostat/data/database>).
Maintained by Sebastian Weinand. Last updated 8 months ago.
consumer-price-indexinflationpricesstatistics
4.5 match 2 stars 4.60 score 6 scripts
slwu89
MicroMoB:Discrete Time Simulation of Mosquito-Borne Pathogen Transmission
Provides a framework based on S3 dispatch for constructing models of mosquito-borne pathogen transmission which are constructed from submodels of various components (i.e. immature and adult mosquitoes, human populations). A consistent mathematical expression for the distribution of bites on hosts means that different models (stochastic, deterministic, etc.) can be coherently incorporated and updated over a discrete time step.
Maintained by Sean L. Wu. Last updated 2 years ago.
5.0 match 4.16 score 32 scripts
braverock
PortfolioAnalytics:Portfolio Analysis, Including Numerical Methods for Optimization of Portfolios
Portfolio optimization and analysis routines and graphics.
Maintained by Brian G. Peterson. Last updated 3 months ago.
1.8 match 81 stars 11.49 score 626 scripts 2 dependents
bioc
EBImage:Image processing and analysis toolbox for R
EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.
Maintained by Andrzej Oleś. Last updated 5 months ago.
visualizationbioinformaticsimage-analysisimage-processingcpp
1.6 match 71 stars 12.89 score 1.5k scripts 33 dependents
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 28 days ago.
digital-soil-mappingncss-technrcspedologypedometricssoilsoil-surveyusda
1.8 match 55 stars 11.77 score 1.2k scripts 2 dependents
dots26
MaOEA:Many Objective Evolutionary Algorithm
A set of evolutionary algorithms to solve many-objective optimization problems. Hybridization between the algorithms is also facilitated. Available algorithms are: 'SMS-EMOA' <doi:10.1016/j.ejor.2006.08.008>, 'NSGA-III' <doi:10.1109/TEVC.2013.2281535>, and 'MO-CMA-ES' <doi:10.1145/1830483.1830573>. The following many-objective benchmark problems are also provided: 'DTLZ1'-'DTLZ4' from Deb, et al. (2001) <doi:10.1007/1-84628-137-7_6> and 'WFG4'-'WFG9' from Huband, et al. (2005) <doi:10.1109/TEVC.2005.861417>.
Maintained by Dani Irawan. Last updated 2 years ago.
5.4 match 6 stars 3.78 score 2 scripts
hofnerb
papeR:A Toolbox for Writing Pretty Papers and Reports
A toolbox for writing 'knitr', 'Sweave' or other 'LaTeX'- or 'markdown'-based reports and to prettify the output of various estimated models.
Maintained by Benjamin Hofner. Last updated 4 years ago.
knitrlatexr-languagereportingreproduciblereproducible-researchsweave
2.8 match 30 stars 7.30 score 223 scripts 1 dependents
kviswana
ezEDA:Task Oriented Interface for Exploratory Data Analysis
Enables users to create visualizations using functions based on the data analysis task rather than on plotting mechanics. It hides the details of the individual 'ggplot2' function calls and allows the user to focus on the end goal. Useful for quick preliminary explorations. Provides functions for common exploration patterns. Some of the ideas in this package are motivated by Fox (2015, ISBN:1938377052).
Maintained by Viswa Viswanathan. Last updated 4 years ago.
5.5 match 3.70 score 4 scripts
janmarvin
openxlsx2:Read, Write and Edit 'xlsx' Files
Simplifies the creation of 'xlsx' files by providing a high level interface to writing, styling and editing worksheets.
Maintained by Jan Marvin Garbuszus. Last updated 16 hours ago.
1.5 match 138 stars 13.67 score 194 scripts 11 dependents
r4ss
r4ss:R Code for Stock Synthesis
A collection of R functions for use with Stock Synthesis, a fisheries stock assessment modeling platform written in ADMB by Dr. Richard D. Methot at the NOAA Northwest Fisheries Science Center. The functions include tools for summarizing and plotting results, manipulating files, visualizing model parameterizations, and various other common stock assessment tasks. This version of '{r4ss}' is compatible with Stock Synthesis versions 3.24 through 3.30 (specifically version 3.30.23.1, from December 2024). Support for 3.24 models is only through the core functions for reading output and plotting.
Maintained by Ian G. Taylor. Last updated 4 days ago.
fisheriesfisheries-stock-assessmentstock-synthesis
1.8 match 43 stars 11.38 score 1.0k scripts 2 dependents
epiforecasts
scoringutils:Utilities for Scoring and Assessing Predictions
Facilitates the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.
Maintained by Nikos Bosse. Last updated 12 days ago.
forecast-evaluationforecasting
1.8 match 52 stars 11.37 score 326 scripts 7 dependents
bioc
COCOA:Coordinate Covariation Analysis
COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.
Maintained by John Lawson. Last updated 5 months ago.
epigeneticsdnamethylationatacseqdnaseseqmethylseqmethylationarrayprincipalcomponentgenomicvariationgeneregulationgenomeannotationsystemsbiologyfunctionalgenomicschipseqsequencingimmunooncologydna-methylationpca
2.9 match 10 stars 7.02 score 21 scripts
stemangiola
tidyHeatmap:A Tidy Implementation of Heatmap
This is a tidy implementation for heatmap. At the moment it is based on the (great) package 'ComplexHeatmap'. The goal of this package is to interface a tidy data frame with this powerful tool. Some of the advantages are: Row and/or column colour annotations are easy to integrate by specifying just one parameter (column names). Custom grouping of rows is easy to specify by providing a grouped tbl. For example: df %>% group_by(...). Label size is adjusted by the total number of rows and columns. Default use of Brewer and Viridis palettes.
Maintained by Stefano Mangiola. Last updated 1 month ago.
assaydomaininfrastructurebrewercomplexheatmapcustom-palettedplyrgraphvizheatmapmtcarsplottingrstudioscaletibbletidytidy-data-frametidybulktidyverseviridis
2.0 match 335 stars 10.23 score 197 scripts 1 dependents
gianmarcoalberti
CAinterprTools:Graphical Aid in Correspondence Analysis Interpretation and Significance Testings
Plots a range of information to aid the interpretation of Correspondence Analysis results. It provides facilities to plot the contribution of row and column categories to the principal dimensions, the quality of point display on selected dimensions, the correlation of row and column categories with selected dimensions, etc. It also helps assess which dimensions are important for interpreting the data structure by means of different statistics and tests. The package can also plot the permuted distribution of the table's total inertia as well as of the inertia accounted for by pairs of selected dimensions. Further facilities are provided to produce interpretation-oriented scatterplots. Reference: Alberti 2015 <doi:10.1016/j.softx.2015.07.001>.
Maintained by Gianmarco Alberti. Last updated 5 years ago.
8.1 match 2.52 score 33 scripts
clbustos
dominanceanalysis:Dominance Analysis
Dominance analysis is a method for comparing the relative importance of predictors in multiple regression models: ordinary least squares, generalized linear models, hierarchical linear models, beta regression and dynamic linear models. The main principles and methods of dominance analysis are described in Budescu, D. V. (1993) <doi:10.1037/0033-2909.114.3.542> and Azen, R., & Budescu, D. V. (2003) <doi:10.1037/1082-989X.8.2.129> for ordinary least squares regression. Subsequently, the extensions for multivariate regression, logistic regression and hierarchical linear models were described in Azen, R., & Budescu, D. V. (2006) <doi:10.3102/10769986031002157>, Azen, R., & Traxel, N. (2009) <doi:10.3102/1076998609332754> and Luo, W., & Azen, R. (2013) <doi:10.3102/1076998612458319>, respectively.
Maintained by Claudio Bustos Navarrete. Last updated 1 year ago.
3.5 match 25 stars 5.75 score 45 scripts
welch-lab
rliger:Linked Inference of Genomic Experimental Relationships
Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.
Maintained by Yichen Wang. Last updated 2 months ago.
nonnegative-matrix-factorizationsingle-cellopenblascpp
1.9 match 402 stars 10.80 score 334 scripts 1 dependents
bioc
mistyR:Multiview Intercellular SpaTial modeling framework
mistyR is an implementation of the Multiview Intercellular SpaTial modeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation. The views can also capture cell-type specific relationships, capture relations between functional footprints or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.
Maintained by Jovan Tanevski. Last updated 5 months ago.
softwarebiomedicalinformaticscellbiologysystemsbiologyregressiondecisiontreesinglecellspatialbioconductorbiologyintercellularmachine-learningmodularmolecular-biologymultiviewspatial-transcriptomics
2.6 match 51 stars 7.87 score 160 scripts
dipterix
dipsaus:A Dipping Sauce for Data Analysis and Visualizations
Works as an "add-on" to packages like 'shiny', 'future', as well as 'rlang', and provides utility functions. Just like dipping sauce adding flavors to potato chips or pita bread, 'dipsaus' for data analysis and visualizations adds handy functions and enhancements to popular packages. The goal is to provide simple solutions that are frequently asked for online, such as how to synchronize 'shiny' inputs without freezing the app, or how to get memory size on a 'Linux' or 'MacOS' system. The enhancements roughly fall into these four categories: 1. 'shiny' input widgets; 2. high-performance computing using the 'future' package; 3. modifying R calls and converting among numbers, strings, and other objects; 4. utility functions to get system information such as CPU chip-set, memory limit, etc.
Maintained by Zhengjia Wang. Last updated 4 days ago.
2.5 match 13 stars 7.90 score 85 scripts 3 dependents
bioc
MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Maintained by Laurent Gatto. Last updated 1 day ago.
immunooncologyinfrastructureproteomicsmassspectrometryqualitycontroldataimportbioconductorbioinformaticsmass-spectrometryproteomics-datavisualisationcpp
1.5 match 130 stars 12.81 score 772 scripts 36 dependents
boost-r
FDboost:Boosting Functional Regression Models
Regression models for functional data, i.e., scalar-on-function, function-on-scalar and function-on-function regression models, are fitted by a component-wise gradient boosting algorithm. For a manual on how to use 'FDboost', see Brockhaus, Ruegamer, Greven (2017) <doi:10.18637/jss.v094.i10>.
Maintained by David Ruegamer. Last updated 3 months ago.
boostingboosting-algorithmsfunction-on-function-regressionfunction-on-scalar-regressionmachine-learningscalar-on-function-regressionvariable-selection
2.5 match 17 stars 8.00 score 98 scripts
hturner
gnm:Generalized Nonlinear Models
Functions to specify and fit generalized nonlinear models, including models with multiplicative interaction terms such as the UNIDIFF model from sociology and the AMMI model from crop science, and many others. Over-parameterized representations of models are used throughout; functions are provided for inference on estimable parameter combinations, as well as standard methods for diagnostics etc.
Maintained by Heather Turner. Last updated 1 year ago.
generalized-linear-modelsgeneralized-nonlinear-modelsstatistical-modelsopenblas
1.9 match 16 stars 10.51 score 290 scripts 21 dependents
nimble-dev
nimble:MCMC, Particle Filtering, and Programmable Hierarchical Modeling
A system for writing hierarchical statistical models largely compatible with 'BUGS' and 'JAGS', writing nimbleFunctions to operate models and do basic R-style math, and compiling both models and nimbleFunctions via custom-generated C++. 'NIMBLE' includes default methods for MCMC, Laplace Approximation, Monte Carlo Expectation Maximization, and some other tools. The nimbleFunction system makes it easy to do things like implement new MCMC samplers from R, customize the assignment of samplers to different parts of a model from R, and compile the new samplers automatically via C++ alongside the samplers 'NIMBLE' provides. 'NIMBLE' extends the 'BUGS'/'JAGS' language by making it extensible: New distributions and functions can be added, including as calls to external compiled code. Although most people think of MCMC as the main goal of the 'BUGS'/'JAGS' language for writing models, one can use 'NIMBLE' for writing arbitrary other kinds of model-generic algorithms as well. A full User Manual is available at <https://r-nimble.org>.
Maintained by Christopher Paciorek. Last updated 4 days ago.
bayesian-inferencebayesian-methodshierarchical-modelsmcmcprobabilistic-programmingopenblascpp
1.5 match 169 stars 12.97 score 2.6k scripts 19 dependents
bayesiandemography
bage:Bayesian Estimation and Forecasting of Age-Specific Rates
Fast Bayesian estimation and forecasting of age-specific rates, probabilities, and means, based on 'Template Model Builder'.
Maintained by John Bryant. Last updated 2 months ago.
2.7 match 3 stars 7.30 score 39 scripts
tvganesh
cricketr:Analyze Cricketers and Cricket Teams Based on ESPN Cricinfo Statsguru
Tools for analyzing performances of cricketers based on stats in ESPN Cricinfo Statsguru. The toolset can be used for analysis of Tests, ODIs and Twenty20 matches of both batsmen and bowlers. The package can also be used to analyze team performances.
Maintained by Tinniam V Ganesh. Last updated 4 years ago.
3.5 match 62 stars 5.55 score 115 scripts
bioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
2.0 match 183 stars 9.70 score 126 scripts 1 dependents
jranke
mkin:Kinetic Evaluation of Chemical Degradation Data
Calculation routines based on the FOCUS Kinetics Report (2006, 2014). Includes a function for conveniently defining differential equation models, with model solution based on eigenvalues if possible or using numerical solvers. If a C compiler (on Windows: 'Rtools') is installed, differential equation models are solved using automatically generated C functions. Non-constant errors can be taken into account using variance by variable or two-component error models <doi:10.3390/environments6120124>. Hierarchical degradation models can be fitted using nonlinear mixed-effects model packages as a back end <doi:10.3390/environments8080071>. Please note that no warranty is implied for correctness of results or fitness for a particular purpose.
Maintained by Johannes Ranke. Last updated 30 days ago.
degradationfocus-kineticskinetic-modelskineticsodeode-model
2.4 match 11 stars 8.06 score 78 scripts 1 dependents
bioc
cytomapper:Visualization of highly multiplexed imaging data in R
Highly multiplexed imaging acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualised across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualised on segmented cell areas. This package contains functions for the visualisation of multiplexed read-outs and cell-level information obtained by multiplexed imaging technologies. The main functions of this package allow 1. the visualisation of pixel-level information across multiple channels, 2. the display of cell-level information (expression and/or metadata) on segmentation masks and 3. gating and visualisation of single cells.
Maintained by Lasse Meyer. Last updated 5 months ago.
immunooncologysoftwaresinglecellonechanneltwochannelmultiplecomparisonnormalizationdataimportbioimagingimaging-mass-cytometrysingle-cellspatial-analysis
2.0 match 32 stars 9.61 score 354 scripts 5 dependents
ironholds
WikipediR:A MediaWiki API Wrapper
A wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia. It can be used to retrieve page text, information about users or the history of pages, and elements of the category tree.
Maintained by Os Keyes. Last updated 11 months ago.
api-clientapi-wrappermediawiki
2.0 match 70 stars 9.56 score 81 scripts 32 dependents
inventionate
TimeSpaceAnalysis:Statistical tools for time-space analysis
Use Geometric Data Analysis approaches (e.g. MCA or MFA), time pattern analysis (see "time sequence clustering") and places chronologies (see "time geography") analysis.
Maintained by Fabian Mundt. Last updated 6 days ago.
7.7 match 2.48 score 2 scripts
bioc
MetaboCoreUtils:Core Utils for Metabolomics Data
MetaboCoreUtils defines metabolomics-related core functionality provided as low-level functions to allow a data structure-independent usage across various R packages. This includes functions to convert between ion (adduct) and compound mass-to-charge ratios and masses, and functions to work with chemical formulas. The package also provides a set of adduct definitions and information on some commercially available internal standard mixes commonly used in MS experiments.
Maintained by Johannes Rainer. Last updated 5 months ago.
infrastructuremetabolomicsmassspectrometrymass-spectrometry
2.0 match 9 stars 9.40 score 58 scripts 36 dependents
jrowen
rhandsontable:Interface to the 'Handsontable.js' Library
An R interface to the 'Handsontable' JavaScript library, which is a minimalist Excel-like data grid editor. See <https://handsontable.com/> for details.
Maintained by Jonathan Owen. Last updated 3 years ago.
handsontablehtmlwidgetsjavascriptshinysparkline
1.5 match 389 stars 12.31 score 1.0k scripts 46 dependents
bpfaff
FRAPO:Financial Risk Modelling and Portfolio Optimisation with R
Accompanying package of the book 'Financial Risk Modelling and Portfolio Optimisation with R', second edition. The data sets used in the book are contained in this package.
Maintained by Bernhard Pfaff. Last updated 8 years ago.
3.9 match 11 stars 4.71 score 94 scripts
bioc
MOFA2:Multi-Omics Factor Analysis v2
The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, visualisation, imputation, etc. are available.
Maintained by Ricard Argelaguet. Last updated 5 months ago.
dimensionreductionbayesianvisualizationfactor-analysismofamulti-omics
1.8 match 319 stars 10.02 score 502 scripts
biometry
bipartite:Visualising Bipartite Networks and Calculating Some (Ecological) Indices
Functions to visualise webs and calculate a series of indices commonly used to describe patterns in (ecological) webs. It focuses on webs consisting of only two levels (bipartite), e.g. pollination webs or predator-prey webs. Visualisation is important to get an idea of what we are actually looking at, while the indices summarise different aspects of the web's topology.
Maintained by Carsten F. Dormann. Last updated 6 days ago.
1.7 match 37 stars 10.93 score 592 scripts 15 dependents
thongphamthe
PAFit:Generative Mechanism Estimation in Temporal Complex Networks
Statistical methods for estimating preferential attachment and node fitness generative mechanisms in temporal complex networks are provided. Thong Pham et al. (2015) <doi:10.1371/journal.pone.0137796>. Thong Pham et al. (2016) <doi:10.1038/srep32558>. Thong Pham et al. (2020) <doi:10.18637/jss.v092.i03>. Thong Pham et al. (2021) <doi:10.1093/comnet/cnab024>.
Maintained by Thong Pham. Last updated 12 months ago.
complex-networksfit-get-richergeneral-preferential-attachmentminorize-maximizationpreferential-attachmentrich-get-richerscale-freetemporal-networkscppopenmp
2.8 match 17 stars 6.47 score 70 scripts
bioc
podkat:Position-Dependent Kernel Association Test
This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.
Maintained by Ulrich Bodenhofer. Last updated 5 months ago.
geneticswholegenomeannotationvariantannotationsequencingdataimportcurlbzip2xz-utilszlibcpp
3.5 match 5.02 score 6 scripts
shixiangwang
sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations
Genomic alterations including single nucleotide substitution, copy number alteration, etc. are the major force for cancer initialization and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called 'signature' (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.
Maintained by Shixiang Wang. Last updated 5 months ago.
bayesian-nmf bioinformatics cancer-research cnv copynumber-signatures cosmic-signatures dbs easy-to-use indel mutational-signatures nmf nmf-extraction sbs signature-extraction somatic-mutations somatic-variants visualization cpp
1.9 match 150 stars 9.48 score 123 scripts 2 dependents
bioc
assorthead:Assorted Header-Only C++ Libraries
Vendors an assortment of useful header-only C++ libraries. Bioconductor packages can use these libraries in their own C++ code by LinkingTo this package without introducing any additional dependencies. The use of a central repository avoids duplicate vendoring of libraries across multiple R packages, and enables better coordination of version updates across cohorts of interdependent C++ libraries.
Maintained by Aaron Lun. Last updated 12 days ago.
singlecell qualitycontrol normalization datarepresentation dataimport differentialexpression alignment
2.0 match 8.89 score 167 dependents
statmanrobin
Stat2Data:Datasets for Stat2
Datasets for the textbook Stat2: Modeling with Regression and ANOVA (second edition). The package also includes data for the first edition, Stat2: Building Models for a World of Data and a few functions for plotting diagnostics.
Maintained by Robin Lock. Last updated 6 years ago.
3.6 match 5 stars 4.94 score 544 scripts
laresbernardo
lares:Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of function families, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scraper, it helps the analyst or data scientist to get quick and robust results, without the need for repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 23 days ago.
analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization
1.8 match 233 stars 9.84 score 185 scripts 1 dependent
rudjer
SparseM:Sparse Linear Algebra
Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.
Maintained by Roger Koenker. Last updated 8 months ago.
1.5 match 3 stars 11.47 score 306 scripts 1.5k dependents
bioc
BiocBaseUtils:General utility functions for developing Bioconductor packages
The package provides utility functions related to package development. These include functions that replace slots, and selectors for show methods. It aims to coalesce the various helper functions often re-used throughout the Bioconductor ecosystem.
Maintained by Marcel Ramos. Last updated 5 months ago.
software infrastructure bioconductor-package core-package
2.0 match 4 stars 8.78 score 3 scripts 158 dependents
agalecki
nlmeU:Datasets and Utility Functions Enhancing Functionality of 'nlme' Package
Datasets and utility functions enhancing the functionality of the 'nlme' package. Datasets, functions and scripts are described in the book 'Linear Mixed-Effects Models: A Step-by-Step Approach' by Galecki and Burzykowski (2013). The package is under development.
Maintained by Andrzej Galecki. Last updated 3 years ago.
3.4 match 5.08 score 135 scripts 6 dependents
adeverse
adegraphics:An S4 Lattice-Based Package for the Representation of Multivariate Data
Graphical functionalities for the representation of multivariate data. It is a complete re-implementation of the functions available in the 'ade4' package.
Maintained by Aurélie Siberchicot. Last updated 8 months ago.
1.7 match 9 stars 10.37 score 386 scripts 6 dependents
aphalo
photobiologyFilters:Spectral Transmittance and Spectral Reflectance Data
Spectral 'transmittance' data for frequently used filters and similar materials. Plastic sheets and films; photography filters; theatrical gels; machine-vision filters; various types of window glass; optical glass and some laboratory plastics and glassware. Spectral reflectance data for frequently encountered materials. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.
Maintained by Pedro J. Aphalo. Last updated 5 days ago.
3.4 match 5.08 score 40 scripts
inlabru-org
fmesher:Triangle Meshes and Related Geometry Tools
Generate planar and spherical triangle meshes, compute finite element calculations for 1- and 2-dimensional flat and curved manifolds with associated basis function spaces, methods for lines and polygons, and transparent handling of coordinate reference systems and coordinate transformation, including 'sf' and 'sp' geometries. The core 'fmesher' library code was originally part of the 'INLA' package, and implements parts of "Triangulations and Applications" by Hjelle and Daehlen (2006) <doi:10.1007/3-540-33261-8>.
Maintained by Finn Lindgren. Last updated 2 days ago.
1.5 match 16 stars 11.18 score 261 scripts 26 dependents
bioc
affy:Methods for Affymetrix Oligonucleotide Arrays
The package contains functions for exploratory oligonucleotide array analysis. The dependence on tkWidgets only concerns a few convenience functions. 'affy' is fully functional without it.
Maintained by Robert D. Shear. Last updated 2 months ago.
microarray onechannel preprocessing
1.5 match 11.12 score 2.5k scripts 98 dependents
elies-ramon
kerntools:Kernel Functions and Tools for Machine Learning Applications
Kernel functions for diverse types of data (including, but not restricted to: nonnegative and real vectors, real matrices, categorical and ordinal variables, sets, strings), plus other utilities like kernel similarity, kernel Principal Components Analysis (PCA) and features' importance for Support Vector Machines (SVMs), which expand other 'R' packages like 'kernlab'.
Maintained by Elies Ramon. Last updated 25 days ago.
3.5 match 1 star 4.73 score 12 scripts
bioc
BASiCS:Bayesian Analysis of Single-Cell Sequencing data
Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.
Maintained by Catalina Vallejos. Last updated 5 months ago.
immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell differentialexpression bayesian cellbiology bioconductor-package gene-expression rcpp rcpparmadillo scrna-seq single-cell openblas cpp openmp
1.6 match 83 stars 10.26 score 368 scripts 1 dependent
tvganesh
yorkr:Analyze Cricket Performances Based on Data from Cricsheet
Analyzing performances of cricketers and cricket teams based on 'yaml' match data from Cricsheet <https://cricsheet.org/>.
Maintained by Tinniam V Ganesh. Last updated 2 years ago.
3.3 match 17 stars 5.00 score 118 scripts
wadpac
GGIR:Raw Accelerometer Data Analysis
A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data is stored in csv format. The package also allows for external function embedding.
Maintained by Vincent T van Hees. Last updated 1 day ago.
accelerometer activity-recognition circadian-rhythm movement-sensor sleep
1.3 match 109 stars 13.20 score 342 scripts 3 dependents
damianobaldan
riverconn:Fragmentation and Connectivity Indices for Riverscapes
Indices for assessing riverscape fragmentation, including the Dendritic Connectivity Index, the Population Connectivity Index, the River Fragmentation Index, the Probability of Connectivity, and the Integral Index of Connectivity. For a review, see Jumani et al. (2020) <doi:10.1088/1748-9326/abcb37> and Baldan et al. (2022) <doi:10.1016/j.envsoft.2022.105470>. Functions to calculate the temporal improvement of indices when fragmentation due to barriers is reduced are also included.
Maintained by Damiano Baldan. Last updated 12 months ago.
3.4 match 9 stars 4.77 score 13 scripts
rekyt
fdcoexist:Multi-Species Trait-Based Coexistence Model in Discrete time
A modified Beverton-Holt model used in the Denelle, Grenié et al. manuscript that expresses environmental filtering, limiting similarity and hierarchical competition explicitly as a function of species traits. This package provides all the code necessary to rerun the analyses of the manuscript.
Maintained by Matthias Grenié. Last updated 2 years ago.
6.0 match 2.70 score 1 script
luisagi
enmpa:Ecological Niche Modeling using Presence-Absence Data
A set of tools to perform Ecological Niche Modeling with presence-absence data. It includes algorithms for data partitioning, model fitting, calibration, evaluation, selection, and prediction. Other functions help to explore signals of ecological niche using univariate and multivariate analyses, and model features such as variable response curves and variable importance. Unique characteristics of this package are the ability to exclude models with concave quadratic responses, and the option to clamp model predictions to specific variables. These tools are implemented following principles proposed in Cobos et al., (2022) <doi:10.17161/bi.v17i.15985>, Cobos et al., (2019) <doi:10.7717/peerj.6281>, and Peterson et al., (2008) <doi:10.1016/j.ecolmodel.2007.11.008>.
Maintained by Luis F. Arias-Giraldo. Last updated 3 months ago.
3.8 match 5 stars 4.35 score 5 scripts
bioc
decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting
Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.
Maintained by Rosario M. Piro. Last updated 5 months ago.
software snp sequencing dnaseq genomicvariation somaticmutation biomedicalinformatics genetics biologicalquestion statisticalmethod
3.4 match 1 star 4.78 score 10 scripts 1 dependent
hypertidy
PROJ:Generic Coordinate System Transformations Using 'PROJ'
A wrapper around the generic coordinate transformation software 'PROJ' that transforms coordinates from one coordinate reference system ('CRS') to another. This includes cartographic projections as well as geodetic transformations. The intention is for this package to be used by user-packages such as 'reproj', and that the older 'PROJ.4' and version 5 pathways be provided by the 'proj4' package.
Maintained by Michael D. Sumner. Last updated 9 months ago.
1.5 match 16 stars 10.53 score 82 scripts 27 dependents
robustport
facmodTS:Time Series Models for Asset Returns
Supports teaching methods of estimating and testing time series models for use in robust portfolio construction and analysis. Unique in providing not only classical least squares, but also modern robust model fitting methods which are not much influenced by outliers. Includes returns and risk decompositions, with user choice of standard deviation, value-at-risk, and expected shortfall risk measures. "Robust Statistics Theory and Methods (with R)", R. A. Maronna, R. D. Martin, V. J. Yohai, M. Salibian-Barrera (2019) <doi:10.1002/9781119214656>.
Maintained by Doug Martin. Last updated 8 days ago.
5.3 match 1 star 3.00 score
leelabsg
SKAT:SNP-Set (Sequence) Kernel Association Test
Functions for kernel-regression-based association tests including Burden test, SKAT and SKAT-O. These methods aggregate individual SNP score statistics in a SNP set and efficiently compute SNP-set level p-values.
Maintained by Seunggeun (Shawn) Lee. Last updated 1 month ago.
1.7 match 45 stars 9.70 score 268 scripts 16 dependents
nanne-aben
TANDEM:A Two-Stage Approach to Maximize Interpretability of Drug Response Models Based on Multiple Molecular Data Types
A two-stage regression method that can be used when various input data types are correlated, for example gene expression and methylation in drug response prediction. In the first stage it uses the upstream features (such as methylation) to predict the response variable (such as drug response), and in the second stage it uses the downstream features (such as gene expression) to predict the residuals of the first stage. In our manuscript (Aben et al., 2016, <doi:10.1093/bioinformatics/btw449>), we show that using TANDEM prevents the model from being dominated by gene expression and that the features selected by TANDEM are more interpretable.
Maintained by Nanne Aben. Last updated 5 years ago.
4.0 match 2 stars 4.00 score 9 scripts
berwinturlach
quadprog:Functions to Solve Quadratic Programming Problems
This package contains routines and documentation for solving quadratic programming problems.
Maintained by Berwin A. Turlach. Last updated 5 years ago.
1.6 match 3 stars 10.27 score 972 scripts 1.2k dependents
predictiveecology
SpaDES.core:Core Utilities for Developing and Running Spatially Explicit Discrete Event Models
Provides the core framework for a discrete event system to implement a complete data-to-decisions, reproducible workflow. The core components facilitate the development of modular pieces, and enable the user to include additional functionality by running user-built modules. Includes conditional scheduling, restart after interruption, packaging of reusable modules, tools for developing arbitrary automated workflows, automated interweaving of modules of different temporal resolution, and tools for visualizing and understanding the within-project dependencies. The suggested package 'NLMR' can be installed from the repository (<https://PredictiveEcology.r-universe.dev>).
Maintained by Eliot J B McIntire. Last updated 18 days ago.
discrete-events-simulations simulation-framework simulation-modeling
1.5 match 10 stars 10.61 score 142 scripts 6 dependents
epiverse-trace
simulist:Simulate Disease Outbreak Line List and Contacts Data
Tools to simulate realistic raw case data for an epidemic in the form of line lists and contacts using a branching process. Simulated outbreaks are parameterised with epidemiological parameters and can have age-structured populations, age-stratified hospitalisation and death risk and time-varying case fatality risk.
Maintained by Joshua W. Lambert. Last updated 2 days ago.
epidemiology epiverse linelist outbreaks
2.0 match 9 stars 7.86 score 27 scripts
xiangzhou09
rbw:Residual Balancing Weights for Marginal Structural Models
Residual balancing is a robust method of constructing weights for marginal structural models, which can be used to estimate (a) the average treatment effect in a cross-sectional observational study, (b) controlled direct/mediator effects in causal mediation analysis, and (c) the effects of time-varying treatments in panel data (Zhou and Wodtke 2020 <doi:10.1017/pan.2020.2>). This package provides three functions, rbwPoint(), rbwMed(), and rbwPanel(), that produce residual balancing weights for estimating (a), (b), (c), respectively.
Maintained by Xiang Zhou. Last updated 3 years ago.
3.4 match 9 stars 4.65 score 5 scripts
ikosmidis
brglm2:Bias Reduction in Generalized Linear Models
Estimation and inference from generalized linear models based on various methods for bias reduction and maximum penalized likelihood with powers of the Jeffreys prior as penalty. The 'brglmFit' fitting method can achieve reduction of estimation bias by solving either the mean bias-reducing adjusted score equations in Firth (1993) <doi:10.1093/biomet/80.1.27> and Kosmidis and Firth (2009) <doi:10.1093/biomet/asp055>, or the median bias-reduction adjusted score equations in Kenne et al. (2017) <doi:10.1093/biomet/asx046>, or through the direct subtraction of an estimate of the bias of the maximum likelihood estimator from the maximum likelihood estimates as in Cordeiro and McCullagh (1991) <https://www.jstor.org/stable/2345592>. See Kosmidis et al. (2020) <doi:10.1007/s11222-019-09860-6> for more details. Estimation in all cases takes place via a quasi Fisher scoring algorithm, and S3 methods for the construction of confidence intervals for the reduced-bias estimates are provided. In the special case of generalized linear models for binomial and multinomial responses (both ordinal and nominal), the adjusted score approaches to mean and median bias reduction have been found to return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation; see Kosmidis and Firth, 2020 <doi:10.1093/biomet/asaa052>, for a proof for mean bias reduction in logistic regression).
Maintained by Ioannis Kosmidis. Last updated 6 months ago.
adjusted-score-equations algorithms bias-reducing-adjustments bias-reduction estimation glm logistic-regression nominal-responses ordinal-responses regression regression-algorithms statistics
1.5 match 32 stars 10.41 score 106 scripts 10 dependents
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 6 days ago.
software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny
2.0 match 33 stars 7.77 score 10 scripts
jclavel
mvMORPH:Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data
Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous traits evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.
Maintained by Julien Clavel. Last updated 1 month ago.
1.6 match 17 stars 9.46 score 189 scripts 3 dependents
fabsig
gpboost:Combining Tree-Boosting with Gaussian Process and Mixed Effects Models
An R package that allows for combining tree-boosting with Gaussian process and mixed effects models. It also allows for independently doing tree-boosting as well as inference and prediction for Gaussian process and mixed effects models. See <https://github.com/fabsig/GPBoost> for more information on the software and Sigrist (2022, JMLR) <https://www.jmlr.org/papers/v23/20-322.html> and Sigrist (2023, TPAMI) <doi:10.1109/TPAMI.2022.3168152> for more information on the methodology.
Maintained by Fabio Sigrist. Last updated 25 days ago.
3.7 match 4.20 score 212 scripts
paulnorthrop
lax:Loglikelihood Adjustment for Extreme Value Models
Performs adjusted inferences based on model objects fitted, using maximum likelihood estimation, by the extreme value analysis packages 'eva' <https://cran.r-project.org/package=eva>, 'evd' <https://cran.r-project.org/package=evd>, 'evir' <https://cran.r-project.org/package=evir>, 'extRemes' <https://cran.r-project.org/package=extRemes>, 'fExtremes' <https://cran.r-project.org/package=fExtremes>, 'ismev' <https://cran.r-project.org/package=ismev>, 'mev' <https://cran.r-project.org/package=mev>, 'POT' <https://cran.r-project.org/package=POT> and 'texmex' <https://cran.r-project.org/package=texmex>. Adjusted standard errors and an adjusted loglikelihood are provided, using the 'chandwich' package <https://cran.r-project.org/package=chandwich> and the object-oriented features of the 'sandwich' package <https://cran.r-project.org/package=sandwich>. The adjustment is based on a robust sandwich estimator of the parameter covariance matrix, based on the methodology in Chandler and Bate (2007) <doi:10.1093/biomet/asm015>. This can be used for cluster correlated data when interest lies in the parameters of the marginal distributions, or for performing inferences that are robust to certain types of model misspecification. Univariate extreme value models, including regression models, are supported.
Maintained by Paul J. Northrop. Last updated 1 year ago.
clustered-data clusters composite-likelihood evd extreme-value-analysis extreme-value-statistics extremes independence-loglikelihood loglikelihood-adjustment mle pot regression regression-modelling robust sandwich sandwich-estimator
3.6 match 3 stars 4.29 score 13 scripts
bblonder
hypervolume:High Dimensional Geometry, Set Operations, Projection, and Inference Using Kernel Density Estimation, Support Vector Machines, and Convex Hulls
Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.
Maintained by Benjamin Blonder. Last updated 2 months ago.
1.6 match 23 stars 9.75 score 211 scripts 7 dependents
pharmaverse
sdtmchecks:Data Quality Checks for Study Data Tabulation Model (SDTM) Datasets
A series of checks to identify common issues in Study Data Tabulation Model (SDTM) datasets. These checks are intended to be generalizable, actionable, and meaningful for analysis.
Maintained by Will Harris. Last updated 3 months ago.
2.0 match 21 stars 7.66 score 15 scripts
r-gregmisc
gmodels:Various R Programming Tools for Model Fitting
Various R programming tools for model fitting.
Maintained by Gregory R. Warnes. Last updated 3 months ago.
1.5 match 1 star 10.01 score 3.5k scripts 30 dependents
bnaras
multiview:Cooperative Learning for Multi-View Analysis
Cooperative learning combines the usual squared error loss of predictions with an agreement penalty to encourage the predictions from different data views to agree. By varying the weight of the agreement penalty, we get a continuum of solutions that include the well-known early and late fusion approaches. Cooperative learning chooses the degree of agreement (or fusion) in an adaptive manner, using a validation set or cross-validation to estimate test set prediction error. In the setting of cooperative regularized linear regression, the method combines the lasso penalty with the agreement penalty (Ding, D., Li, S., Narasimhan, B., Tibshirani, R. (2021) <doi:10.1073/pnas.2202113119>).
Maintained by Balasubramanian Narasimhan. Last updated 2 years ago.
5.2 match 2.95 score 18 scripts
bioc
Rsubread:Mapping, quantification and variant analysis of sequencing data
Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES etc). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.
Maintained by Wei Shi. Last updated 1 day ago.
sequencing alignment sequencematching rnaseq chipseq singlecell geneexpression generegulation genetics immunooncology snp geneticvariability preprocessing qualitycontrol genomeannotation genefusiondetection indeldetection variantannotation variantdetection multiplesequencealignment zlib
1.7 match 9.24 score 892 scripts 10 dependents
bczernecki
climate:Interface to Download Meteorological (and Hydrological) Datasets
Automates downloading of meteorological and hydrological data from publicly available repositories: OGIMET (<http://ogimet.com/index.phtml.en>), University of Wyoming - atmospheric vertical profiling data (<http://weather.uwyo.edu/upperair/>), Polish Institute of Meteorology and Water Management - National Research Institute (<https://danepubliczne.imgw.pl>), and National Oceanic & Atmospheric Administration (NOAA). This package also allows for searching geographical coordinates for each observation and calculating distances to the nearest stations.
Maintained by Bartosz Czernecki. Last updated 9 days ago.
climate climate-data imgw meteorological-data meteorology noaa-data ogimet sounding
2.0 match 88 stars 7.61 score 38 scripts
covaruber
evola:Evolutionary Algorithm
Runs a genetic algorithm using the 'AlphaSimR' machinery <doi:10.1093/g3journal/jkaa017> and the coalescent simulator 'MaCS' <doi:10.1101/gr.083634.108>.
Maintained by Giovanny Covarrubias-Pazaran. Last updated 5 days ago.
3.6 match 1 star 4.23 score 3 scripts
bioc
imcRtools:Methods for imaging mass cytometry data analysis
This R package supports the handling and analysis of imaging mass cytometry and other highly multiplexed imaging data. The main functionality includes reading in single-cell data after image segmentation and measurement, data formatting to perform channel spillover correction and a number of spatial analysis approaches. First, cell-cell interactions are detected via spatial graph construction; these graphs can be visualized with cells representing nodes and interactions representing edges. Furthermore, per cell, its direct neighbours are summarized to allow spatial clustering. Per image/grouping level, interactions between types of cells are counted, averaged and compared against random permutations. In that way, types of cells that interact more (attraction) or less (avoidance) frequently than expected by chance are detected.
Maintained by Daniel Schulz. Last updated 5 months ago.
immunooncology singlecell spatial dataimport clustering imc single-cell
2.0 match 24 stars 7.58 score 126 scripts
bioc
tradeSeq:trajectory-based differential expression analysis for sequencing data
tradeSeq provides a flexible method for fitting regression models that can be used to find genes that are differentially expressed along one or multiple lineages in a trajectory. Based on the fitted models, it uses a variety of tests suited to answer different questions of interest, e.g. the discovery of genes for which expression is associated with pseudotime, or which are differentially expressed (in a specific region) along the trajectory. It fits a negative binomial generalized additive model (GAM) for each gene, and performs inference on the parameters of the GAM.
Maintained by Hector Roux de Bezieux. Last updated 5 months ago.
clustering regression timecourse differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics multiplecomparison visualization
1.5 match 247 stars 10.06 score 440 scripts
adibender
pammtools:Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis
The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi:10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated, competing risks and recurrent events data. pammtools provides a tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing as well as visualization.
Maintained by Andreas Bender. Last updated 2 months ago.
additive-models pamm pammtools piece-wise-exponential survival-analysis
1.7 match 48 stars 8.78 score 310 scripts 8 dependents
epiverse-trace
epidemics:Composable Epidemic Scenario Modelling
A library of compartmental epidemic models taken from the published literature, and classes to represent affected populations, public health response measures including non-pharmaceutical interventions on social contacts, non-pharmaceutical and pharmaceutical interventions that affect disease transmissibility, vaccination regimes, and disease seasonality, which can be combined to compose epidemic scenario models.
Maintained by Rosalind Eggo. Last updated 9 months ago.
decision-support epidemic-modelling epidemic-simulations epidemiology epiverse infectious-disease-dynamics model-library non-pharmaceutical-interventions rcpp rcppeigen scenario-analysis vaccination cpp
2.0 match 9 stars 7.48 score 59 scripts
r-multiverse
multitools:Tools for Contributing Packages to R-multiverse
'R-multiverse' is a community-curated collection of R package releases, powered by 'R-universe'. The 'multitools' package has tools for maintainers of packages in 'R-multiverse'.
Maintained by William Michael Landau. Last updated 10 months ago.
4.8 match 3 stars 3.13 score
momx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 year ago.
2.0 match 51 stars 7.42 score 346 scripts
ms609
TreeSearch:Phylogenetic Analysis with Discrete Character Data
Reconstruct phylogenetic trees from discrete data. Inapplicable character states are handled using the algorithm of Brazeau, Guillerme and Smith (2019) <doi:10.1093/sysbio/syy083> with the "Morphy" library, under equal or implied step weights. Contains a "shiny" user interface for interactive tree search and exploration of results, including character visualization, rogue taxon detection, tree space mapping, and cluster consensus trees (Smith 2022a, b) <doi:10.1093/sysbio/syab099>, <doi:10.1093/sysbio/syab100>. Profile Parsimony (Faith and Trueman, 2001) <doi:10.1080/10635150118627>, Successive Approximations (Farris, 1969) <doi:10.2307/2412182> and custom optimality criteria are implemented.
Maintained by Martin R. Smith. Last updated 3 days ago.
bioinformatics, morphological-analysis, phylogenetics, research-tool, tree-search, cpp
1.9 match 7 stars 7.89 score 51 scripts
tommyjones
tidylda:Latent Dirichlet Allocation Using 'tidyverse' Conventions
Implements an algorithm for Latent Dirichlet Allocation (LDA), Blei et al. (2003) <https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf>, using style conventions from the 'tidyverse', Wickham et al. (2019) <doi:10.21105/joss.01686>, and 'tidymodels', Kuhn et al. <https://tidymodels.github.io/model-implementation-principles/>. Fitting is done via collapsed Gibbs sampling. Also implements several novel features for LDA such as guided models and transfer learning.
Maintained by Tommy Jones. Last updated 2 months ago.
2.0 match 41 stars 7.36 score 53 scripts
bioc
variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments
Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.
Maintained by Gabriel E. Hoffman. Last updated 2 months ago.
rnaseq, geneexpression, genesetenrichment, differentialexpression, batcheffect, qualitycontrol, regression, epigenetics, functionalgenomics, transcriptomics, normalization, preprocessing, microarray, immunooncology, software
1.3 match 7 stars 11.69 score 1.1k scripts 3 dependents
ropensci
osmextract:Download and Import Open Street Map Data Extracts
Match, download, convert and import Open Street Map data extracts obtained from several providers.
Maintained by Andrea Gilardi. Last updated 2 months ago.
geo, geofabrik-zone, open-data, osm, osm-pbf
1.5 match 173 stars 9.73 score 342 scripts
waldronlab
SingleCellMultiModal:Integrating Multi-modal Single Cell Experiment datasets
SingleCellMultiModal is an ExperimentHub package that serves multiple datasets obtained from GEO and other sources and represents them as MultiAssayExperiment objects. We provide several multi-modal datasets including scNMT, 10X Multiome, seqFISH, CITEseq, SCoPE2, and others. The scope of the package is to provide data for benchmarking and analysis. To cite, use the 'citation' function and see <https://doi.org/10.1371/journal.pcbi.1011324>.
Maintained by Marcel Ramos. Last updated 4 months ago.
experimentdata, singlecelldata, reproducibleresearch, experimenthub, geo, bioconductor-package, u24ca289073
2.0 match 17 stars 7.29 score 60 scripts
bioc
parglms:support for parallelized estimation of GLMs/GEEs
This package provides support for parallelized estimation of GLMs/GEEs, catering for dispersed data.
Maintained by VJ Carey. Last updated 5 months ago.
4.4 match 3.30 score 3 scripts