Showing 200 of total 278 results
koheiw
newsmap:Semi-Supervised Model for Geographical Document Classification
Semi-supervised model for geographical document classification (Watanabe 2018) <doi:10.1080/21670811.2017.1293487>. This package currently contains seed dictionaries in English, German, French, Spanish, Italian, Russian, Hebrew, Arabic, Turkish, Japanese and Chinese (Simplified and Traditional).
Maintained by Kohei Watanabe. Last updated 9 months ago.
machine-learning news-stories quanteda text-analysis
49.0 match 62 stars 6.05 score 8 scripts
newmi1988
seeds:Estimate Hidden Inputs using the Dynamic Elastic Net
Algorithms to calculate the hidden inputs of systems of differential equations. These hidden inputs can be interpreted as a control that tries to minimize the discrepancies between a given model and the measurements taken. The idea is also called the Dynamic Elastic Net, as proposed in the paper "Learning (from) the errors of a systems biology model" (Engelhardt, Froelich, Kschischo 2016) <doi:10.1038/srep20772>. To use the experimental SBML import function, the 'rsbml' package is required. For installation, see the official 'rsbml' page: <https://bioconductor.org/packages/release/bioc/html/rsbml.html>.
Maintained by Tobias Newmiwaka. Last updated 4 years ago.
56.3 match 3.00 score 2 scripts
bioc
DelayedArray:A unified framework for working transparently with on-disk and in-memory array-like datasets
Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package u24ca289073
9.1 match 27 stars 15.59 score 538 scripts 1.2k dependents
statmanrobin
Stat2Data:Datasets for Stat2
Datasets for the textbook Stat2: Modeling with Regression and ANOVA (second edition). The package also includes data for the first edition, Stat2: Building Models for a World of Data, and a few functions for plotting diagnostics.
Maintained by Robin Lock. Last updated 6 years ago.
27.9 match 5 stars 4.94 score 544 scripts
flavjack
GerminaR:Indices and Graphics for Assess Seed Germination Process
A collection of different indices and visualization techniques for evaluating the seed germination process in ecophysiological studies (Lozano-Isla et al. 2019) <doi:10.1111/1440-1703.1275>.
Maintained by Flavio Lozano-Isla. Last updated 3 days ago.
germianaquant inkaverse plant-science seed-germination shinyapp
20.9 match 4 stars 6.29 score 81 scripts
aravind-j
germinationmetrics:Seed Germination Indices and Curve Fitting
Provides functions to compute various germination indices such as germinability, median germination time, mean germination time, mean germination rate, speed of germination, Timson's index, germination value, coefficient of uniformity of germination, uncertainty of germination process, synchrony of germination etc. from germination count data. Includes functions for fitting cumulative seed germination curves using four-parameter hill function and computation of associated parameters. See the vignette for more, including full list of citations for the methods implemented.
Maintained by J. Aravind. Last updated 5 months ago.
curve-fitting germination germination-indices seed seed-germination-curve seed-germination-indices
29.5 match 3 stars 4.38 score 26 scripts
kwstat
agridat:Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 1 month ago.
10.5 match 126 stars 10.78 score 1.7k scripts 1 dependents
ropensci
targets:Dynamic Function-Oriented 'Make'-Like Declarative Pipelines
Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).
Maintained by William Michael Landau. Last updated 2 days ago.
data-science high-performance-computing make peer-reviewed pipeline r-targetopia reproducibility reproducible-research targets workflow
7.3 match 975 stars 15.18 score 4.6k scripts 22 dependents
julianfaraway
faraway:Datasets and Functions for Books by Julian Faraway
Books are "Linear Models with R" published 1st Ed. August 2004, 2nd Ed. July 2014, 3rd Ed. February 2025 by CRC press, ISBN 9781439887332, and "Extending the Linear Model with R" published by CRC press in 1st Ed. December 2005 and 2nd Ed. March 2016, ISBN 9781584884248 and "Practical Regression and ANOVA in R" contributed documentation on CRAN (now very dated).
Maintained by Julian Faraway. Last updated 1 month ago.
10.8 match 29 stars 9.43 score 1.7k scripts 1 dependents
tacazares
SeedMatchR:Find Matches to Canonical SiRNA Seeds in Genomic Features
On-target gene knockdown using siRNA ideally results from binding fully complementary regions in mRNA transcripts to induce cleavage. Off-target siRNA gene knockdown can occur through several modes, one being a seed-mediated mechanism mimicking miRNA gene regulation. Seed-mediated off-target effects occur when the ~8 nucleotides at the 5' end of the guide strand, called a seed region, bind the 3' untranslated regions of mRNA, causing reduced translation. Experiments using siRNA knockdown paired with RNA-seq can be used to detect siRNA sequences with potential off-target effects driven by the seed region. 'SeedMatchR' provides tools for exploring and detecting potential seed-mediated off-target effects of siRNA in RNA-seq experiments. 'SeedMatchR' is designed to extend current differential expression analysis tools, such as 'DESeq2', by annotating results with predicted seed matches. Using publicly available data, we demonstrate the ability of 'SeedMatchR' to detect cumulative changes in differential gene expression attributed to siRNA seed regions.
Maintained by Tareian Cazares. Last updated 1 year ago.
deseq2-analysis mirna rna-seq sirna transcriptomics
20.5 match 7 stars 4.54 score 7 scripts
koheiw
LSX:Semi-Supervised Algorithm for Document Scaling
A word embeddings-based semi-supervised model for document scaling Watanabe (2020) <doi:10.1080/19312458.2020.1832976>. LSS allows users to analyze large and complex corpora on arbitrary dimensions with seed words, exploiting the efficiency of word embeddings (SVD, Glove). It can generate word vectors on a user-provided corpus or incorporate pre-trained word vectors.
Maintained by Kohei Watanabe. Last updated 2 months ago.
lsa quanteda sentiment-analysis text-analysis
12.9 match 55 stars 6.09 score 14 scripts
efernandezpascual
seedr:Hydro and Thermal Time Seed Germination Models in R
Analysis of seed germination data using the physiological time modelling approach. Includes functions to fit hydrotime and thermal-time models with the traditional approaches of Bradford (1990) <doi:10.1104/pp.94.2.840> and Garcia-Huidobro (1982) <doi:10.1093/jxb/33.2.288>. Allows fitting models to grouped datasets, i.e. datasets containing multiple species, seedlots or experiments.
Maintained by Fernández-Pascual Eduardo. Last updated 4 years ago.
agronomy botany ecology germination seeds
19.6 match 2 stars 4.00 score 6 scripts
coatless-rpkg
sitmo:Parallel Pseudo Random Number Generator (PPRNG) 'sitmo' Header Files
Provided within are two high quality and fast PPRNGs that may be used in an 'OpenMP' parallel environment. In addition, there is a generator for one dimensional low-discrepancy sequence. The objective of this library is to consolidate the distribution of the 'sitmo' (C++98 & C++11), 'threefry' and 'vandercorput' (C++11-only) engines on CRAN by enabling others to link to the header files inside of 'sitmo' instead of including a copy of each engine within their individual package. Lastly, the package contains example implementations using the 'sitmo' package and three accompanying vignettes that provide additional information.
Maintained by James Balamuta. Last updated 1 year ago.
parallel random-generation rcpp cpp openmp
8.0 match 7 stars 9.75 score 15 scripts 201 dependents
nlmixr2
rxode2:Facilities for Simulating from ODE-Based Models
Facilities for running simulations from ordinary differential equation ('ODE') models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: The use of this package requires both C and Fortran compilers, for details on their use with R please see Section 6.3, Appendix A, and Appendix D in the "R Administration and Installation" manual. Also the code is mostly released under GPL. The 'VODE' and 'LSODA' are in the public domain. The information is available in the inst/COPYRIGHTS.
Maintained by Matthew L. Fidler. Last updated 1 month ago.
6.9 match 40 stars 11.24 score 220 scripts 13 dependents
emf-creaf
indicspecies:Relationship Between Species and Groups of Sites
Functions to assess the strength and statistical significance of the relationship between species occurrence/abundance and groups of sites [De Caceres & Legendre (2009) <doi:10.1890/08-1823.1>]. Also includes functions to measure species niche breadth using resource categories [De Caceres et al. (2011) <doi:10.1111/J.1600-0706.2011.19679.x>].
Maintained by Miquel De Cáceres. Last updated 27 days ago.
7.5 match 10 stars 9.49 score 386 scripts 4 dependents
daqana
dqrng:Fast Pseudo Random Number Generators
Several fast random number generators are provided as C++ header only libraries: The PCG family by O'Neill (2014 <https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf>) as well as the Xoroshiro / Xoshiro family by Blackman and Vigna (2021 <doi:10.1145/3460772>). In addition fast functions for generating random numbers according to a uniform, normal and exponential distribution are included. The latter two use the Ziggurat algorithm originally proposed by Marsaglia and Tsang (2000, <doi:10.18637/jss.v005.i08>). The fast sampling methods support unweighted sampling both with and without replacement. These functions are exported to R and as a C++ interface and are enabled for use with the default 64 bit generator from the PCG family, Xoroshiro128+/++/** and Xoshiro256+/++/** as well as the 64 bit version of the 20 rounds Threefry engine (Salmon et al., 2011, <doi:10.1145/2063384.2063405>) as provided by the package 'sitmo'.
Maintained by Ralf Stubner. Last updated 6 months ago.
random random-distributions random-generation random-sampling rng cpp
5.4 match 42 stars 13.12 score 188 scripts 183 dependents
handcock
RDS:Respondent-Driven Sampling
Provides functionality for carrying out estimation with data collected using Respondent-Driven Sampling. This includes Heckathorn's RDS-I and RDS-II estimators as well as Gile's Sequential Sampling estimator. The package is part of the "RDS Analyst" suite of packages for the analysis of respondent-driven sampling data. See Gile and Handcock (2010) <doi:10.1111/j.1467-9531.2010.01223.x>, Gile and Handcock (2015) <doi:10.1111/rssa.12091> and Gile, Beaudry, Handcock and Ott (2018) <doi:10.1146/annurev-statistics-031017-100704>.
Maintained by Mark S. Handcock. Last updated 6 months ago.
17.9 match 1 stars 3.87 score 82 scripts 3 dependents
signaturescience
rplanes:Plausibility Analysis of Epidemiological Signals
Provides functionality to prepare data and analyze plausibility of both forecasted and reported epidemiological signals. The functions implement a set of plausibility algorithms that are agnostic to geographic and time resolutions and are calculated independently then presented as a combined score.
Maintained by VP Nagraj. Last updated 8 months ago.
11.4 match 9 stars 6.03 score 7 scripts
adeverse
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 15 days ago.
4.5 match 39 stars 14.96 score 2.2k scripts 256 dependents
onofriandreapg
drcSeedGerm:Utilities for Data Analyses in Seed Germination/Emergence Assays
Utility functions to be used to analyse datasets obtained from seed germination/emergence assays. Fits several types of seed germination/emergence models, including those reported in Onofri et al. (2018) "Hydrothermal-time-to-event models for seed germination", European Journal of Agronomy, 101, 129-139 <doi:10.1016/j.eja.2018.08.011>. Contains several datasets for practicing.
Maintained by Andrea Onofri. Last updated 2 months ago.
nonlinear-regression seed-germination-assays time-to-event
15.5 match 5 stars 3.97 score 37 scripts
dpc10ster
RJafroc:Artificial Intelligence Systems and Observer Performance
Analyzing the performance of artificial intelligence (AI) systems/algorithms characterized by a 'search-and-report' strategy. Historically observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where lesion localization information is used. A book using the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: <https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840>. Online updates to this book, which use the software, are at <https://dpc10ster.github.io/RJafrocQuickStart/>, <https://dpc10ster.github.io/RJafrocRocBook/> and at <https://dpc10ster.github.io/RJafrocFrocBook/>. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data consists of a rating and a location of the most suspicious region, for every image.
Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict 'proper' ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, <https://github.com/dpc10ster/WindowsJafroc>. Package functions are organized as follows. Data file related function names are preceded by 'Df', curve fitting functions by 'Fit', included data sets by 'dataset', plotting functions by 'Plot', significance testing functions by 'St', sample size related functions by 'Ss', data simulation functions by 'Simulate' and utility functions by 'Util'. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical or fitted operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via either Dorfman-Berbaum-Metz or the Obuchowski-Rockette methods. Also implemented is single modality analysis, which allows comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists interpreting the same cases. 
Crossed-modality analysis is implemented wherein there are two crossed modality factors and the aim is to determine performance in each modality factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification.
Maintained by Dev Chakraborty. Last updated 5 months ago.
ai-optimization artificial-intelligence-algorithms computer-aided-diagnosis froc-analysis roc-analysis target-classification target-localization cpp
10.3 match 19 stars 5.69 score 65 scripts
philchalmers
SimDesign:Structure for Organizing Monte Carlo Simulation Designs
Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.
Maintained by Phil Chalmers. Last updated 2 days ago.
monte-carlo-simulation simulation simulation-framework
4.4 match 62 stars 13.38 score 253 scripts 46 dependents
rstudio
tensorflow:R Interface to 'TensorFlow'
Interface to 'TensorFlow' <https://www.tensorflow.org/>, an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more 'CPUs' or 'GPUs' in a desktop, server, or mobile device with a single 'API'. 'TensorFlow' was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
Maintained by Tomasz Kalinowski. Last updated 19 days ago.
3.8 match 1.3k stars 15.35 score 3.2k scripts 74 dependents
alenxav
NAM:Nested Association Mapping
Designed for association studies in nested association mapping (NAM) panels, experimental and random panels. The method is described by Xavier et al. (2015) <doi:10.1093/bioinformatics/btv448>. It includes tools for genome-wide associations of multiple populations, marker quality control, population genetics analysis, genome-wide prediction, solving mixed models and finding variance components through likelihood and Bayesian methods.
Maintained by Alencar Xavier. Last updated 5 years ago.
10.0 match 2 stars 5.72 score 44 scripts 1 dependents
snoweye
pbdMPI:R Interface to MPI for HPC Clusters (Programming with Big Data Project)
A simplified, efficient, interface to MPI for HPC clusters. It is a derivation and rethinking of the Rmpi package. pbdMPI embraces the prevalent parallel programming style on HPC clusters. Beyond the interface, a collection of functions for global work with distributed data and resource-independent RNG reproducibility is included. It is based on S4 classes and methods.
Maintained by Wei-Chen Chen. Last updated 6 months ago.
8.0 match 2 stars 7.11 score 179 scripts 3 dependents
bbolker
emdbook:Support Functions and Data for "Ecological Models and Data"
Auxiliary functions and data sets for "Ecological Models and Data", a book presenting maximum likelihood estimation and related topics for ecologists (ISBN 978-0-691-12522-0).
Maintained by Ben Bolker. Last updated 8 months ago.
6.9 match 4 stars 8.04 score 656 scripts 21 dependents
stats-uoa
s20x:Functions for University of Auckland Course STATS 201/208 Data Analysis
A set of functions used in teaching STATS 201/208 Data Analysis at the University of Auckland. The functions are designed to make parts of R more accessible to a large undergraduate population who are mostly not statistics majors.
Maintained by James Curran. Last updated 2 years ago.
8.3 match 3 stars 6.40 score 211 scripts 3 dependents
pik-piam
magpie4:MAgPIE outputs R package for MAgPIE version 4.x
Common output routines for extracting results from the MAgPIE framework (versions 4.x).
Maintained by Benjamin Leon Bodirsky. Last updated 2 days ago.
6.6 match 2 stars 7.88 score 254 scripts 9 dependents
yiluheihei
RevEcoR:Reverse Ecology Analysis on Microbiome
An implementation of the reverse ecology framework. Reverse ecology refers to the use of genomics to study ecology with no a priori assumptions about the organism(s) under consideration, linking organisms to their environment. It allows researchers to reconstruct the metabolic networks and study the ecology of poorly characterized microbial species from their genomic information, and has substantial potentials for microbial community ecological analysis.
Maintained by Yang Cao. Last updated 6 years ago.
8.9 match 6 stars 5.77 score 22 scripts 1 dependents
renozao
rngtools:Utility Functions for Working with Random Number Generators
Provides a set of functions for working with Random Number Generators (RNGs). In particular, a generic S4 framework is defined for getting/setting the current RNG, or RNG data that are embedded into objects for reproducibility. Notably, convenient default methods greatly facilitate the way current RNG settings can be changed.
Maintained by Renaud Gaujoux. Last updated 3 years ago.
5.8 match 6 stars 8.65 score 85 scripts 216 dependents
hdhshowalter
SeedMaker:Generate a Collection of Seeds from a Single Seed
A mechanism for easily generating and organizing a collection of seeds from a single seed, which may be subsequently used to ensure reproducibility in processes/pipelines that utilize multiple random components (e.g., trial simulation).
Maintained by Hollins Showalter. Last updated 2 months ago.
13.0 match 3.70 score
rstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 3 days ago.
3.3 match 845 stars 13.60 score 264 scripts 2 dependents
onofriandreapg
drcte:Statistical Approaches for Time-to-Event Data in Agriculture
A specific and comprehensive framework for the analyses of time-to-event data in agriculture. Fit non-parametric and parametric time-to-event models. Compare time-to-event curves for different experimental groups. Plots and other displays. It is particularly tailored to the analyses of data from germination and emergence assays. The methods are described in Onofri et al. (2022) "A unified framework for the analysis of germination, emergence, and other time-to-event data in weed science", Weed Science, 70, 259-271 <doi:10.1017/wsc.2022.8>.
Maintained by Andrea Onofri. Last updated 1 year ago.
non-linear-regression seed-germination time-to-event
10.9 match 4.07 score 39 scripts 2 dependents
calvagone
campsis:Generic PK/PD Simulation Platform CAMPSIS
A generic, easy-to-use and intuitive pharmacokinetic/pharmacodynamic (PK/PD) simulation platform based on R packages 'rxode2' and 'mrgsolve'. CAMPSIS provides an abstraction layer over the underlying processes of writing a PK/PD model, assembling a custom dataset and running a simulation. CAMPSIS has a strong dependency on the R package 'campsismod', which allows reading/writing a model from/to files and adapting it further on the fly in the R environment. Package 'campsis' allows the user to assemble a dataset in an intuitive manner. Once the user's dataset is ready, the package is in charge of preparing the simulation, calling 'rxode2' or 'mrgsolve' (at the user's choice) and returning the results, for the given model, dataset and desired simulation settings.
Maintained by Nicolas Luyckx. Last updated 1 month ago.
5.7 match 8 stars 7.52 score 93 scripts
bioc
GDSArray:Representing GDS files as array-like objects
GDS files are widely used to represent genotyping or sequence data. The GDSArray package implements the `GDSArray` class to represent nodes in GDS files in a matrix-like representation that allows easy manipulation (e.g., subsetting, mathematical transformation) in _R_. The data remains on disk until needed, so that very large files can be processed.
Maintained by Xiuwen Zheng. Last updated 16 hours ago.
infrastructure datarepresentation sequencing genotypingarray
6.3 match 5 stars 6.78 score 8 scripts 2 dependents
alenxav
bWGR:Bayesian Whole-Genome Regression
Whole-genome regression methods on Bayesian framework fitted via EM or Gibbs sampling, single step (<doi:10.1534/g3.119.400728>), univariate and multivariate (<doi:10.1186/s12711-022-00730-w>, <doi:10.1093/genetics/iyae179>), with optional kernel term and sampling techniques (<doi:10.1186/s12859-017-1582-3>).
Maintained by Alencar Xavier. Last updated 3 months ago.
10.0 match 7 stars 4.24 score 16 scripts
yihui
knitr:A General-Purpose Package for Dynamic Report Generation in R
Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.
Maintained by Yihui Xie. Last updated 3 days ago.
dynamic-documents knitr literate-programming rmarkdown sweave
1.8 match 2.4k stars 23.61 score 116k scripts 4.2k dependents
r-lib
withr:Run Code 'With' Temporarily Modified Global State
A set of functions to run code 'with' safely and temporarily modified global state. Many of these functions were originally a part of the 'devtools' package, this provides a simple package with limited dependencies to provide access to these functions.
Maintained by Lionel Henry. Last updated 22 days ago.
2.3 match 176 stars 17.92 score 1.2k scripts 12k dependents
bioc
chihaya:Save Delayed Operations to a HDF5 File
Saves the delayed operations of a DelayedArray to a HDF5 file. This enables efficient recovery of the DelayedArray's contents in other languages and analysis frameworks.
Maintained by Aaron Lun. Last updated 5 months ago.
dataimport datarepresentation zlib cpp
9.1 match 4.38 score 16 scripts
mrc-ide
dust:Iterate Multiple Realisations of Stochastic Models
An Engine for simulation of stochastic models. Includes support for running stochastic models in parallel, either with shared or varying parameters. Simulations are run efficiently in compiled code and can be run with a fraction of simulated states returned to R, allowing control over memory usage. Support is provided for building bootstrap particle filter for performing Sequential Monte Carlo (e.g., Gordon et al. 1993 <doi:10.1049/ip-f-2.1993.0015>). The core of the simulation engine is the 'xoshiro256**' algorithm (Blackman and Vigna <arXiv:1805.01407>), and the package is further described in FitzJohn et al 2021 <doi:10.12688/wellcomeopenres.16466.2>.
Maintained by Rich FitzJohn. Last updated 6 months ago.
4.9 match 18 stars 7.84 score 60 scripts 3 dependents
rstudio
reticulate:Interface to 'Python'
Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.
Maintained by Tomasz Kalinowski. Last updated 1 day ago.
1.8 match 1.7k stars 21.05 score 18k scripts 432 dependents
pachadotdev
analogsea:Interface to 'DigitalOcean'
Provides a set of functions for interacting with the 'DigitalOcean' API <https://www.digitalocean.com/>, including creating images, destroying them, rebooting, getting details on regions, and available images.
Maintained by Mauricio Vargas. Last updated 2 years ago.
5.0 match 159 stars 7.56 score 100 scripts 1 dependents
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 17 hours ago.
complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp
1.8 match 583 stars 21.10 score 31k scripts 1.9k dependents
bioc
XDE:XDE: a Bayesian hierarchical model for cross-study analysis of differential gene expression
Multi-level model for cross-study detection of differential gene expression.
Maintained by Robert Scharpf. Last updated 5 months ago.
microarray differentialexpression cpp
8.6 match 4.20 score 10 scripts
r-forge
distrSim:Simulation Classes Based on Package 'distr'
S4-classes for setting up a coherent framework for simulation within the distr family of packages.
Maintained by Peter Ruckdeschel. Last updated 2 months ago.
8.1 match 4.16 score 7 scripts 3 dependents
ramnathv
htmlwidgets:HTML Widgets for R
A framework for creating HTML widgets that render in various contexts including the R console, 'R Markdown' documents, and 'Shiny' web applications.
Maintained by Carson Sievert. Last updated 1 years ago.
1.8 match 791 stars 19.05 score 7.4k scripts 3.1k dependents
openpharma
crmPack:Object-Oriented Implementation of CRM Designs
Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to set up a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules. Further details are presented in Sabanes Bove et al. (2019) <doi:10.18637/jss.v089.i10>.
Maintained by Daniel Sabanes Bove. Last updated 2 months ago.
4.2 match 21 stars 7.79 score 208 scripts
alarm-redist
redist:Simulation Methods for Legislative Redistricting
Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.
Maintained by Christopher T. Kenny. Last updated 2 months ago.
geospatial gerrymandering redistricting sampling openblas cpp openmp
3.5 match 68 stars 9.17 score 259 scripts
r-forge
setRNG:Set (Normal) Random Number Generator and Seed
Provides utilities to help set and record the setting of the seed and the uniform and normal generators used when a random experiment is run. The utilities can be used in other functions that do random experiments to simplify recording and/or setting all the necessary information for reproducibility. See the vignette and reference manual for examples.
Maintained by Paul Gilbert. Last updated 2 months ago.
5.3 match 5.92 score 17 scripts 5 dependents
bioc
Rsubread:Mapping, quantification and variant analysis of sequencing data
Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES etc). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.
Maintained by Wei Shi. Last updated 2 days ago.
sequencing alignment sequencematching rnaseq chipseq singlecell geneexpression generegulation genetics immunooncology snp geneticvariability preprocessing qualitycontrol genomeannotation genefusiondetection indeldetection variantannotation variantdetection multiplesequencealignment zlib
3.4 match 9.24 score 892 scripts 10 dependents
wjbraun
DAAG:Data Analysis and Graphics Data and Functions
Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.
Maintained by W. John Braun. Last updated 11 months ago.
3.8 match 8.25 score 1.2k scripts 1 dependents
comeetie
greed:Clustering and Model Selection with the Integrated Classification Likelihood
An ensemble of algorithms that enable the clustering of networks and data matrices (such as counts, categorical or continuous) with different types of generative models. Model selection and clustering are performed in combination by optimizing the Integrated Classification Likelihood (which is equivalent to minimizing the description length). Several models are available such as: Stochastic Block Model, degree corrected Stochastic Block Model, Mixtures of Multinomial, Latent Block Model. The optimization is performed thanks to a combination of greedy local search and a genetic algorithm (see <arXiv:2002.11577> for more details).
Maintained by Etienne Côme. Last updated 2 years ago.
5.2 match 14 stars 5.94 score 41 scripts
ealedo
orthGS:Orthology vs Paralogy Relationships among Glutamine Synthetase from Plants
Tools to analyze and infer orthology and paralogy relationships between glutamine synthetase proteins in seed plants.
Maintained by Elena Aledo. Last updated 4 months ago.
7.9 match 3.80 score 21 scripts
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
3.6 match 3 stars 8.20 score 7.8k scripts 11 dependents
mlverse
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 9 days ago.
1.8 match 520 stars 16.52 score 1.4k scripts 38 dependents
koheiw
seededlda:Seeded Sequential LDA for Topic Modeling
Seeded Sequential LDA can classify sentences of texts into pre-defined topics with a small number of seed words (Watanabe & Baturo, 2023) <doi:10.1177/08944393231178605>. Implements Seeded LDA (Lu et al., 2010) <doi:10.1109/ICDMW.2011.125> and Sequential LDA (Du et al., 2012) <doi:10.1007/s10115-011-0425-1> with the distributed LDA algorithm (Newman, et al., 2009) for parallel computing.
Maintained by Kohei Watanabe. Last updated 2 months ago.
semi-supervised-learning text-classification onetbb cpp
3.9 match 75 stars 7.38 score 177 scripts 1 dependents
r-forge
Sleuth3:Data Sets from Ramsey and Schafer's "Statistical Sleuth (3rd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2013), "The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)", Cengage Learning.
Maintained by Berwin A Turlach. Last updated 1 years ago.
4.5 match 6.38 score 522 scripts
hturner
BradleyTerry2:Bradley-Terry Models
Specify and fit the Bradley-Terry model, including structured versions in which the parameters are related to explanatory variables through a linear predictor and versions with contest-specific effects, such as a home advantage.
Maintained by Heather Turner. Last updated 6 years ago.
bradley-terry-models paired-comparisons statistical-models
3.6 match 20 stars 7.97 score 172 scripts 1 dependents
hanjunwei-lab
ssMutPA:Single-Sample Mutation-Based Pathway Analysis
A systematic bioinformatics tool to perform single-sample mutation-based pathway analysis by integrating somatic mutation data with the Protein-Protein Interaction (PPI) network. In this method, we use local and global weighted strategies to evaluate the effects of network genes from mutations according to the network topology and then calculate the mutation-based pathway enrichment score (ssMutPES) to reflect the accumulated effect of mutations of each pathway. Subsequently, the ssMutPES profiles are used for unsupervised spectral clustering to identify cancer subtypes.
Maintained by Junwei Han. Last updated 5 months ago.
7.0 match 4.00 score 9 scripts
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 9 days ago.
software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny
3.5 match 33 stars 7.77 score 10 scripts
thothorn
HSAUR3:A Handbook of Statistical Analyses Using R (3rd Edition)
Functions, data sets, analyses and examples from the third edition of the book ''A Handbook of Statistical Analyses Using R'' (Torsten Hothorn and Brian S. Everitt, Chapman & Hall/CRC, 2014). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package; for all other chapters, a vignette containing all data analyses is available. In addition, Sweave source code for slides of selected chapters is included in this package (see HSAUR3/inst/slides). The publisher's web page is '<https://www.routledge.com/A-Handbook-of-Statistical-Analyses-using-R/Hothorn-Everitt/p/book/9781482204582>'.
Maintained by Torsten Hothorn. Last updated 7 months ago.
4.0 match 6 stars 6.72 score 120 scripts 2 dependents
maximilianaxer
quaxnat:Estimation of Natural Regeneration Potential
Functions for estimating the potential dispersal of tree species using regeneration densities and dispersal distances to nearest seed trees. A quantile regression is implemented to determine the dispersal potential. Spatial prediction can be used to identify natural regeneration potential for forest restoration as described in Axer et al (2021) <doi:10.1016/j.foreco.2020.118802>.
Maintained by Maximilian Axer. Last updated 5 months ago.
7.3 match 1 stars 3.54 score 2 scripts
r-forge
Sleuth2:Data Sets from Ramsey and Schafer's "Statistical Sleuth (2nd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2002), "The Statistical Sleuth: A Course in Methods of Data Analysis (2nd ed)", Duxbury.
Maintained by Berwin A Turlach. Last updated 1 years ago.
4.5 match 5.70 score 191 scripts
momx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 years ago.
3.4 match 51 stars 7.42 score 346 scripts
jeffreyracine
np:Nonparametric Kernel Smoothing Methods for Mixed Data Types
Nonparametric (and semiparametric) kernel methods that seamlessly handle a mix of continuous, unordered, and ordered factor data types. We would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC, <https://www.nserc-crsng.gc.ca/>), the Social Sciences and Humanities Research Council of Canada (SSHRC, <https://www.sshrc-crsh.gc.ca/>), and the Shared Hierarchical Academic Research Computing Network (SHARCNET, <https://sharcnet.ca/>). We would also like to acknowledge the contributions of the GNU GSL authors. In particular, we adapt the GNU GSL B-spline routine gsl_bspline.c adding automated support for quantile knots (in addition to uniform knots), providing missing functionality for derivatives, and for extending the splines beyond their endpoints.
Maintained by Jeffrey S. Racine. Last updated 1 months ago.
2.0 match 49 stars 12.64 score 672 scripts 44 dependents
bertcarnell
lhs:Latin Hypercube Samples
Provides a number of methods for creating and augmenting Latin Hypercube Samples and Orthogonal Array Latin Hypercube Samples.
Maintained by Rob Carnell. Last updated 9 months ago.
latin-hypercube latin-hypercube-sample latin-hypercube-sampling lhs orthogonal-arrays cpp
1.8 match 44 stars 13.95 score 1.5k scripts 108 dependents
stan-dev
rstan:R Interface to Stan
User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.
Maintained by Ben Goodrich. Last updated 5 hours ago.
bayesian-data-analysis bayesian-inference bayesian-statistics mcmc stan cpp
1.3 match 1.1k stars 18.76 score 14k scripts 280 dependents
marie-perrotdockes
MultiVarSel:Variable Selection in a Multivariate Linear Model
Performs variable selection in a multivariate linear model by estimating the covariance matrix of the residuals, using it to remove the dependence that may exist among the responses, and then selecting variables with the Lasso criterion. The method is described in the paper Perrot-Dockès et al. (2017) <arXiv:1704.00076>.
Maintained by Marie Perrot-Dockès. Last updated 6 years ago.
12.5 match 2.00 score 8 scripts
bioc
rfaRm:An R interface to the Rfam database
rfaRm provides a client interface to the Rfam database of RNA families. Data that can be retrieved include RNA families, secondary structure images, covariance models, sequences within each family, alignments leading to the identification of a family and secondary structures in the dot-bracket format.
Maintained by Lara Selles Vidal. Last updated 5 months ago.
functionalgenomics dataimport thirdpartyclient visualization multiplesequencealignment
7.5 match 3.30 score 1 scripts
weecology
LDATS:Latent Dirichlet Allocation Coupled with Time Series Analyses
Combines Latent Dirichlet Allocation (LDA) and Bayesian multinomial time series methods in a two-stage analysis to quantify dynamics in high-dimensional temporal data. LDA decomposes multivariate data into lower-dimension latent groupings, whose relative proportions are modeled using generalized Bayesian time series models that include abrupt changepoints and smooth dynamics. The methods are described in Blei et al. (2003) <doi:10.1162/jmlr.2003.3.4-5.993>, Western and Kleykamp (2004) <doi:10.1093/pan/mph023>, Venables and Ripley (2002, ISBN-13:978-0387954578), and Christensen et al. (2018) <doi:10.1002/ecy.2373>.
Maintained by Juniper L. Simonis. Last updated 5 years ago.
changepoint lda parallel-tempering portal softmax
3.5 match 25 stars 6.93 score 45 scripts
thothorn
HSAUR:A Handbook of Statistical Analyses Using R (1st Edition)
Functions, data sets, analyses and examples from the book ''A Handbook of Statistical Analyses Using R'' (Brian S. Everitt and Torsten Hothorn, Chapman & Hall/CRC, 2006). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package; for all other chapters, a vignette containing all data analyses is available.
Maintained by Torsten Hothorn. Last updated 3 years ago.
4.0 match 6.07 score 253 scripts 5 dependents
projectmosaic
mosaic:Project MOSAIC Statistics and Mathematics Teaching Utilities
Data sets and utilities from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.
Maintained by Randall Pruim. Last updated 1 years ago.
1.8 match 93 stars 13.32 score 7.2k scripts 7 dependents
skranz
RTutor:Interactive R problem sets with automatic testing of solutions and automatic hints
Interactive R problem sets with automatic testing of solutions and automatic hints
Maintained by Sebastian Kranz. Last updated 1 years ago.
economics learn-to-code problem-set rstudio rtutor shiny teaching
4.0 match 205 stars 5.83 score 111 scripts 1 dependents
peterkdunn
GLMsData:Generalized Linear Model Data Sets
Data sets from the book Generalized Linear Models with Examples in R by Dunn and Smyth.
Maintained by Peter K. Dunn. Last updated 3 years ago.
9.0 match 2.61 score 220 scripts
remkoduursma
lgrdata:Example Datasets for a Learning Guide to R
A largish collection of example datasets, including several classics. Many of these datasets are well suited for regression, classification, and visualization.
Maintained by Remko Duursma. Last updated 6 years ago.
7.5 match 3.11 score 26 scripts
celevitz
touRnamentofchampions:Tournament of Champions Data
Several datasets which describe the challenges and results of competitions in Tournament of Champions. This data is useful for practicing data wrangling, graphing, and analyzing how each season of Tournament of Champions played out.
Maintained by Levitz Carly. Last updated 10 days ago.
6.0 match 3.70 score
hankstevens
primer:Functions and Data for the Book, a Primer of Ecology with R
Functions are primarily for systems of ordinary differential equations, difference equations, and eigenanalysis and projection of demographic matrices; data are for examples.
Maintained by Hank Stevens. Last updated 4 years ago.
3.8 match 14 stars 5.92 score 118 scripts
thothorn
HSAUR2:A Handbook of Statistical Analyses Using R (2nd Edition)
Functions, data sets, analyses and examples from the second edition of the book ''A Handbook of Statistical Analyses Using R'' (Brian S. Everitt and Torsten Hothorn, Chapman & Hall/CRC, 2008). The first chapter of the book, which is entitled ''An Introduction to R'', is completely included in this package; for all other chapters, a vignette containing all data analyses is available. In addition, the package contains Sweave code for producing slides for selected chapters (see HSAUR2/inst/slides).
Maintained by Torsten Hothorn. Last updated 2 years ago.
4.0 match 5.51 score 181 scripts 1 dependents
bioc
BiocParallel:Bioconductor facilities for parallel evaluation
This package provides modified versions and novel implementation of functions for parallel evaluation, tailored to use with Bioconductor objects.
Maintained by Martin Morgan. Last updated 29 days ago.
infrastructure bioconductor-package core-package u24ca289073 cpp
1.3 match 67 stars 17.40 score 7.3k scripts 1.1k dependents
martynplummer
rjags:Bayesian Graphical Models using MCMC
Interface to the JAGS MCMC library.
Maintained by Martyn Plummer. Last updated 7 months ago.
2.3 match 7 stars 9.60 score 4.0k scripts 165 dependents
bioc
GeDi:Defining and visualizing the distances between different genesets
The package provides different distance measurements to calculate the difference between genesets. Based on these scores the genesets are clustered and visualized as a graph. This is all presented in an interactive Shiny application for easy usage.
Maintained by Annekathrin Nedwed. Last updated 5 months ago.
gui genesetenrichment software transcription rnaseq visualization clustering pathways reportwriting go kegg reactome shinyapps
3.9 match 1 stars 5.52 score 22 scripts
stan-dev
loo:Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models
Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.
Maintained by Jonah Gabry. Last updated 6 days ago.
bayes bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics cross-validation information-criterion model-comparison stan
1.2 match 152 stars 17.30 score 2.6k scripts 297 dependents
mlr-org
mlr3misc:Helper Functions for 'mlr3'
Frequently used helper functions and assertions used in 'mlr3' and its companion packages. Comes with helper functions for functional programming, for printing, to work with 'data.table', as well as some generally useful 'R6' classes. This package also supersedes the package 'BBmisc'.
Maintained by Marc Becker. Last updated 4 months ago.
machine-learning miscellaneous mlr3
2.0 match 12 stars 10.28 score 302 scripts 42 dependents
murrayefford
secr:Spatially Explicit Capture-Recapture
Functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
Maintained by Murray Efford. Last updated 2 days ago.
2.0 match 3 stars 10.13 score 410 scripts 5 dependents
f-rousset
spaMM:Mixed-Effect Models, with or without Spatial Random Effects
Inference based on models with or without spatially-correlated random effects, multivariate responses, or non-Gaussian random effects (e.g., Beta). Variation in residual variance (heteroscedasticity) can itself be represented by a mixed-effect model. Both classical geostatistical models (Rousset and Ferdy 2014 <doi:10.1111/ecog.00566>), and Markov random field models on irregular grids (as considered in the 'INLA' package, <https://www.r-inla.org>), can be fitted, with distinct computational procedures exploiting the sparse matrix representations for the latter case and other autoregressive models. Laplace approximations are used for likelihood or restricted likelihood. Penalized quasi-likelihood and other variants discussed in the h-likelihood literature (Lee and Nelder 2001 <doi:10.1093/biomet/88.4.987>) are also implemented.
Maintained by François Rousset. Last updated 9 months ago.
4.0 match 4.94 score 208 scripts 5 dependents
laresbernardo
lares:Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 27 days ago.
analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization
2.0 match 233 stars 9.84 score 185 scripts 1 dependents
koheiw
wordmap:Feature Extraction and Document Classification with Noisy Labels
Extract features and classify documents with noisy labels given by document metadata or keyword matching, as described in Watanabe & Zhou (2020) <doi:10.1177/0894439320907027>.
Maintained by Kohei Watanabe. Last updated 2 months ago.
4.0 match 2 stars 4.86 score 1 scripts
ropensci
drake:A Pipeline Toolkit for Reproducible Computation at Scale
A general-purpose computational engine for data analysis, drake rebuilds intermediate data objects when their dependencies change, and it skips work when the results are already up to date. Not every execution starts from scratch; there is native support for parallel and distributed computing, and completed projects have tangible evidence that they are reproducible. Extensive documentation, from beginner-friendly tutorials to practical examples and more, is available at the reference website <https://docs.ropensci.org/drake/> and the online manual <https://books.ropensci.org/drake/>.
Maintained by William Michael Landau. Last updated 4 months ago.
data-science drake high-performance-computing makefile peer-reviewed pipeline reproducibility reproducible-research ropensci workflow
1.7 match 1.3k stars 11.49 score 1.7k scripts 1 dependents
mandymejia
BayesfMRI:Spatial Bayesian Methods for Task Functional MRI Studies
Performs a spatial Bayesian general linear model (GLM) for task functional magnetic resonance imaging (fMRI) data on the cortical surface. Additional models include group analysis and inference to detect thresholded areas of activation. Includes direct support for the 'CIFTI' neuroimaging file format. For more information see A. F. Mejia, Y. R. Yue, D. Bolin, F. Lindgren, M. A. Lindquist (2020) <doi:10.1080/01621459.2019.1611582> and D. Spencer, Y. R. Yue, D. Bolin, S. Ryan, A. F. Mejia (2022) <doi:10.1016/j.neuroimage.2022.118908>.
Maintained by Amanda Mejia. Last updated 11 days ago.
3.3 match 26 stars 5.77 score 19 scripts
emf-creaf
medfateland:Mediterranean Landscape Simulation
Simulate forest hydrology, forest function and dynamics over landscapes [De Caceres et al. (2015) <doi:10.1016/j.agrformet.2015.06.012>]. Parallelization is allowed in several simulation functions and simulations may be conducted including spatial processes such as lateral water transfer and seed dispersal.
Maintained by Miquel De Cáceres. Last updated 28 days ago.
3.5 match 5 stars 5.41 score 41 scripts
braverock
PortfolioAnalytics:Portfolio Analysis, Including Numerical Methods for Optimization of Portfolios
Portfolio optimization and analysis routines and graphics.
Maintained by Brian G. Peterson. Last updated 4 months ago.
1.6 match 81 stars 11.49 score 626 scripts 2 dependents
jhmaindonald
hwde:Models and Tests for Departure from Hardy-Weinberg Equilibrium and Independence Between Loci
Fits models for genotypic disequilibria, as described in Huttley and Wilson (2000) <doi:10.1093/genetics/156.4.2127>, Weir (1996) and Weir and Wilson (1986). Contrast terms are available that account for first order interactions between loci. Also implements, for a single locus in a single population, a conditional exact test for Hardy-Weinberg equilibrium.
Maintained by John Maindonald. Last updated 2 years ago.
4.9 match 3.74 score 11 scripts
agronomiar
seedreg:Regression Analysis for Seed Germination as a Function of Temperature
Regression analysis using common models in seed temperature studies, such as the Gaussian model (Martins, JF, Barroso, AAM, & Alves, PLCA (2017) <doi:10.1590/s0100-83582017350100039>), quadratic (Nunes, AL, Sossmeier, S, Gotz, AP, & Bispo, NB (2018) <doi:10.17265/2161-6264/2018.06.002>) and others with potential for use, such as those implemented in the 'drc' package (Ritz, C, Baty, F, Streibig, JC, & Gerhard, D (2015). <doi:10.1371/journal.pone.0146021>), in the estimation of the ideal and cardinal temperature for the occurrence of plant seed germination. The functions return graphs with the equations automatically.
Maintained by Gabriel Danilo Shimizu. Last updated 3 years ago.
8.9 match 2.04 score 22 scripts
mikemeredith
wiqid:Quick and Dirty Estimates for Wildlife Populations
Provides simple, fast functions for maximum likelihood and Bayesian estimates of wildlife population parameters, suitable for use with simulated data or bootstraps. Early versions were indeed quick and dirty, but optional error-checking routines and meaningful error messages have been added. Includes single and multi-season occupancy, closed capture population estimation, survival, species richness and distance measures.
Maintained by Ngumbang Juat. Last updated 2 years ago.
3.8 match 2 stars 4.84 score 115 scripts 1 dependents
predictiveecology
reproducible:Enhance Reproducibility of R Code
A collection of high-level, machine- and OS-independent tools for making reproducible and reusable content in R. The two workhorse functions are Cache() and prepInputs(). Cache() allows for nested caching, is robust to environments and objects with environments (like functions), and deals with some classes of file-backed R objects e.g., from terra and raster packages. Both functions have been developed to be foundational components of data retrieval and processing in continuous workflow situations. In both functions, efforts are made to make the first and subsequent calls of functions have the same result, but faster at subsequent times by way of checksums and digesting. Several features are still under development, including cloud storage of cached objects allowing for sharing between users. Several advanced options are available, see ?reproducibleOptions().
Maintained by Eliot J B McIntire. Last updated 1 months ago.
reproducibility reproducible-research
1.7 match 41 stars 10.52 score 122 scripts 15 dependents
ghtaranto
scapesClassification:User-Defined Classification of Raster Surfaces
Series of algorithms to translate users' mental models of seascapes, landscapes and, more generally, of geographic features into computer representations (classifications). Spaces and geographic objects are classified with user-defined rules taking into account spatial data as well as spatial relationships among different classes and objects.
Maintained by Gerald H. Taranto. Last updated 3 years ago.
classification-algorithm object-detection raster spatial
4.3 match 1 stars 4.22 score 33 scripts
nflverse
nflfastR:Functions to Efficiently Access NFL Play by Play Data
A set of functions to access National Football League play-by-play data from <https://www.nfl.com/>.
Maintained by Ben Baldwin. Last updated 2 months ago.
american-football football-data nfl nflstats nflverse sports-analytics
1.7 match 442 stars 10.40 score 596 scripts 3 dependents
dpmcsuss
iGraphMatch:Tools for Graph Matching
Versatile tools and data for graph matching analysis with various forms of prior information that supports working with 'igraph' objects, matrix objects, or lists of either.
Maintained by Daniel Sussman. Last updated 10 months ago.
graph-algorithms graph-matching cpp
3.1 match 9 stars 5.65 score 9 scripts
florianhartig
DHARMa:Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models
The 'DHARMa' package uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Currently supported are linear and generalized linear (mixed) models from 'lme4' (classes 'lmerMod', 'glmerMod'), 'glmmTMB', 'GLMMadaptive', and 'spaMM'; phylogenetic linear models from 'phylolm' (classes 'phylolm' and 'phyloglm'); generalized additive models ('gam' from 'mgcv'); 'glm' (including 'negbin' from 'MASS', but excluding quasi-distributions) and 'lm' model classes. Moreover, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as 'JAGS', 'STAN', or 'BUGS' can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial, phylogenetic and temporal autocorrelation.
Maintained by Florian Hartig. Last updated 15 days ago.
glmmregressionregression-diagnosticsresidual
1.2 match 226 stars 14.74 score 2.8k scripts 10 dependentsbioc
GloScope:Population-level Representation on scRNA-Seq data
This package aims at representing and summarizing the entire single-cell profile of a sample. It allows researchers to perform important bioinformatic analyses at the sample-level such as visualization and quality control. The main functions estimate sample distributions, calculate statistical divergence among samples, and visualize the distance matrix through MDS plots.
Maintained by William Torous. Last updated 5 months ago.
datarepresentationqualitycontrolrnaseqsequencingsoftwaresinglecell
2.8 match 3 stars 6.05 score 84 scriptsmclements
microsimulation:Discrete Event Simulation in R and C++, with Tools for Cost-Effectiveness Analysis
Discrete event simulation using both R and C++ (Karlsson et al 2016; <doi:10.1109/eScience.2016.7870915>). The C++ code is adapted from the SSIM library <https://www.inf.usi.ch/carzaniga/ssim/>, allowing for event-oriented simulation. The code includes a SummaryReport class for reporting events and costs by age and other covariates. The C++ code is available as a static library for linking to other packages. A priority queue implementation is given in C++ together with an S3 closure and a reference class implementation. Finally, some tools are provided for cost-effectiveness analysis.
Maintained by Mark Clements. Last updated 7 months ago.
cppdiscrete-event-simulationhealth-economicsopenblascpp
3.9 match 37 stars 4.35 score 20 scriptsmechantrouquin
landsepi:Landscape Epidemiology and Evolution
A stochastic, spatially-explicit, demo-genetic model simulating the spread and evolution of a plant pathogen in a heterogeneous landscape to assess resistance deployment strategies. It is based on a spatial geometry for describing the landscape and allocation of different cultivars, a dispersal kernel for the dissemination of the pathogen, and a SEIR ('Susceptible-Exposed-Infectious-Removed') structure with a discrete time step. It provides a useful tool to assess the performance of a wide range of deployment options with respect to their epidemiological, evolutionary and economic outcomes. Loup Rimbaud, Julien Papaïx, Jean-François Rey, Luke G Barrett, Peter H Thrall (2018) <doi:10.1371/journal.pcbi.1006067>.
Maintained by Jean-François Rey. Last updated 6 months ago.
4.6 match 3.58 score 18 scriptsarturstat
TPmsm:Estimation of Transition Probabilities in Multistate Models
Estimation of transition probabilities for the illness-death model and/or the three-state progressive model.
Maintained by Artur Araujo. Last updated 1 years ago.
illness-death-modelkaplan-meiermonte-carlo-simulationmulti-state-modelsopenmp-parallelizationsurvival-analysistransition-probabilitiesopenblasopenmp
3.7 match 1 stars 4.52 score 22 scripts 1 dependentscran
crosstalkr:Analysis of Graph-Structured Data with a Focus on Protein-Protein Interaction Networks
Provides a general toolkit for drug target identification. We include functionality to reduce large graphs to subgraphs and prioritize nodes. In addition to being optimized for use with generic graphs, we also provide support for analyzing protein-protein interaction networks from online repositories. For more details on the core method, refer to Weaver et al. (2021) <https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008755>.
Maintained by Davis Weaver. Last updated 10 months ago.
6.1 match 2.70 scorejpmonteagudo28
despair:Motivational Quotes and Shakespearean Bard-bits for Personal Projects
Generate motivational quotes and Shakespearean word combinations (bard-bits) that a user can consider for their personal projects. Each of the package functions takes two arguments: cat, which defaults to any, and a numeric or character seed to ensure reproducible results.
Maintained by JP Monteagudo. Last updated 3 months ago.
4.2 match 3 stars 3.78 score 5 scriptslaerciojuniosilva
SeedCalc:Seed Germination and Seedling Growth Indexes
Functions to calculate seed germination and seedling emergence and growth indexes. The main indexes for germination and seedling emergence, considering the time for seeds to germinate, are: T10, T50 and T90, in Farooq et al. (2005) <10.1111/j.1744-7909.2005.00031.x>; and MGT, in Labouriau (1983). Considering the germination speed are: Germination Speed Index, in Maguire (1962), Mean Germination Rate, in Labouriau (1983); considering the homogeneity of germination are: Coefficient of Variation of the Germination Time, in Carvalho et al. (2005) <10.1590/S0100-84042005000300018>, and Variance of Germination, in Labouriau (1983); Uncertainty, in Labouriau and Valadares (1976) <ISSN:0001-3765>; and Synchrony, in Primack (1980). The main seedling indexes are Growth, in Sako (2001), Uniformity, in Sako (2001) and Castan et al. (2018) <doi:10.1590/1678-992x-2016-0401>; and Vigour, in Medeiros and Pereira (2018) <doi:10.1590/1983-40632018v4852340>.
Maintained by Laercio Junio da Silva. Last updated 6 years ago.
9.4 match 1.68 score 24 scriptsbioc
DelayedMatrixStats:Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects
A port of the 'matrixStats' API for use with DelayedMatrix objects from the 'DelayedArray' package. High-performing functions operating on rows and columns of DelayedMatrix objects, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized.
Maintained by Peter Hickey. Last updated 2 months ago.
infrastructuredatarepresentationsoftware
1.3 match 16 stars 11.86 score 211 scripts 112 dependentsarsilva87
biotools:Tools for Biometry and Applied Statistics in Agricultural Science
Tools designed to perform and evaluate cluster analysis (including Tocher's algorithm), discriminant analysis and path analysis (standard and under collinearity), as well as some useful miscellaneous tools for dealing with sample size and optimum plot size calculations. A test for seed sample heterogeneity is now available. Mantel's permutation test can be found in this package. A new approach for calculating its power is implemented. biotools also contains tests for genetic covariance components. Heuristic approaches for performing non-parametric spatial predictions of generic response variables and spatial gene diversity are implemented.
Maintained by Anderson Rodrigo da Silva. Last updated 3 years ago.
cluster-analysismultivariate-analysisstatisticstocher
2.2 match 2 stars 7.11 score 161 scripts 1 dependentsbioc
maSigPro:Significant Gene Expression Profile Differences in Time Course Gene Expression Data
maSigPro is a regression based approach to find genes for which there are significant gene expression profile differences between experimental groups in time course microarray and RNA-Seq experiments.
Maintained by Maria Jose Nueda. Last updated 5 months ago.
microarrayrna-seqdifferential expressiontimecourse
3.0 match 5.18 score 76 scriptstemp20250212
MultiTraits:Analyzing and Visualizing Multidimensional Plant Traits
Implements analytical methods for multidimensional plant traits, including Competitors-Stress tolerators-Ruderals strategy analysis using leaf traits, Leaf-Height-Seed strategy analysis, Niche Periodicity Table analysis, and Trait Network analysis. Provides functions for data analysis, visualization, and network metrics calculation. Methods are based on Grime (1974) <doi:10.1038/250026a0>, Pierce et al. (2017) <doi:10.1111/1365-2435.12882>, Westoby (1998) <doi:10.1023/A:1004327224729>, Yang et al. (2022) <doi:10.1016/j.foreco.2022.120540>, Winemiller et al. (2015) <doi:10.1111/ele.12462>, He et al. (2020) <doi:10.1016/j.tree.2020.06.003>.
Maintained by Anonymous Author. Last updated 26 days ago.
3.9 match 3.90 score 16 scriptsmlopez-ibanez
irace:Iterated Racing for Automatic Algorithm Configuration
Iterated race is an extension of the Iterated F-race method for the automatic configuration of optimization algorithms, that is, (offline) tuning their parameters by finding the most appropriate settings given a set of instances of an optimization problem. M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, T. Stützle, and M. Birattari (2016) <doi:10.1016/j.orp.2016.09.002>.
Maintained by Manuel López-Ibáñez. Last updated 1 months ago.
algorithm-configurationhyperparameter-tuningiraceoptimization-algorithms
1.6 match 63 stars 9.22 score 103 scripts 1 dependentsdynverse
dynwrap:Representing and Inferring Single-Cell Trajectories
Provides functionality to infer trajectories from single-cell data, represent them in a common format, and adapt them. Other biological information can also be added, such as cellular grouping, RNA velocity and annotation. Saelens et al. (2019) <doi:10.1038/s41587-019-0071-9>.
Maintained by Robrecht Cannoodt. Last updated 2 years ago.
2.0 match 16 stars 7.48 score 159 scripts 1 dependentschabert-liddell
robber:Using Block Model to Estimate the Robustness of Ecological Network
Implementation of a variety of methods to compute the robustness of ecological interaction networks with binary interactions as described in <doi:10.1002/env.2709>. In particular, using the Stochastic Block Model and its bipartite counterpart, the Latent Block Model to put a parametric model on the network, allows the comparison of the robustness of networks differing in species richness and number of interactions. It also deals with networks that are partially sampled and/or with missing values.
Maintained by Saint-Clair Chabert-Liddell. Last updated 1 years ago.
ecological-networkrobberrobustness
4.0 match 1 stars 3.70 score 4 scriptsnsj3
rioja:Analysis of Quaternary Science Data
Constrained clustering, transfer functions, and other methods for analysing Quaternary science data.
Maintained by Steve Juggins. Last updated 6 months ago.
2.0 match 10 stars 7.21 score 191 scripts 3 dependentsr-forge
Polychrome:Qualitative Palettes with Many Colors
Tools for creating, viewing, and assessing qualitative palettes with many (20-30 or more) colors. See Coombes and colleagues (2019) <doi:10.18637/jss.v090.c01>.
Maintained by Kevin R. Coombes. Last updated 1 months ago.
1.5 match 9.56 score 1.0k scripts 27 dependentsropensci
beastier:Call 'BEAST2'
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAST2' is a command-line tool. This package provides a way to call 'BEAST2' from an 'R' function call.
Maintained by Richèl J.C. Bilderbeek. Last updated 25 days ago.
bayesianbeastbeast2phylogenetic-inferencephylogeneticsopenjdk
1.8 match 11 stars 7.87 score 47 scripts 4 dependentsshaunpwilkinson
kmer:Fast K-Mer Counting and Clustering for Biological Sequence Analysis
Contains tools for rapidly computing distance matrices and clustering large sequence datasets using fast alignment-free k-mer counting and recursive k-means partitioning. See Vinga and Almeida (2003) <doi:10.1093/bioinformatics/btg005> for a review of k-mer counting methods and applications for biological sequence analysis.
Maintained by Shaun Wilkinson. Last updated 6 years ago.
1.7 match 27 stars 8.24 score 71 scripts 6 dependentsbioc
alabaster.matrix:Load and Save Artifacts from File
Save matrices, arrays and similar objects into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Maintained by Aaron Lun. Last updated 14 days ago.
dataimportdatarepresentationcpp
2.0 match 7.05 score 15 scripts 8 dependentscharlie86
spotifyr:R Wrapper for the 'Spotify' Web API
An R wrapper for pulling data from the 'Spotify' Web API <https://developer.spotify.com/documentation/web-api/> in bulk, or post items on a 'Spotify' user's playlist.
Maintained by Daniel Antal. Last updated 5 months ago.
music-information-retrievalspotify
1.7 match 374 stars 8.54 score 936 scriptsbioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), MONSTER (estimation of network transition states). In addition, YARN allows processing of gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 1 days ago.
networkinferencenetworkgeneregulationgeneexpressiontranscriptionmicroarraygraphandnetworkgene-regulatory-networktranscription-factors
1.8 match 105 stars 7.98 scoregertvv
hitandrun:"Hit and Run" and "Shake and Bake" for Sampling Uniformly from Convex Shapes
The "Hit and Run" Markov Chain Monte Carlo method for sampling uniformly from convex shapes defined by linear constraints, and the "Shake and Bake" method for sampling from the boundary of such shapes. Includes specialized functions for sampling normalized weights with arbitrary linear constraints. Tervonen, T., van Valkenhoef, G., Basturk, N., and Postmus, D. (2012) <doi:10.1016/j.ejor.2012.08.026>. van Valkenhoef, G., Tervonen, T., and Postmus, D. (2014) <doi:10.1016/j.ejor.2014.06.036>.
Maintained by Gert van Valkenhoef. Last updated 3 years ago.
2.0 match 16 stars 6.92 score 121 scripts 9 dependentsunuran
Runuran:R Interface to the 'UNU.RAN' Random Variate Generators
Interface to the 'UNU.RAN' library for Universal Non-Uniform RANdom variate generators. It thus allows building non-uniform random number generators from quite arbitrary distributions. In particular, it provides an algorithm for fast numerical inversion for distributions with a given density function. In addition, the package contains densities, distribution functions and quantiles from a couple of distributions.
Maintained by Josef Leydold. Last updated 6 months ago.
2.0 match 6.87 score 180 scripts 8 dependentsaidenloe
mazeGen:Elithorn Maze Generator
A maze generator that creates the Elithorn Maze (HTML file) and the functions to calculate the associated maze parameters (i.e. Difficulty and Ability).
Maintained by Bao Sheng Loe (Aiden). Last updated 6 years ago.
3.8 match 3.59 score 77 scriptss3alfisc
fwildclusterboot:Fast Wild Cluster Bootstrap Inference for Linear Models
Implementation of fast algorithms for wild cluster bootstrap inference developed in 'Roodman et al' (2019, 'STATA' Journal, <doi:10.1177/1536867X19830877>) and 'MacKinnon et al' (2022), which makes it feasible to quickly calculate bootstrap test statistics based on a large number of bootstrap draws even for large samples. Multiple bootstrap types as described in 'MacKinnon, Nielsen & Webb' (2022) are supported. Further, 'multiway' clustering, regression weights, bootstrap weights, fixed effects and 'subcluster' bootstrapping are supported. Further, both restricted ('WCR') and unrestricted ('WCU') bootstrap are supported. Methods are provided for a variety of fitted models, including 'lm()', 'feols()' (from package 'fixest') and 'felm()' (from package 'lfe'). Additionally implements a 'heteroskedasticity-robust' ('HC1') wild bootstrap. Last, the package provides an R binding to 'WildBootTests.jl', which provides additional speed gains and functionality, including the 'WRE' bootstrap for instrumental variable models (based on models of type 'ivreg()' from package 'ivreg') and hypotheses with q > 1.
Maintained by Alexander Fischer. Last updated 2 years ago.
clustered-standard-errorslinear-regression-modelswild-bootstrapwild-cluster-bootstrapopenblascppopenmp
2.0 match 24 stars 6.67 score 109 scripts 2 dependentsbayesiandemography
bage:Bayesian Estimation and Forecasting of Age-Specific Rates
Fast Bayesian estimation and forecasting of age-specific rates, probabilities, and means, based on 'Template Model Builder'.
Maintained by John Bryant. Last updated 2 days ago.
1.8 match 3 stars 7.41 score 39 scriptsjacobkap
caesar:Encrypts and Decrypts Strings
Encrypts and decrypts strings using either the Caesar cipher or a pseudorandom number generation (using set.seed()) method.
Maintained by Jacob Kaplan. Last updated 5 years ago.
2.2 match 1 stars 6.00 score 26 scriptsglsnow
TeachingDemos:Demonstrations for Teaching and Learning
Demonstration functions that can be used in a classroom to demonstrate statistical concepts, or on your own to better understand the concepts or the programming.
Maintained by Greg Snow. Last updated 1 years ago.
1.8 match 7.18 score 760 scripts 13 dependentssvmiller
stevemisc:Steve's Miscellaneous Functions
These are miscellaneous functions that I find useful for my research and teaching. The contents include themes for plots, functions for simulating quantities of interest from regression models, functions for simulating various forms of fake data for instructional/research purposes, and many more. All told, the functions provided here are broadly useful for data organization, data presentation, data recoding, and data simulation.
Maintained by Steve Miller. Last updated 8 days ago.
dplyrmixed-effects-modelsmultivariate-normal-distributiontidyverse
1.9 match 10 stars 6.85 score 392 scripts 2 dependentsludvigolsen
cvms:Cross-Validation for Model Selection
Cross-validate one or multiple regression and classification models and get relevant evaluation metrics in a tidy format. Validate the best model on a test set and compare it to a baseline evaluation. Alternatively, evaluate predictions from an external model. Currently supports regression and classification (binary and multiclass). Described in chp. 5 of Jeyaraman, B. P., Olsen, L. R., & Wambugu M. (2019, ISBN: 9781838550134).
Maintained by Ludvig Renbo Olsen. Last updated 13 days ago.
1.2 match 39 stars 10.31 score 492 scripts 5 dependentsbioc
microRNA:Data and functions for dealing with microRNAs
Different data resources for microRNAs and some functions for manipulating them.
Maintained by "Michael Lawrence". Last updated 1 months ago.
infrastructuregenomeannotationsequencematchingcpp
3.5 match 3.48 score 7 scriptsbioc
MCbiclust:Massive correlating biclusters for gene expression data and associated methods
Custom-made algorithm and associated methods for finding, visualising and analysing biclusters in large gene expression data sets. The algorithm is based on finding, for a supplied gene set of size n, the maximum-strength correlation matrix containing m samples from the data set.
Maintained by Robert Bentham. Last updated 5 months ago.
immunooncologyclusteringmicroarraystatisticalmethodsoftwarernaseqgeneexpression
3.0 match 4.00 score 2 scriptslhvanegasp
glmtoolbox:Set of Tools to Data Analysis using Generalized Linear Models
Set of tools for the statistical analysis of data using: (1) normal linear models; (2) generalized linear models; (3) negative binomial regression models as an alternative to the Poisson regression models under the presence of overdispersion; (4) beta-binomial and random-clumped binomial regression models as an alternative to the binomial regression models under the presence of overdispersion; (5) zero-inflated and zero-altered regression models to deal with zero-excess in count data; (6) generalized nonlinear models; (7) generalized estimating equations for cluster correlated data.
Maintained by Luis Hernando Vanegas. Last updated 8 months ago.
4.0 match 1 stars 3.00 score 149 scriptsjohn-d-fox
RcmdrMisc:R Commander Miscellaneous Functions
Various statistical, graphics, and data-management functions used by the Rcmdr package in the R Commander GUI for R.
Maintained by John Fox. Last updated 1 years ago.
1.7 match 1 stars 7.00 score 432 scripts 42 dependentsmingdeyu
dgpsi:Interface to 'dgpsi' for Deep and Linked Gaussian Process Emulations
Interface to the 'python' package 'dgpsi' for Gaussian process, deep Gaussian process, and linked deep Gaussian process emulations of computer models and networks using stochastic imputation (SI). The implementations follow Ming & Guillas (2021) <doi:10.1137/20M1323771> and Ming, Williamson, & Guillas (2023) <doi:10.1080/00401706.2022.2124311> and Ming & Williamson (2023) <doi:10.48550/arXiv.2306.01212>. To get started with the package, see <https://mingdeyu.github.io/dgpsi-R/>.
Maintained by Deyu Ming. Last updated 1 months ago.
deep-gaussian-processesemulationgaussian-processessurrogate-models
2.0 match 5.99 score 76 scriptsbioc
DelayedRandomArray:Delayed Arrays of Random Values
Implements a DelayedArray of random values where the realization of the sampled values is delayed until they are needed. Reproducible sampling within any subarray is achieved by chunking where each chunk is initialized with a different random seed and stream. The usual distributions in the stats package are supported, along with scalars, vectors and arrays for the parameters.
Maintained by Aaron Lun. Last updated 3 months ago.
2.3 match 5.26 score 6 scripts 1 dependentstmieno2
r.spatial.workshop.datasets:Collection of spatial datasets
This package provides spatial datasets in various formats. They are used for demonstrating spatial operations and map creation using R spatial packages (e.g., sf, terra, tmap).
Maintained by Taro Mieno. Last updated 6 months ago.
4.0 match 2.96 score 23 scriptsropensci
phylotaR:Automated Phylogenetic Sequence Cluster Identification from 'GenBank'
A pipeline for the identification, within taxonomic groups, of orthologous sequence clusters from 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> as the first step in a phylogenetic analysis. The pipeline depends on a local alignment search tool and is, therefore, not dependent on differences in gene naming conventions and naming errors.
Maintained by Shixiang Wang. Last updated 8 months ago.
blastngenbankpeer-reviewedphylogeneticssequence-alignment
2.0 match 23 stars 5.86 score 156 scriptsharrysouthworth
texmex:Statistical Modelling of Extreme Values
Statistical extreme value modelling of threshold excesses, maxima and multivariate extremes. Univariate models for threshold excesses and maxima are the Generalised Pareto, and Generalised Extreme Value model respectively. These models may be fitted by using maximum (optionally penalised-)likelihood, or Bayesian estimation, and both classes of models may be fitted with covariates in any/all model parameters. Model diagnostics support the fitting process. Graphical output for visualising fitted models and return level estimates is provided. For serially dependent sequences, the intervals declustering algorithm of Ferro and Segers (2003) <doi:10.1111/1467-9868.00401> is provided, with diagnostic support to aid selection of threshold and declustering horizon. Multivariate modelling is performed via the conditional approach of Heffernan and Tawn (2004) <doi:10.1111/j.1467-9868.2004.02050.x>, with graphical tools for threshold selection and to diagnose estimation convergence.
Maintained by Harry Southworth. Last updated 1 years ago.
1.8 match 7 stars 6.44 score 66 scripts 1 dependentsgabybudel
DBHC:Sequence Clustering with Discrete-Output HMMs
Provides an implementation of a mixture of hidden Markov models (HMMs) for discrete sequence data in the Discrete Bayesian HMM Clustering (DBHC) algorithm. The DBHC algorithm is an HMM Clustering algorithm that finds a mixture of discrete-output HMMs while using heuristics based on Bayesian Information Criterion (BIC) to search for the optimal number of HMM states and the optimal number of clusters.
Maintained by Gabriel Budel. Last updated 2 years ago.
4.3 match 1 stars 2.70 score 3 scriptssvmiller
codename:Generation of Code Names for Organizations, People, Projects, and Whatever Else
This creates code names that a user can consider for their organizations, their projects, themselves, people in their organizations or projects, or whatever else. The user can also supply a numeric seed (and even a character seed) for maximum reproducibility. Use is simple and the code names produced come in various types too, contingent on what the user may be desiring as a code name or nickname.
Maintained by Steve Miller. Last updated 2 years ago.
codenamereproducibilitywu-tang
2.5 match 17 stars 4.62 score 49 scriptscristianetaniguti
onemap:Construction of Genetic Maps in Experimental Crosses
Analysis of molecular marker data from model (backcrosses, F2 and recombinant inbred lines) and non-model systems (i.e. outcrossing species). For the latter, it allows statistical analysis by simultaneously estimating linkage and linkage phases (genetic map construction) according to Wu et al. (2002) <doi:10.1006/tpbi.2002.1577>. All analyses are based on multipoint approaches using hidden Markov models.
Maintained by Cristiane Taniguti. Last updated 2 months ago.
1.7 match 3 stars 6.58 score 183 scriptshanase
rlecuyer:R Interface to RNG with Multiple Streams
Provides an interface to the C implementation of the random number generator with multiple independent streams developed by L'Ecuyer et al (2002). The main purpose of this package is to enable the use of this random number generator in parallel R applications.
Maintained by Hana Sevcikova. Last updated 2 years ago.
2.0 match 2 stars 5.64 score 143 scripts 6 dependentsmiraisolutions
rTRNG:Advanced and Parallel Random Number Generation via 'TRNG'
Embeds sources and headers from Tina's Random Number Generator ('TRNG') C++ library. Exposes some functionality for easier access, testing and benchmarking into R. Provides examples of how to use parallel RNG with 'RcppParallel'. The methods and techniques behind 'TRNG' are illustrated in the package vignettes and examples. Full documentation is available in Bauke (2021) <https://github.com/rabauke/trng4/blob/v4.23.1/doc/trng.pdf>.
Maintained by Riccardo Porreca. Last updated 1 years ago.
2.0 match 19 stars 5.63 score 15 scriptssdanzige
ADAPTS:Automated Deconvolution Augmentation of Profiles for Tissue Specific Cells
Tools to construct (or add to) cell-type signature matrices using flow sorted or single cell samples and deconvolve bulk gene expression data. Useful for assessing the quality of single cell RNAseq experiments, estimating the accuracy of signature matrices, and determining cell-type spillover. Please cite: Danziger SA et al. (2019) ADAPTS: Automated Deconvolution Augmentation of Profiles for Tissue Specific cells <doi:10.1371/journal.pone.0224693>.
Maintained by Samuel A Danziger. Last updated 3 years ago.
1.7 match 2 stars 6.56 score 40 scripts 1 dependentsbioc
subSeq:Subsampling of high-throughput sequencing count data
Subsampling of high throughput sequencing count data for use in experiment design and analysis.
Maintained by Andrew J. Bass. Last updated 5 months ago.
immunooncologysequencingtranscriptionrnaseqgeneexpressiondifferentialexpression
1.8 match 19 stars 6.36 score 20 scriptsropensci
nlrx:Setup, Run and Analyze 'NetLogo' Model Simulations from 'R' via 'XML'
Setup, run and analyze 'NetLogo' (<https://ccl.northwestern.edu/netlogo/>) model simulations in 'R'. 'nlrx' experiments use a similar structure as 'NetLogos' Behavior Space experiments. However, 'nlrx' offers more flexibility and additional tools for running and analyzing complex simulation designs and sensitivity analyses. The user defines all information that is needed in an intuitive framework, using class objects. Experiments are submitted from 'R' to 'NetLogo' via 'XML' files that are dynamically written, based on specifications defined by the user. By nesting model calls in future environments, large simulation designs with many runs can be executed in parallel. This also enables simulating 'NetLogo' experiments on remote high performance computing machines. In order to use this package, 'Java' and 'NetLogo' (>= 5.3.1) need to be available on the executing system.
Maintained by Sebastian Hanss. Last updated 7 months ago.
agent-based-modelingindividual-based-modellingnetlogopeer-reviewed
1.3 match 78 stars 8.86 score 195 scriptsludvigolsen
xpectr:Generates Expectations for 'testthat' Unit Testing
Helps systematize and ease the process of building unit tests with the 'testthat' package by providing tools for generating expectations.
Maintained by Ludvig Renbo Olsen. Last updated 14 days ago.
1.8 match 37 stars 6.06 score 62 scriptsguyabel
migest:Methods for the Indirect Estimation of Bilateral Migration
Tools for estimating, measuring and working with migration data.
Maintained by Guy J. Abel. Last updated 1 months ago.
1.9 match 32 stars 5.80 score 86 scriptsbioc
scanMiR:scanMiR
A set of tools for working with miRNA affinity models (KdModels), efficiently scanning for miRNA binding sites, and predicting target repression. It supports scanning using miRNA seeds, full miRNA sequences (enabling 3' alignment) and KdModels, and includes the prediction of slicing and TDMD sites. Finally, it includes utility and plotting functions (e.g. for the visual representation of miRNA-target alignment).
Maintained by Pierre-Luc Germain. Last updated 5 months ago.
mirnasequencematchingalignment
1.8 match 5.89 score 52 scripts 1 dependentsprakashvs613
RankAggSIgFUR:Polynomially Bounded Rank Aggregation under Kemeny's Axiomatic Approach
Polynomially bounded algorithms to aggregate complete rankings under Kemeny's axiomatic framework. 'RankAggSIgFUR' (pronounced as rank-agg-cipher) contains two heuristic algorithms: FUR and SIgFUR. For details, please see Badal and Das (2018) <doi:10.1016/j.cor.2018.06.007>.
Maintained by Rakhi Singh. Last updated 2 years ago.
4.0 match 2.70 score 3 scriptsjohn-d-fox
norm:Analysis of Multivariate Normal Datasets with Missing Values
An integrated set of functions for the analysis of multivariate normal datasets with missing values, including implementation of the EM algorithm, data augmentation, and multiple imputation.
Maintained by John Fox. Last updated 2 years ago.
1.8 match 5.99 score 106 scripts 33 dependentsbrodieg
unitizer:Interactive R Unit Tests
Simplifies regression tests by comparing objects produced by test code with earlier versions of those same objects. If objects are unchanged the tests pass, otherwise execution stops with error details. If in interactive mode, tests can be reviewed through the provided interactive environment.
Maintained by Brodie Gaslam. Last updated 10 months ago.
1.5 match 39 stars 7.16 score 84 scriptsbioc
scMultiSim:Simulation of Multi-Modality Single Cell Data Guided By Gene Regulatory Networks and Cell-Cell Interactions
scMultiSim simulates paired single cell RNA-seq, single cell ATAC-seq and RNA velocity data, while incorporating mechanisms of gene regulatory networks, chromatin accessibility and cell-cell interactions. It allows users to tune various parameters controlling the amount of each biological factor, variation of gene-expression levels, the influence of chromatin accessibility on RNA sequence data, and so on. It can be used to benchmark various computational methods for single cell multi-omics data, and to assist in experimental design of wet-lab experiments.
Maintained by Hechen Li. Last updated 5 months ago.
singlecelltranscriptomicsgeneexpressionsequencingexperimentaldesign
1.5 match 23 stars 7.08 score 11 scriptsnflverse
nflseedR:Functions to Efficiently Simulate and Evaluate NFL Seasons
A set of functions to simulate National Football League seasons including the sophisticated tie-breaking procedures.
Maintained by Sebastian Carl. Last updated 8 days ago.
football-simulation nfl season-simulations
1.7 match 23 stars 6.32 score 34 scripts 1 dependents
crowding
iterors:Fast, Compact Iterators and Tools
A fresh take on iterators in R. Designed to be cross-compatible with the 'iterators' package, but using the 'nextOr' method will offer better performance as well as more compact code. With batteries included: includes a collection of iterator constructors and combinators ported and refined from the 'iterators', 'itertools', and 'itertools2' packages.
Maintained by Peter Meilstrup. Last updated 2 years ago.
1.8 match 4 stars 6.02 score 21 scripts
skranz
sktools:Helpful functions used in my courses
Several helpful functions that I use in my courses
Maintained by Sebastian Kranz. Last updated 4 years ago.
4.9 match 1 stars 2.15 score 28 scripts
canmod
macpan2:Fast and Flexible Compartmental Modelling
Fast and flexible compartmental modelling with Template Model Builder.
Maintained by Steve Walker. Last updated 2 hours ago.
compartmental-models epidemiology forecasting mixed-effects model-fitting optimization simulation simulation-modeling cpp
1.2 match 4 stars 8.89 score 246 scripts 1 dependents
liuyu-star
ODRF:Oblique Decision Random Forest for Classification and Regression
The oblique decision tree (ODT) uses linear combinations of predictors as partitioning variables in a decision tree. Oblique Decision Random Forest (ODRF) is an ensemble of multiple ODTs generated by feature bagging. Oblique Decision Boosting Tree (ODBT) applies feature bagging during the training process of ODT-based boosting trees to ensemble multiple boosting trees. All three methods can be used for classification and regression, and ODT and ODRF serve as supplements to the classical CART of Breiman (1984) <DOI:10.1201/9781315139470> and Random Forest of Breiman (2001) <DOI:10.1023/A:1010933404324> respectively.
Maintained by Yu Liu. Last updated 5 months ago.
2.0 match 7 stars 5.10 score 18 scripts
leoegidi
pivmet:Pivotal Methods for Bayesian Relabelling and k-Means Clustering
Collection of pivotal algorithms for: relabelling the MCMC chains in order to undo the label switching problem in Bayesian mixture models; fitting sparse finite mixtures; initializing the centers of the classical k-means algorithm in order to obtain a better clustering solution. For further details see Egidi, Pappadà, Pauli and Torelli (2018b) <ISBN:9788891910233>.
Maintained by Leonardo Egidi. Last updated 9 months ago.
1.7 match 5 stars 5.94 score 25 scripts
cran
flip:Multivariate Permutation Tests
Implements many univariate and multivariate permutation (and rotation) tests. Allowed tests: one- and two-sample t-tests, ANOVA, linear models, Chi-squared test, rank tests (i.e. Wilcoxon, Mann-Whitney, Kruskal-Wallis), sign test and McNemar test. Tests on linear models can also be performed in the presence of covariates (i.e. nuisance parameters). Both permutation and rotation methods are available to obtain the null distribution of the test statistics. It also implements methods for multiplicity control such as the Westfall & Young minP procedure, Closed Testing (Marcus, 1976) and k-FWER. Moreover, it allows testing for fixed effects in mixed effects models.
Maintained by Livio Finos. Last updated 7 years ago.
4.5 match 2.26 score 3 dependents
jimclarkatduke
mastif:Mast Inference and Forecasting
Analyzes production and dispersal of seeds dispersed from trees and recovered in seed traps. Motivated by long-term inventory plots where seed collections are used to infer seed production by each individual plant.
Maintained by James S. Clark. Last updated 12 months ago.
5.1 match 2.00 score
bioc
DaMiRseq:Data Mining for RNA-seq data: normalization, feature selection and classification
The DaMiRseq package offers a tidy pipeline of data mining procedures to identify transcriptional biomarkers and exploit them for both binary and multi-class classification purposes. The package accepts any kind of data presented as a table of raw counts and allows including both continuous and factorial variables that occur with the experimental setting. A series of functions enable the user to clean up the data by filtering genomic features and samples, to adjust data by identifying and removing the unwanted source of variation (i.e. batches and confounding factors) and to select the best predictors for modeling. Finally, a "stacking" ensemble learning technique is applied to build a robust classification model. Every step includes a checkpoint that the user may exploit to assess the effects of data management by looking at diagnostic plots, such as clustering and heatmaps, RLE boxplots, MDS or correlation plot.
Maintained by Mattia Chiesa. Last updated 5 months ago.
sequencing rnaseq classification immunooncology openjdk
1.9 match 5.32 score 7 scripts 1 dependents
reedacartwright
rbedrock:Analysis and Manipulation of Data from Minecraft Bedrock Edition
Implements an interface to Minecraft (Bedrock Edition) worlds. Supports the analysis and management of these worlds and game saves.
Maintained by Reed Cartwright. Last updated 21 days ago.
1.9 match 43 stars 5.24 score 3 scripts
djnavarro
jasmines:Generative Art
It doesn't do much, really.
Maintained by Danielle Navarro. Last updated 4 years ago.
2.0 match 112 stars 4.90 score 141 scripts
bioc
PICB:piRNA Cluster Builder
piRNAs (short for PIWI-interacting RNAs) and their PIWI protein partners play a key role in fertility and maintaining genome integrity by restricting mobile genetic elements (transposons) in germ cells. piRNAs originate from genomic regions known as piRNA clusters. The piRNA Cluster Builder (PICB) is a versatile toolkit designed to identify genomic regions with a high density of piRNAs. It constructs piRNA clusters through a stepwise integration of unique and multimapping piRNAs and offers wide-ranging parameter settings, supported by an optimization function that allows users to test different parameter combinations to tailor the analysis to their specific piRNA system. The output includes extensive metadata columns, enabling researchers to rank clusters and extract cluster characteristics.
Maintained by Franziska Ahrend. Last updated 1 months ago.
genetics genomeannotation sequencing functionalprediction coverage transcriptomics
1.8 match 5 stars 5.57 score
f-rousset
genepop:Population Genetic Data Analysis Using Genepop
Makes the Genepop software available in R. This software implements a mixture of traditional population genetic methods and some more focused developments: it computes exact tests for Hardy-Weinberg equilibrium, for population differentiation and for genotypic disequilibrium among pairs of loci; it computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc.; and it performs analyses of isolation by distance from pairwise comparisons of individuals or population samples.
Maintained by François Rousset. Last updated 2 years ago.
3.5 match 1 stars 2.78 score 54 scripts
project-gen3sis
gen3sis:General Engine for Eco-Evolutionary Simulations
Contains an engine for spatially-explicit eco-evolutionary mechanistic models with a modular implementation and several support functions. It allows exploring the consequences of ecological and macroevolutionary processes across realistic or theoretical spatio-temporal landscapes on biodiversity patterns as a general term. Reference: Oskar Hagen, Benjamin Flueck, Fabian Fopp, Juliano S. Cabral, Florian Hartig, Mikael Pontarp, Thiago F. Rangel, Loic Pellissier (2021) "gen3sis: A general engine for eco-evolutionary simulations of the processes that shape Earth's biodiversity" <doi:10.1371/journal.pbio.3001340>.
Maintained by Oskar Hagen. Last updated 1 years ago.
biodiversity ecology evolution mechanisticmodel modeling simulation cpp
1.3 match 29 stars 7.56 score 69 scripts
uniprjrc
fsdaR:Robust Data Analysis Through Monitoring and Dynamic Visualization
Provides an interface to the 'MATLAB' toolbox 'Flexible Statistical Data Analysis (FSDA)', which is a comprehensive and computationally efficient software package for robust statistics in regression, multivariate and categorical data analysis. The current R version implements tools for regression (forward search, S- and MM-estimation, least trimmed squares (LTS) and least median of squares (LMS)), for multivariate analysis (forward search, S- and MM-estimation), for cluster analysis and cluster-wise regression. The distinctive feature of our package is the possibility of monitoring the statistics of interest as a function of breakdown point, efficiency or subset size, depending on the estimator. This is accompanied by a rich set of graphical features, such as dynamic brushing and linking, particularly useful for exploratory data analysis.
Maintained by Valentin Todorov. Last updated 1 years ago.
1.8 match 5 stars 5.37 score 93 scripts
dkyleward
ipfr:List Balancing for Reweighting and Population Synthesis
Performs iterative proportional updating given a seed table and an arbitrary number of marginal distributions. This is commonly used in population synthesis, survey raking, matrix rebalancing, and other applications. For example, a household survey may be weighted to match the known distribution of households by size from the census. An origin/destination trip matrix might be balanced to match traffic counts. The approach used by this package is based on a paper from Arizona State University (Ye, Xin, et al. (2009) <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.537.723&rep=rep1&type=pdf>). Some enhancements have been made to their work including primary and secondary target balance/importance, general marginal agreement, and weight restriction.
Maintained by Kyle Ward. Last updated 5 years ago.
1.8 match 5 stars 5.06 score 23 scripts
joliencremers
bpnreg:Bayesian Projected Normal Regression Models for Circular Data
Fitting Bayesian multiple and mixed-effect regression models for circular data based on the projected normal distribution. Both continuous and categorical predictors can be included. Sampling from the posterior is performed via an MCMC algorithm. Posterior descriptives of all parameters, model fit statistics and Bayes factors for hypothesis tests for inequality constrained hypotheses are provided. See Cremers, Mulder & Klugkist (2018) <doi:10.1111/bmsp.12108> and Nuñez-Antonio & Guttiérez-Peña (2014) <doi:10.1016/j.csda.2012.07.025>.
Maintained by Jolien Cremers. Last updated 1 years ago.
1.5 match 14 stars 6.15 score 101 scripts
poissonconsulting
batchr:Batch Process Files
Processes multiple files with a user-supplied function. The key design principle is that only files which were last modified before the directory was configured are processed. A hidden file stores the configuration time, the function and other settings, while successfully processed files are automatically touched to update their modification date. As a result, batch processing can be stopped and restarted, and any files created (or modified or deleted) during processing are ignored.
Maintained by Joe Thorley. Last updated 2 months ago.
2.0 match 6 stars 4.56 score 8 scripts
cran
MLGdata:Datasets for Use with Salvan, Sartori and Pace (2020)
Contains the datasets for use with the book Salvan, Sartori and Pace (2020, ISBN:978-88-470-4002-1) "Modelli Lineari Generalizzati".
Maintained by Nicola Sartori. Last updated 4 years ago.
9.0 match 1.00 score
avi-kenny
SimEngine:A Modular Framework for Statistical Simulations in R
An open-source R package for structuring, maintaining, running, and debugging statistical simulations on both local and cluster-based computing environments. See full documentation at <https://avi-kenny.github.io/SimEngine/>.
Maintained by Avi Kenny. Last updated 26 days ago.
1.3 match 12 stars 7.18 score 50 scripts
jokergoo
sfcurve:2x2, 3x3 and Nxn Space-Filling Curves
Implementation of all possible forms of 2x2 and 3x3 space-filling curves, i.e., the generalized forms of the Hilbert curve <https://en.wikipedia.org/wiki/Hilbert_curve>, the Peano curve <https://en.wikipedia.org/wiki/Peano_curve> and the Peano curve in the meander type (Figure 5 in <https://eudml.org/doc/141086>). It can generate nxn curves expanded from any specific level-1 units. It also implements the H-curve and the three-dimensional Hilbert curve.
Maintained by Zuguang Gu. Last updated 6 months ago.
1.9 match 1 stars 4.66 score 13 scripts
cran
hglm.data:Data for the 'hglm' Package
This data-only package was created for distributing data used in the examples of the 'hglm' package.
Maintained by Xia Shen. Last updated 6 years ago.
3.5 match 2.48 score 1 dependents
benkeser
drtmle:Doubly-Robust Nonparametric Estimation and Inference
Targeted minimum loss-based estimators of counterfactual means and causal effects that are doubly-robust with respect both to consistency and asymptotic normality (Benkeser et al (2017), <doi:10.1093/biomet/asx053>; MJ van der Laan (2014), <doi:10.1515/ijb-2012-0038>).
Maintained by David Benkeser. Last updated 2 years ago.
causal-inference ensemble-learning iptw statistical-inference tmle
1.3 match 19 stars 6.89 score 90 scripts 1 dependents
cran
spuRs:Functions and Datasets for "Introduction to Scientific Programming and Simulation Using R"
Provides functions and datasets from Jones, O.D., R. Maillardet, and A.P. Robinson. 2014. An Introduction to Scientific Programming and Simulation, Using R. 2nd Ed. Chapman And Hall/CRC.
Maintained by Andrew Robinson. Last updated 7 years ago.
8.4 match 1 stars 1.00 score
chingchuan-chen
RcppBlaze:'Rcpp' Integration for the 'Blaze' High-Performance 'C++' Math Library
Blaze is an open-source, high-performance 'C++' math library for dense and sparse arithmetic. With its state-of-the-art Smart Expression Template implementation Blaze combines the elegance and ease of use of a domain-specific language with HPC-grade performance, making it one of the most intuitive and fastest 'C++' math libraries available. The 'RcppBlaze' package includes the header files from the 'Blaze' library, with some functionality that links to thread and system libraries disabled, which makes 'RcppBlaze' a header-only library. Therefore, users do not need to install 'Blaze'.
Maintained by Ching-Chuan Chen. Last updated 11 months ago.
1.7 match 16 stars 4.86 score 2 scripts 1 dependents
johnaponte
convdistr:Convolute Probabilistic Distributions
Convolute probabilistic distributions using the random generator function of each distribution. A new random number generator function is created that performs the mathematical operation on the individual random samples from the random generator function of each distribution. See the documentation for examples.
Maintained by Aponte John. Last updated 11 months ago.
1.9 match 2 stars 4.34 score 11 scripts
prabhanjan-tattar
ACSWR:A Companion Package for the Book "A Course in Statistics with R"
A book designed to meet the requirements of masters students. Tattar, P.N., Suresh, R., and Manjunath, B.G. "A Course in Statistics with R", J. Wiley, ISBN 978-1-119-15272-9.
Maintained by Prabhanjan Tattar. Last updated 10 years ago.
4.0 match 2.03 score 106 scripts
tanaylab
tglkmeans:Efficient Implementation of K-Means++ Algorithm
Efficient implementation of K-Means++ algorithm. For more information see (1) "kmeans++ the advantages of the k-means++ algorithm" by David Arthur and Sergei Vassilvitskii (2007), Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, pp. 1027-1035, and (2) "The Effectiveness of Lloyd-Type Methods for the k-Means Problem" by Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman and Chaitanya Swamy <doi:10.1145/2395116.2395117>.
Maintained by Aviezer Lifshitz. Last updated 2 months ago.
algorithms-implemented kmeans cpp
1.5 match 7 stars 5.35 score 16 scripts
rivolli
utiml:Utilities for Multi-Label Learning
Multi-label learning strategies and other procedures to support multi-label classification in R. The package provides a set of multi-label procedures such as sampling methods, transformation strategies, threshold functions, pre-processing techniques and evaluation metrics. A complete overview of the matter can be seen in Zhang, M. and Zhou, Z. (2014) <doi:10.1109/TKDE.2013.39> and Gibaja, E. and Ventura, S. (2015) A Tutorial on Multi-label Learning.
Maintained by Adriano Rivolli. Last updated 4 years ago.
1.3 match 28 stars 6.39 score 87 scripts
troyhernandez
tinyspotifyr:Tinyverse R Wrapper for the 'Spotify' Web API
An R wrapper for the 'Spotify' Web API <https://developer.spotify.com/web-api/>.
Maintained by Troy Hernandez. Last updated 1 years ago.
1.7 match 13 stars 4.81 score 5 scripts
cran
comsimitv:Flexible Framework for Simulating Community Assembly
Flexible framework for trait-based simulation of community assembly, in which components can be replaced by user-defined functions and which allows variation of traits within species.
Maintained by Zoltan Botta-Dukat. Last updated 4 years ago.
3.9 match 2.00 score 3 scripts
prabhanjan-tattar
gpk:100 Data Sets for Statistics Education
Collection of datasets as prepared by Profs. A.P. Gore, S.A. Paranjape, and M.B. Kulkarni of the Department of Statistics, Poona University, India. With their permission, the first letters of their names form the name of this package. The package has been built by me and made available for the benefit of R users. This collection requires a rich class of models and can be a very useful building block for a beginner.
Maintained by Prabhanjan Tattar. Last updated 12 years ago.
4.5 match 1.69 score 49 scripts
cran
mix:Estimation/Multiple Imputation for Mixed Categorical and Continuous Data
Estimation/multiple imputation programs for mixed categorical and continuous data.
Maintained by Brian Ripley. Last updated 3 months ago.
1.8 match 2 stars 4.21 score 5 dependents
bioc
rqubic:Qualitative biclustering algorithm for expression data analysis in R
This package implements the QUBIC algorithm introduced by Li et al. for qualitative biclustering of gene expression data.
Maintained by Jitao David Zhang. Last updated 5 months ago.
2.0 match 3.78 score 4 scripts 1 dependents
cran
GNAR:Methods for Fitting Network Time Series Models
Simulation of, and fitting models for, Generalised Network Autoregressive (GNAR) time series models which take account of network structure, potentially with exogenous variables. Such models are described in Knight et al. (2020) <doi:10.18637/jss.v096.i05> and Nason and Wei (2021) <doi:10.1111/rssa.12875>. Diagnostic tools for GNAR(X) models can be found in Nason et al. (2023) <doi:10.48550/arXiv.2312.00530>.
Maintained by Matt Nunes. Last updated 6 months ago.
5.8 match 2 stars 1.30 score
mikejohnson51
AHGestimation:An R package for Computing Robust, Mass Preserving Hydraulic Geometries and Rating Curves
Compute mass preserving 'At a station Hydraulic Geometry' (AHG) fits from river measurements.
Maintained by Mike Johnson. Last updated 3 months ago.
1.5 match 6 stars 5.02 score 10 scripts
jonathanlees
RSEIS:Seismic Time Series Analysis Tools
Multiple interactive codes to view and analyze seismic data, via spectrum analysis, wavelet transforms, particle motion, hodograms. Includes general time-series tools, plotting, filtering, interactive display.
Maintained by Jonathan M. Lees. Last updated 6 months ago.
1.8 match 3 stars 4.27 score 262 scripts 4 dependents