pryr:Tools for Computing on the Language

Useful tools to pry back the covers of R and understand the language at a deeper level.

Maintained by Hadley Wickham. Last updated 1 year ago.


ff:Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff not only has native C-support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays); it also provides an ffdf class not unlike data.frames and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating/removing 'temporary' ff files completely transparently to the user. On certain OS/Filesystem combinations, creating the ff files works without notable delay thanks to using sparse file allocation. 
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This allows working interactively with selections of large datasets and quickly modifying selection criteria. Further high-performance enhancements can be made available upon request.
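
To make the 2-bit 'quad' idea concrete, the following is a minimal Python sketch of packing an 'A','T','G','C' sequence at four symbols per byte. It is not the ff API: the helper names pack_quad and unpack_quad, the letter-to-code mapping and the bit order within each byte are all assumptions chosen for illustration.

```python
# Illustrative only: hypothetical helpers, not part of the ff package.
# Assumed mapping and bit layout: symbol i occupies bits 2*(i % 4) .. 2*(i % 4)+1
# of byte i // 4.
CODES = {"A": 0, "T": 1, "G": 2, "C": 3}
LETTERS = "ATGC"


def pack_quad(seq: str) -> bytes:
    """Pack a nucleotide string into 2 bits per symbol."""
    out = bytearray((len(seq) + 3) // 4)  # four symbols fit in one byte
    for i, ch in enumerate(seq):
        out[i // 4] |= CODES[ch] << (2 * (i % 4))
    return bytes(out)


def unpack_quad(buf: bytes, n: int) -> str:
    """Recover the first n symbols from a packed buffer."""
    return "".join(
        LETTERS[(buf[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(n)
    )


seq = "GATTACA"
packed = pack_quad(seq)
restored = unpack_quad(packed, len(seq))
```

The sequence length has to be kept alongside the packed bytes, because the final byte may be only partly used.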

Maintained by Jens Oehlschlägel. Last updated 2 months ago.


lmomco:L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

Extensive functions for L-moments (LMs) and probability-weighted moments (PWMs), distribution parameter estimation, LMs for distributions, LM ratio diagrams, multivariate L-comoments, and asymmetric (asy) trimmed LMs (TLMs). Maximum likelihood and maximum product spacings estimation are available. Right-tail and left-tail LM censoring by threshold or indicator variable are available. LMs of residual (resid) and reversed (rev) residual life are implemented along with 13 quantile operators for reliability analyses. Exact analytical bootstrap estimates of order statistics, LMs, and LM var-covars are available. Harri-Coble Tau34-squared Normality Test is available. Distributions with L, TL, and added (+) support for right-tail censoring (RC) encompass: Asy Exponential (Exp) Power [L], Asy Triangular [L], Cauchy [TL], Eta-Mu [L], Exp. [L], Gamma [L], Generalized (Gen) Exp Poisson [L], Gen Extreme Value [L], Gen Lambda [L, TL], Gen Logistic [L], Gen Normal [L], Gen Pareto [L+RC, TL], Govindarajulu [L], Gumbel [L], Kappa [L], Kappa-Mu [L], Kumaraswamy [L], Laplace [L], Linear Mean Residual Quantile Function [L], Normal [L], 3p log-Normal [L], Pearson Type III [L], Polynomial Density-Quantile 3 and 4 [L], Rayleigh [L], Rev-Gumbel [L+RC], Rice [L], Singh Maddala [L], Slash [TL], 3p Student t [L], Truncated Exponential [L], Wakeby [L], and Weibull [L].
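
As a rough illustration of how sample L-moments follow from probability-weighted moments, here is a small Python sketch using the standard unbiased PWM estimators b0, b1, b2. The function name sample_lmoments is hypothetical and this is not lmomco code; it only covers the first three L-moments and the L-skewness ratio.

```python
def sample_lmoments(data):
    """Return (l1, l2, l3, tau3) from unbiased PWM estimates.

    Hypothetical helper for illustration; needs at least three observations.
    """
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")

    # Unbiased PWM estimators; i is the 0-based rank of the order statistic.
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))

    # L-moments as linear combinations of the PWMs.
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    tau3 = l3 / l2 if l2 != 0 else float("nan")
    return l1, l2, l3, tau3


l1, l2, l3, tau3 = sample_lmoments([1, 2, 3, 4, 5])
```

For the symmetric sample above, the third L-moment and the L-skewness come out as zero, which is a quick sanity check on the coefficients.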

Maintained by William Asquith. Last updated 1 month ago.


sound:A Sound Interface for R

Basic functions for dealing with wav files and sound samples.

Maintained by Stefan Langenberg. Last updated 1 year ago.

RStata:A Bit of Glue Between R and Stata

A simple R -> Stata interface allowing the user to execute Stata commands (both inline and from a .do file) from R.

Maintained by Luca Braglia. Last updated 4 years ago.

marquee:Markdown Parser and Renderer for R Graphics

Provides the means to parse and render markdown text with grid along with facilities to define the styling of the text.

Maintained by Thomas Lin Pedersen. Last updated 2 months ago.


mmap:Map Pages of Memory

R interface to POSIX mmap and Windows' MapViewOfFile.
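
For readers unfamiliar with memory mapping, the idea can be sketched with Python's standard mmap module, which wraps the same operating-system facilities. This is not the R package's interface; the helper name patch_via_mmap and the temporary file are assumptions made for the example.

```python
import mmap
import os
import tempfile


def patch_via_mmap(path, offset, payload):
    """Overwrite bytes in a file through a memory-mapped view (hypothetical helper)."""
    with open(path, "r+b") as fh:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(fh.fileno(), 0) as view:
            view[offset:offset + len(payload)] = payload  # same-length slice assignment
            view.flush()  # push the modified pages back to the file


# Create a small scratch file, patch it through the mapping, read it back.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello world")
os.close(fd)

patch_via_mmap(path, 6, b"WORLD")
with open(path, "rb") as fh:
    result = fh.read()
os.remove(path)
```

Only the pages that are actually touched need to be brought into memory, which is what makes the approach attractive for files much larger than RAM.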

Maintained by Jeffrey A. Ryan. Last updated 1 year ago.

shinytest2:Testing for Shiny Applications

Automated unit testing of Shiny applications through a headless 'Chromium' browser.

Maintained by Barret Schloerke. Last updated 1 month ago.


infotheo:Information-Theoretic Measures

Implements various measures of information theory based on several entropy estimators.
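
As an example of the kind of measure involved, the sketch below computes the empirical (plug-in) entropy and mutual information of discrete samples in Python. The function names are hypothetical, the result is in nats, and no claim is made that this matches any particular estimator in the package.

```python
from collections import Counter
from math import log


def entropy(xs):
    """Plug-in entropy (in nats) of a sequence of discrete observations."""
    n = len(xs)
    return -sum((c / n) * log(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), all estimated from observed frequencies."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


x = [0, 0, 1, 1]
mi_identical = mutual_information(x, x)               # Y fully determined by X
mi_independent = mutual_information(x, [0, 1, 0, 1])  # empirically independent
```

The plug-in estimator is biased for small samples, which is one reason several alternative entropy estimators exist.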

Maintained by Patrick E. Meyer. Last updated 3 years ago.


voice:Voice Analysis, Speaker Recognition and Mood Inference via music theory

Voice analysis, speaker recognition and mood inference via music theory.

Maintained by Zabala Filipe J. Last updated 2 hours ago.

homomorpheR:Homomorphic Computations in R

Homomorphic computations in R for privacy-preserving applications. Currently only the Paillier Scheme is implemented.
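
The additive property of the Paillier scheme can be illustrated with a toy Python sketch. This is not the homomorpheR API: the function names are hypothetical, the generator is fixed to g = n + 1 as a simplifying assumption, and the tiny primes are for demonstration only and offer no security.

```python
import math
import secrets


def keygen(p, q):
    """Toy key generation from two primes (assumes g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt 0 <= m < n as c = (n+1)^m * r^n mod n^2 with random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Decrypt via m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


n, priv = keygen(293, 433)  # demonstration primes only
c_sum = add_encrypted(n, encrypt(n, 1234), encrypt(n, 4321))
total = decrypt(n, priv, c_sum)
```

The party performing add_encrypted never sees 1234 or 4321; only the key holder can recover the sum.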

Maintained by Balasubramanian Narasimhan. Last updated 3 years ago.

GLDEX:Fitting Single and Mixture of Generalised Lambda Distributions

The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data using the weighted and unweighted discretised approach based on the bin width of the histogram. The other is to provide a definitive fit to the data set using the maximum likelihood and quantile matching estimation. Other methods such as moment matching, starship method, L moment matching are also provided. Diagnostics on goodness of fit can be done via qqplots, KS-resample tests and comparing mean, variance, skewness and kurtosis of the data with the fitted distribution. References include the following: Karvanen and Nuutinen (2008) "Characterizing the generalized lambda distribution by L-moments" <doi:10.1016/j.csda.2007.06.021>, King and MacGillivray (1999) "A starship method for fitting the generalised lambda distributions" <doi:10.1111/1467-842X.00089>, Su (2005) "A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data" <doi:10.22237/jmasm/1130803560>, Su (2007) "Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions" <doi:10.1016/j.csda.2006.06.008>, Su (2007) "Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R" <doi:10.18637/jss.v021.i09>, Su (2009) "Confidence Intervals for Quantiles Using Generalized Lambda Distributions" <doi:10.1016/j.csda.2009.02.014>, Su (2010) "Chapter 14: Fitting GLDs and Mixture of GLDs to Data using Quantile Matching Method" <doi:10.1201/b10159>, Su (2010) "Chapter 15: Fitting GLD to data using GLDEX 1.0.4 in R" <doi:10.1201/b10159>, Su (2015) "Flexible Parametric Quantile Regression Model" <doi:10.1007/s11222-014-9457-1>, Su (2021) "Flexible parametric accelerated failure time model" <doi:10.1080/10543406.2021.1934854>.

Maintained by Steve Su. Last updated 2 years ago.

ctsem:Continuous Time Structural Equation Modelling

Hierarchical continuous (and discrete) time state space modelling, for linear and nonlinear systems measured by continuous variables, with limited support for binary data. The subject specific dynamic system is modelled as a stochastic differential equation (SDE) or difference equation; measurement models are typically multivariate normal factor models. Linear mixed effects SDEs estimated via maximum likelihood and optimization are the default. Nonlinearities (state dependent parameters) and random effects on all parameters are possible, using either max likelihood / max a posteriori optimization (with optional importance sampling) or Stan's Hamiltonian Monte Carlo sampling. See <> for details. Priors may be used. For the conceptual overview of the hierarchical Bayesian linear SDE approach, see <>. Exogenous inputs may also be included; for an overview of such possibilities see <>. Stan based functions are not available on 32 bit Windows systems at present. <> contains some tutorial blog posts.

Maintained by Charles Driver. Last updated 12 days ago.


c64vice:Interface to Binary Monitor in VICE C64 Emulator

Interface to the binary monitor in VICE - the c64 emulator.

Maintained by mikefc. Last updated 1 year ago.

rSHAPE:Simulated Haploid Asexual Population Evolution

In silico experimental evolution offers a cost- and time-effective means to test evolutionary hypotheses. Existing evolutionary simulation tools focus on simulations in a limited experimental framework, and tend to report on only the results presumed of interest by the tool's designer. The R-package for Simulated Haploid Asexual Population Evolution ('rSHAPE') addresses these concerns by implementing a robust simulation framework that outputs complete population demographic and genomic information for in silico evolving communities. Allowing more than 60 parameters to be specified, 'rSHAPE' simulates evolution across discrete time-steps for an evolving community of haploid asexual populations with binary state genomes. These settings are for the current state of 'rSHAPE' and future steps will be to increase the breadth of evolutionary conditions permitted. At present, most effort was placed into permitting varied growth models to be simulated (such as constant size, exponential growth, and logistic growth) as well as various fitness landscape models to reflect the evolutionary landscape (e.g.: Additive, House of Cards - Stuart Kauffman and Simon Levin (1987) <doi:10.1016/S0022-5193(87)80029-2>, NK - Stuart A. Kauffman and Edward D. Weinberger (1989) <doi:10.1016/S0022-5193(89)80019-0>, Rough Mount Fuji - Neidhart, Johannes and Szendro, Ivan G and Krug, Joachim (2014) <doi:10.1534/genetics.114.167668>). This package includes numerous functions, though users will only need defineSHAPE(), runSHAPE(), shapeExperiment() and summariseExperiment(). All other functions are called by these main functions and are likely only to be of interest for someone wishing to develop 'rSHAPE'. 
Simulation results will be stored in files which are exported to the directory referenced by the shape_workDir option (defaults to tempdir() but do change this by passing a folderpath argument for workDir when calling defineSHAPE() if you plan to make use of your results beyond your current session). 'rSHAPE' will generate numerous replicate simulations for your defined range of experimental parameters. The experiment will be built under the experimental working directory (i.e. the one referenced by the option shape_workDir set using defineSHAPE()), where individual replicate simulation results will be stored as well as processed results which I have made in an effort to facilitate analyses by automating collection and processing of the potentially thousands of files which will be created. On that note, 'rSHAPE' implements a robust and flexible framework with highly detailed output at the cost of computational efficiency and potentially requiring significant disk space (generally gigabytes but up to terabytes for very large simulation efforts). So, while 'rSHAPE' offers a single framework in which we can simulate evolution and directly compare the impacts of a wide range of parameters, it is not as quick to run as other in silico simulation tools which focus on a single scenario with limited output. There you have it, 'rSHAPE' offers you a less restrictive in silico evolutionary playground than other tools and I hope you enjoy testing your hypotheses.

Maintained by Jonathan Dench. Last updated 6 years ago.

FLSSS:Mining Rigs for Problems in the Subset Sum Family

Specialized solvers for combinatorial optimization problems in the Subset Sum family. The solvers differ from the mainstream in the options of (i) restricting subset size, (ii) bounding subset elements, (iii) mining real-value multisets with predefined subset sum errors, (iv) finding one or more subsets in limited time. A novel algorithm for mining the one-dimensional Subset Sum induced algorithms for the multi-Subset Sum and the multidimensional Subset Sum. The multi-threaded framework for the latter offers exact algorithms to the multidimensional Knapsack and the Generalized Assignment problems. Historical updates include (a) renewed implementation of the multi-Subset Sum, multidimensional Knapsack and Generalized Assignment solvers; (b) availability of bounding solution space in the multidimensional Subset Sum; (c) fundamental data structure and architectural changes for enhanced cache locality and better chance of SIMD vectorization; (d) option of mapping floating-point instance to compressed 64-bit integer instance with user-controlled precision loss, which could yield substantial speedup due to the dimension reduction and efficient compressed integer arithmetic via bit-manipulations; (e) distributed computing infrastructure for multidimensional subset sum; (f) arbitrary-precision zero-margin-of-error multidimensional Subset Sum accelerated by a simplified Bloom filter. The package contains a copy of xxHash from <>. Package vignette (<doi:10.48550/arXiv.1612.04484>) detailed a few historical updates. Functions prefixed with 'aux' (auxiliary) are independent implementations of published algorithms for solving optimization problems less relevant to Subset Sum.
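
To show what options (i), (iii) and (iv) mean in practice, here is a deliberately naive Python sketch that enumerates fixed-size subsets of real values whose sum lies within a tolerance of a target and stops after a requested number of hits. It is a brute-force stand-in, not the package's algorithm or API; the function name and parameters are hypothetical.

```python
from itertools import combinations


def subsets_with_sum(values, size, target, tol, limit=None):
    """Return index tuples of `size` elements whose sum is within `tol` of `target`.

    Exhaustive enumeration for illustration only; cost grows combinatorially.
    """
    found = []
    for idx in combinations(range(len(values)), size):
        if abs(sum(values[i] for i in idx) - target) <= tol:
            found.append(idx)
            if limit is not None and len(found) >= limit:
                break  # stop early once enough subsets are found
    return found


values = [1.5, 2.25, 3.0, 4.75, 6.5]
exact = subsets_with_sum(values, size=2, target=8.0, tol=0.01)
loose = subsets_with_sum(values, size=2, target=8.0, tol=0.25)
first_only = subsets_with_sum(values, size=2, target=8.0, tol=0.25, limit=1)
```

Widening the tolerance admits more subsets, and the limit argument mimics asking for only the first few solutions rather than all of them.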

Maintained by Charlie Wusuo Liu. Last updated 2 months ago.


