R-universe search: sort

r-lib

bit:Classes and Methods for Fast Memory-Efficient Boolean Selections

Provided are classes for boolean and skewed boolean vectors, fast boolean methods, fast unique and non-unique integer sorting, fast set operations on sorted and unsorted sets of integers, and foundations for ff (range index, compression, chunked processing).

Maintained by Michael Chirico. Last updated 7 days ago.

30.3 match 12 stars 15.15 score 131 scripts 3.2k dependents

rpolars

polars:Lightning-Fast 'DataFrame' Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Soren Welling. Last updated 5 days ago.

arrow polars rust

26.2 match 499 stars 12.01 score 1.0k scripts 2 dependents

bioc

S4Vectors:Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantic of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Maintained by Hervé Pagès. Last updated 1 months ago.

infrastructure datarepresentation bioconductor-package core-package

14.1 match 18 stars 16.05 score 1.0k scripts 1.9k dependents

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 19 days ago.

openblas cpp openmp

17.2 match 147 stars 12.54 score 1.2k scripts 166 dependents

insightsengineering

rtables:Reporting Tables

Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.

Maintained by Joe Zhu. Last updated 2 months ago.

pharmaceuticals tables

15.6 match 232 stars 13.65 score 238 scripts 17 dependents

revelle

psych:Procedures for Psychological, Psychometric, and Personality Research

A general purpose toolbox developed originally for personality, psychometric theory and experimental psychology. Functions are primarily for multivariate analysis and scale construction using factor analysis, principal component analysis, cluster analysis and reliability analysis, although others provide basic descriptive statistics. Item Response Theory is done using factor analysis of tetrachoric and polychoric correlations. Functions for analyzing data at multiple levels include within and between group statistics, including correlations and factor analysis. Validation and cross validation of scales developed using basic machine learning algorithms are provided, as are functions for simulating and testing particular item and test structures. Several functions serve as a useful front end for structural equation modeling. Graphical displays of path diagrams, including mediation models, factor analysis and structural equation models are created using basic graphics. Some of the functions are written to support a book on psychometric theory as well as publications in personality research. For more information, see the <https://personality-project.org/r/> web page.

Maintained by William Revelle. Last updated 3 months ago.

14.2 match 52 stars 13.94 score 29k scripts 317 dependents

truecluster

ff:Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff not only has native C-support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays). There is also a ffdf class not unlike data.frames and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows to work with 'permanent' files as well as creating/removing 'temporary' ff files completely transparent to the user. On certain OS/Filesystem combinations, creating the ff files works without notable delay thanks to using sparse file allocation. Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types get stored native and compact on binary flat files i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This allows to work interactively with selections of large datasets and quickly modify selection criteria. Further high-performance enhancements can be made available upon request.

Maintained by Jens Oehlschlägel. Last updated 2 months ago.

cpp

15.9 match 27 stars 12.01 score 764 scripts 71 dependents

r-lib

bit64:A S3 Class for Vectors of 64bit Integers

Package 'bit64' provides serializable S3 atomic 64bit (signed) integers. These are useful for handling database keys and exact counting in +-2^63. WARNING: do not use them as replacement for 32bit integers, integer64 are not supported for subscripting by R-core and they have different semantics when combined with double, e.g. integer64 + double => integer64. Class integer64 can be used in vectors, matrices, arrays and data.frames. Methods are available for coercion from and to logicals, integers, doubles, characters and factors as well as many elementwise and summary functions. Many fast algorithmic operations such as 'match' and 'order' support inter- active data exploration and manipulation and optionally leverage caching.

Maintained by Michael Chirico. Last updated 6 days ago.

11.6 match 35 stars 14.91 score 1.5k scripts 3.2k dependents

gpilgrim2670

SwimmeR:Data Import, Cleaning, and Conversions for Swimming Results

The goal of the 'SwimmeR' package is to provide means of acquiring, and then analyzing, data from swimming (and diving) competitions. To that end 'SwimmeR' allows results to be read in from .html sources, like 'Hy-Tek' real time results pages, '.pdf' files, 'ISL' results, 'Omega' results, and (on a development basis) '.hy3' files. Once read in, 'SwimmeR' can convert swimming times (performances) between the computationally useful format of seconds reported to the '100ths' place (e.g. 95.37), and the conventional reporting format (1:35.37) used in the swimming community. 'SwimmeR' can also score meets in a variety of formats with user defined point values, convert times between courses ('LCM', 'SCM', 'SCY') and draw single elimination brackets, as well as providing a suite of tools for working cleaning swimming data. This is a developmental package, not yet mature.

Maintained by Greg Pilgrim. Last updated 2 years ago.

32.6 match 4 stars 4.53 score 17 scripts

bioc

GenomicRanges:Representation and manipulation of genomic intervals

The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.

Maintained by Hervé Pagès. Last updated 4 months ago.

genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package

7.8 match 44 stars 17.75 score 13k scripts 1.3k dependents

herveabdi

DistatisR:DiSTATIS Three Way Metric Multidimensional Scaling

Implement DiSTATIS and CovSTATIS (three-way multidimensional scaling). DiSTATIS and CovSTATIS are used to analyze multiple distance/covariance matrices collected on the same set of observations. These methods are based on Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012) <doi:10.1002/wics.198>.

Maintained by Herve Abdi. Last updated 1 years ago.

3-way-mds distatis metric-multidimensional-scaling

29.9 match 4 stars 4.42 score 44 scripts

husson

SensoMineR:Sensory Data Analysis

Statistical Methods to Analyse Sensory Data. SensoMineR: A package for sensory data analysis. S. Le and F. Husson (2008).

Maintained by Francois Husson. Last updated 1 years ago.

22.8 match 5.72 score 108 scripts 3 dependents

eitsupi

neopolars:R Bindings for the 'polars' Rust Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Tatsuya Shima. Last updated 18 hours ago.

rust cargo

24.9 match 40 stars 4.87 score 1 scripts

gluc

data.tree:General Purpose Hierarchical Data Structure

Create tree structures from hierarchical data, and traverse the tree in various orders. Aggregate, cumulate, print, plot, convert to and from data.frame and more. Useful for decision trees, machine learning, finance, conversion from and to JSON, and many other applications.

Maintained by Christoph Glur. Last updated 5 months ago.

9.0 match 209 stars 12.84 score 1.1k scripts 88 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 2 days ago.

fortran cpp

6.8 match 87 stars 16.70 score 7.7k scripts 99 dependents

rdatatable

data.table:Extension of `data.frame`

Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.

Maintained by Tyson Barrett. Last updated 1 days ago.

4.4 match 3.7k stars 23.53 score 230k scripts 4.6k dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 1 months ago.

infrastructure bioconductor-package core-package

7.0 match 12 stars 14.22 score 612 scripts 2.2k dependents

r-forge

Matrix:Sparse and Dense Matrix Classes and Methods

A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.

Maintained by Martin Maechler. Last updated 8 days ago.

openblas

5.6 match 1 stars 17.23 score 33k scripts 12k dependents

bioc

SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data

SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned bam file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kind of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, plot SMF information at a given genomic location.

Maintained by Guido Barzaghi. Last updated 29 days ago.

dnamethylation coverage nucleosomepositioning datarepresentation epigenetics methylseq qualitycontrol sequencing

14.7 match 2 stars 6.43 score 27 scripts

rspatial

terra:Spatial Data Analysis

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

Maintained by Robert J. Hijmans. Last updated 10 hours ago.

geospatial raster spatial vector onetbb proj gdal geos cpp

5.3 match 559 stars 17.63 score 17k scripts 851 dependents

larmarange

labelled:Manipulating Labelled Data

Work with labelled data imported from 'SPSS' or 'Stata' with 'haven' or 'foreign'. This package provides useful functions to deal with "haven_labelled" and "haven_labelled_spss" classes introduced by 'haven' package.

Maintained by Joseph Larmarange. Last updated 28 days ago.

haven labels metadata sas spss stata

6.2 match 76 stars 15.02 score 2.4k scripts 96 dependents

talgalili

dendextend:Extending 'dendrogram' Functionality in R

Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.

Maintained by Tal Galili. Last updated 2 months ago.

5.4 match 154 stars 17.02 score 6.0k scripts 164 dependents

gagolews

stringi:Fast and Portable Character String Processing Facilities

A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).

Maintained by Marek Gagolewski. Last updated 1 months ago.

icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode cpp

5.0 match 309 stars 18.31 score 10k scripts 8.6k dependents

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 8 days ago.

autograd deep-learning torch cpp

5.3 match 520 stars 16.52 score 1.4k scripts 38 dependents

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 7 hours ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

4.1 match 583 stars 21.10 score 31k scripts 1.9k dependents

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 years ago.

7.0 match 29 stars 12.34 score 6.6k scripts 931 dependents

jchiquet

aricode:Efficient Computations of Standard Clustering Comparison Measures

Implements an efficient O(n) algorithm based on bucket-sorting for fast computation of standard clustering comparison measures. Available measures include adjusted Rand index (ARI), normalized information distance (NID), normalized mutual information (NMI), adjusted mutual information (AMI), normalized variation information (NVI) and entropy, as described in Vinh et al (2009) <doi:10.1145/1553374.1553511>. Include AMI (Adjusted Mutual Information) since version 0.1.2, a modified version of ARI (MARI), as described in Sundqvist et al. <doi:10.1007/s00180-022-01230-7> and simple Chi-square distance since version 1.0.0.

Maintained by Julien Chiquet. Last updated 1 years ago.

bucket-sort clustering clustering-comparison-measures cpp

10.3 match 25 stars 8.15 score 542 scripts 14 dependents

winvector

WVPlots:Common Plots for Analysis

Select data analysis plots, under a standardized calling interface implemented on top of 'ggplot2' and 'plotly'. Plots of interest include: 'ROC', gain curve, scatter plot with marginal distributions, conditioned scatter plot with marginal densities, box and stem with matching theoretical distribution, and density with matching theoretical distribution.

Maintained by John Mount. Last updated 11 months ago.

10.4 match 85 stars 8.00 score 280 scripts

bioc

microbiome:Microbiome Analytics

Utilities for microbiome analysis.

Maintained by Leo Lahti. Last updated 5 months ago.

metagenomics microbiome sequencing systemsbiology hitchip hitchip-atlas human-microbiome microbiology microbiome-analysis phyloseq population-study

6.4 match 290 stars 12.50 score 2.0k scripts 5 dependents

eddelbuettel

nanotime:Nanosecond-Resolution Time Support for R

Full 64-bit resolution date and time functionality with nanosecond granularity is provided, with easy transition to and from the standard 'POSIXct' type. Three additional classes offer interval, period and duration functionality for nanosecond-resolution timestamps.

Maintained by Dirk Eddelbuettel. Last updated 1 months ago.

datetime datetimes nanosecond-resolution nanoseconds cpp

7.2 match 53 stars 10.91 score 134 scripts 17 dependents

mhahsler

arules:Mining Association Rules and Frequent Itemsets

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.

Maintained by Michael Hahsler. Last updated 1 months ago.

arules association-rules frequent-itemsets

5.6 match 194 stars 13.99 score 3.3k scripts 28 dependents

dbosak01

procs:Recreates Some 'SAS®' Procedures in 'R'

Contains functions to simulate the most commonly used 'SAS®' procedures. Specifically, the package aims to simulate the functionality of 'proc freq', 'proc means', 'proc ttest', 'proc reg', 'proc transpose', 'proc sort', and 'proc print'. The simulation will include recreating all statistics with the highest fidelity possible.

Maintained by David Bosak. Last updated 10 months ago.

9.8 match 6 stars 7.57 score 37 scripts 2 dependents

matthewheun

matsbyname:An Implementation of Matrix Mathematics that Respects Row and Column Names

An implementation of matrix mathematics wherein operations are performed "by name."

Maintained by Matthew Heun. Last updated 11 days ago.

11.0 match 2 stars 6.65 score 150 scripts 1 dependents

jolars

SLOPE:Sorted L1 Penalized Estimation

Efficient implementations for Sorted L-One Penalized Estimation (SLOPE): generalized linear models regularized with the sorted L1-norm (Bogdan et al. 2015). Supported models include ordinary least-squares regression, binomial regression, multinomial regression, and Poisson regression. Both dense and sparse predictor matrices are supported. In addition, the package features predictor screening rules that enable fast and efficient solutions to high-dimensional problems.

Maintained by Johan Larsson. Last updated 4 days ago.

generalized-linear-models slope sparse-regression cpp openmp

7.6 match 17 stars 9.62 score 75 scripts 3 dependents

snoweye

pbdMPI:R Interface to MPI for HPC Clusters (Programming with Big Data Project)

A simplified, efficient, interface to MPI for HPC clusters. It is a derivation and rethinking of the Rmpi package. pbdMPI embraces the prevalent parallel programming style on HPC clusters. Beyond the interface, a collection of functions for global work with distributed data and resource-independent RNG reproducibility is included. It is based on S4 classes and methods.

Maintained by Wei-Chen Chen. Last updated 6 months ago.

openmpi

10.0 match 2 stars 7.11 score 179 scripts 3 dependents

briencj

asremlPlus:Augments 'ASReml-R' in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences

Assists in automating the selection of terms to include in mixed models when 'asreml' is used to fit the models. Procedures are available for choosing models that conform to the hierarchy or marginality principle, for fitting and choosing between two-dimensional spatial models using correlation, natural cubic smoothing spline and P-spline models. A history of the fitting of a sequence of models is kept in a data frame. Also used to compute functions and contrasts of, to investigate differences between and to plot predictions obtained using any model fitting function. The content falls into the following natural groupings: (i) Data, (ii) Model modification functions, (iii) Model selection and description functions, (iv) Model diagnostics and simulation functions, (v) Prediction production and presentation functions, (vi) Response transformation functions, (vii) Object manipulation functions, and (viii) Miscellaneous functions (for further details see 'asremlPlus-package' in help). The 'asreml' package provides a computationally efficient algorithm for fitting a wide range of linear mixed models using Residual Maximum Likelihood. It is a commercial package and a license for it can be purchased from 'VSNi' <https://vsni.co.uk/> as 'asreml-R', who will supply a zip file for local installation/updating (see <https://asreml.kb.vsni.co.uk/>). It is not needed for functions that are methods for 'alldiffs' and 'data.frame' objects. The package 'asremPlus' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 29 days ago.

asreml mixed-models

7.6 match 19 stars 9.34 score 200 scripts

melff

memisc:Management of Survey Data and Presentation of Analysis Results

An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package allows to produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.

Maintained by Martin Elff. Last updated 13 days ago.

survey-data

5.7 match 46 stars 12.34 score 1.2k scripts 13 dependents

kkholst

mets:Analysis of Multivariate Event Times

Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.

Maintained by Klaus K. Holst. Last updated 21 hours ago.

multivariate-time-to-event survival-analysis time-to-event fortran openblas cpp

5.2 match 14 stars 13.47 score 236 scripts 42 dependents

pboutros

bedr:Genomic Region Processing using Tools Such as 'BEDTools', 'BEDOPS' and 'Tabix'

Genomic regions processing using open-source command line tools such as 'BEDTools', 'BEDOPS' and 'Tabix'. These tools offer scalable and efficient utilities to perform genome arithmetic e.g indexing, formatting and merging. bedr API enhances access to these tools as well as offers additional utilities for genomic regions processing.

Maintained by Paul C. Boutros. Last updated 6 years ago.

13.9 match 4.98 score 264 scripts 2 dependents

r-lib

vctrs:Vector Helpers

Defines new notions of prototype and size that are used to provide tools for consistent and well-founded type-coercion and size-recycling, and are in turn connected to ideas of type- and size-stability useful for analysing function interfaces.

Maintained by Davis Vaughan. Last updated 5 months ago.

s3-vectors

3.6 match 290 stars 18.97 score 1.1k scripts 13k dependents

forkonlp

N2H4:Handling Methods for Naver News Text Crawling

Provides some functions to get Korean text sample from news articles in Naver which is popular news portal service <https://news.naver.com/> in Korea.

Maintained by Chanyub Park. Last updated 1 years ago.

crawler crawling getcomments hacktoberfest hacktoberfest2021 korean naver news sort

11.0 match 216 stars 6.11 score 20 scripts

tidyverse

stringr:Simple, Consistent Wrappers for Common String Operations

A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.

Maintained by Hadley Wickham. Last updated 7 months ago.

regular-expression strings

3.0 match 628 stars 21.99 score 164k scripts 8.3k dependents

bioc

PhyloProfile:PhyloProfile

PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.

Maintained by Vinh Tran. Last updated 8 days ago.

software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny

8.4 match 33 stars 7.77 score 10 scripts

renkun-ken

rlist:A Toolbox for Non-Tabular Data Manipulation

Provides a set of functions for data manipulation with list objects, including mapping, filtering, grouping, sorting, updating, searching, and other useful functions. Most functions are designed to be pipeline friendly so that data processing with lists can be chained.

Maintained by Kun Ren. Last updated 2 years ago.

4.6 match 206 stars 13.73 score 2.2k scripts 123 dependents

eagerai

fastai:Interface to 'fastai'

The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research in to deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.

Maintained by Turgut Abdullayev. Last updated 11 months ago.

audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision

6.6 match 118 stars 9.40 score 76 scripts

pachadotdev

cpp11armadillo:An 'Armadillo' Interface

Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.

Maintained by Mauricio Vargas Sepulveda. Last updated 27 days ago.

armadillo cpp cpp11 hacktoberfest linear-algebra

6.7 match 9 stars 9.14 score 1 scripts 16 dependents

quanteda

quanteda:Quantitative Analysis of Textual Data

A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.

Maintained by Kenneth Benoit. Last updated 2 months ago.

corpus natural-language-processing quanteda text-analytics onetbb cpp

3.6 match 851 stars 16.68 score 5.4k scripts 51 dependents

paterijk

MCDA:Support for the Multicriteria Decision Aiding Process

Support for the analyst in a Multicriteria Decision Aiding (MCDA) process with algorithms, preference elicitation and data visualisation functions. Sébastien Bigaret, Richard Hodgett, Patrick Meyer, Tatyana Mironova, Alexandru Olteanu (2017) Supporting the multi-criteria decision aiding process : R and the MCDA package, Euro Journal On Decision Processes, Volume 5, Issue 1 - 4, pages 169 - 194 <doi:10.1007/s40070-017-0064-1>.

Maintained by Patrick Meyer. Last updated 2 years ago.

9.8 match 30 stars 6.04 score 182 scripts

bioc

Biostrings:Efficient manipulation of biological strings

Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.

Maintained by Hervé Pagès. Last updated 25 days ago.

sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package

3.3 match 61 stars 17.83 score 8.6k scripts 1.2k dependents

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 7 hours ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

3.8 match 959 stars 15.20 score 4.0k scripts 21 dependents

bioc

SummarizedExperiment:A container (S4 class) for matrix-like assays

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Maintained by Hervé Pagès. Last updated 5 months ago.

genetics infrastructure sequencing annotation coverage genomeannotation bioconductor-package core-package

3.3 match 34 stars 16.85 score 8.6k scripts 1.2k dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 years ago.

6.8 match 3 stars 8.20 score 7.8k scripts 11 dependents

ropensci

beautier:'BEAUti' from R

'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAUti 2' (which is part of 'BEAST2') is a GUI tool that allows users to specify the many possible setups and generates the XML file 'BEAST2' needs to run. This package provides a way to create 'BEAST2' input files without active user input, but using R function calls instead.

Maintained by Richèl J.C. Bilderbeek. Last updated 24 days ago.

bayesian beast beast2 beauti phylogenetic-inference phylogenetics

6.3 match 13 stars 8.76 score 198 scripts 5 dependents

bioc

uSORT:uSORT: A self-refining ordering pipeline for gene selection

This package is designed to uncover the intrinsic cell progression path from single-cell RNA-seq data. It incorporates data pre-processing, preliminary PCA gene selection, preliminary cell ordering, feature selection, refined cell ordering, and post-analysis interpretation and visualization.

Maintained by Hao Chen. Last updated 5 months ago.

immunooncology rnaseq gui cellbiology dnaseq

16.2 match 3.30 score

archaeostat

ArchaeoPhases:Post-Processing of Markov Chain Monte Carlo Simulations for Chronological Modelling

Statistical analysis of archaeological dates and groups of dates. This package allows to post-process Markov Chain Monte Carlo (MCMC) simulations from 'ChronoModel' <https://chronomodel.com/>, 'Oxcal' <https://c14.arch.ox.ac.uk/oxcal.html> or 'BCal' <https://bcal.shef.ac.uk/>. It provides functions for the study of rhythms of the long term from the posterior distribution of a series of dates (tempo and activity plot). It also allows the estimation and visualization of time ranges from the posterior distribution of groups of dates (e.g. duration, transition and hiatus between successive phases) as described in Philippe and Vibet (2020) <doi:10.18637/jss.v093.c01>.

Maintained by Anne Philippe. Last updated 11 months ago.

archaeology bayesian-statistics geochronology markov-chain radiocarbon-dates

7.6 match 10 stars 6.90 score 66 scripts

gdemin

expss:Tables, Labels and Some Useful Functions from Spreadsheets and 'SPSS' Statistics

Package computes and displays tables with support for 'SPSS'-style labels, multiple and nested banners, weights, multiple-response variables and significance testing. There are facilities for nice output of tables in 'knitr', 'Shiny', '*.xlsx' files, R and 'Jupyter' notebooks. Methods for labelled variables add value labels support to base R functions and to some functions from other packages. Additionally, the package brings popular data transformation functions from 'SPSS' Statistics and 'Excel': 'RECODE', 'COUNT', 'COUNTIF', 'VLOOKUP' and etc. These functions are very useful for data processing in marketing research surveys. Package intended to help people to move data processing from 'Excel' and 'SPSS' to R.

Maintained by Gregory Demin. Last updated 11 months ago.

excel labels labels-support msexcel pivot-tables recode spss spss-statistics tables variable-labels vlookup

4.7 match 84 stars 11.00 score 1.8k scripts 4 dependents

turtletopia

versionsort:Sort and Order Version Codes

A lightweight package for sorting version codes in various forms. No strong dependencies guaranteed.

Maintained by Laura Bakala. Last updated 3 years ago.

natural-sort version-code

13.2 match 5 stars 3.88 score 4 scripts 1 dependents

matloff

partools:Tools for the 'Parallel' Package

Miscellaneous utilities for parallelizing large computations. Alternative to MapReduce. File splitting and distributed operations such as sort and aggregate. "Software Alchemy" method for parallelizing most statistical methods, presented in N. Matloff, Parallel Computation for Data Science, Chapman and Hall, 2015. Includes a debugging aid.

Maintained by Norm Matloff. Last updated 2 years ago.

6.6 match 40 stars 7.51 score 30 scripts 3 dependents

cdueben

cppcontainers:'C++' Standard Template Library Containers

Use 'C++' Standard Template Library containers interactively in R. Includes sets, unordered sets, multisets, unordered multisets, maps, unordered maps, multimaps, unordered multimaps, stacks, queues, priority queues, vectors, deques, forward lists, and lists.

Maintained by Christian Düben. Last updated 2 months ago.

cpp

10.6 match 4.70 score 1 scripts

wraff

wrMisc:Analyze Experimental High-Throughput (Omics) Data

The efficient treatment and convenient analysis of experimental high-throughput (omics) data gets facilitated through this collection of diverse functions. Several functions address advanced object-conversions, like manipulating lists of lists or lists of arrays, reorganizing lists to arrays or into separate vectors, merging of multiple entries, etc. Another set of functions provides speed-optimized calculation of standard deviation (sd), coefficient of variance (CV) or standard error of the mean (SEM) for data in matrixes or means per line with respect to additional grouping (eg n groups of replicates). A group of functions facilitate dealing with non-redundant information, by indexing unique, adding counters to redundant or eliminating lines with respect redundancy in a given reference-column, etc. Help is provided to identify very closely matching numeric values to generate (partial) distance matrixes for very big data in a memory efficient manner or to reduce the complexity of large data-sets by combining very close values. Other functions help aligning a matrix or data.frame to a reference using partial matching or to mine an experimental setup to extract patterns of replicate samples. Many times large experimental datasets need some additional filtering, adequate functions are provided. Convenient data normalization is supported in various different modes, parameter estimation via permutations or boot-strap as well as flexible testing of multiple pair-wise combinations using the framework of 'limma' is provided, too. Batch reading (or writing) of sets of files and combining data to arrays is supported, too.

Maintained by Wolfgang Raffelsberger. Last updated 7 months ago.

11.1 match 4.44 score 33 scripts 4 dependents

berndbischl

BBmisc:Miscellaneous Helper Functions for B. Bischl

Miscellaneous helper functions for and from B. Bischl and some other guys, mainly for package development.

Maintained by Bernd Bischl. Last updated 2 years ago.

4.6 match 20 stars 10.59 score 980 scripts 69 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 1 days ago.

3.6 match 845 stars 13.60 score 264 scripts 2 dependents

a-dickerson

portsort:Factor-Based Portfolio Sorts

Designed to aid both academic researchers and asset managers in conducting factor based portfolio sorts. Provides functionality to sort assets into portfolios for up to three factors via a conditional or unconditional sorting procedure.

Maintained by Alex Dickerson. Last updated 6 years ago.

20.5 match 2.32 score 21 scripts

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

3.8 match 83 stars 12.50 score 186 scripts 9 dependents

simongrund1

mitml:Tools for Multiple Imputation in Multilevel Modeling

Provides tools for multiple imputation of missing data in multilevel modeling. Includes a user-friendly interface to the packages 'pan' and 'jomo', and several functions for visualization, data management and the analysis of multiply imputed data sets.

Maintained by Simon Grund. Last updated 1 years ago.

imputation missing-data mixed-effects multilevel-data multilevel-models

3.8 match 29 stars 12.36 score 246 scripts 153 dependents

kjhealy

gssrdoc:Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains the a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see \url{http://gss.norc.org}.

Maintained by Kieran Healy. Last updated 11 months ago.

20.4 match 2.28 score 38 scripts

bioc

ShortRead:FASTQ input and manipulation

This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

dataimport sequencing qualitycontrol bioconductor-package core-package zlib cpp

3.8 match 8 stars 12.08 score 1.8k scripts 49 dependents

bioc

GenomicAlignments:Representation and manipulation of short genomic alignments

Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure dataimport genetics sequencing rnaseq snp coverage alignment immunooncology bioconductor-package core-package

3.3 match 10 stars 13.61 score 3.1k scripts 529 dependents

agisga

grpSLOPE:Group Sorted L1 Penalized Estimation

Group SLOPE (Group Sorted L1 Penalized Estimation) is a penalized linear regression method that is used for adaptive selection of groups of significant predictors in a high-dimensional linear model. The Group SLOPE method can control the (group) false discovery rate at a user-specified level (i.e., control the expected proportion of irrelevant among all selected groups of predictors). For additional information about the implemented methods please see Brzyski, Gossmann, Su, Bogdan (2018) <doi:10.1080/01621459.2017.1411269>.

Maintained by Alexej Gossmann. Last updated 2 years ago.

cpp

9.1 match 4 stars 4.93 score 43 scripts

doi-usgs

hydroloom:Utilities to Weave Hydrologic Fabrics

A collection of utilities that support creation of network attributes for hydrologic networks. Methods and algorithms implemented are documented in Moore et al. (2019) <doi:10.3133/ofr20191096>), Cormen and Leiserson (2022) <ISBN:9780262046305> and Verdin and Verdin (1999) <doi:10.1016/S0022-1694(99)00011-6>.

Maintained by David Blodgett. Last updated 2 months ago.

5.3 match 28 stars 8.53 score 19 scripts 6 dependents

hdvinod

generalCorr:Generalized Correlations, Causal Paths and Portfolio Selection

Function gmcmtx0() computes a more reliable (general) correlation matrix. Since causal paths from data are important for all sciences, the package provides many sophisticated functions. causeSummBlk() and causeSum2Blk() give easy-to-interpret causal paths. Let Z denote control variables and compare two flipped kernel regressions: X=f(Y, Z)+e1 and Y=g(X, Z)+e2. Our criterion Cr1 says that if |e1*Y|>|e2*X| then variation in X is more "exogenous or independent" than in Y, and the causal path is X to Y. Criterion Cr2 requires |e2|<|e1|. These inequalities between many absolute values are quantified by four orders of stochastic dominance. Our third criterion Cr3, for the causal path X to Y, requires new generalized partial correlations to satisfy |r*(x|y,z)|< |r*(y|x,z)|. The function parcorVec() reports generalized partials between the first variable and all others. The package provides several R functions including get0outliers() for outlier detection, bigfp() for numerical integration by the trapezoidal rule, stochdom2() for stochastic dominance, pillar3D() for 3D charts, canonRho() for generalized canonical correlations, depMeas() measures nonlinear dependence, and causeSummary(mtx) reports summary of causal paths among matrix columns. Portfolio selection: decileVote(), momentVote(), dif4mtx(), exactSdMtx() can rank several stocks. Functions whose names begin with 'boot' provide bootstrap statistical inference, including a new bootGcRsq() test for "Granger-causality" allowing nonlinear relations. A new tool for evaluation of out-of-sample portfolio performance is outOFsamp(). Panel data implementation is now included. See eight vignettes of the package for theory, examples, and usage tips. See Vinod (2019) \doi{10.1080/03610918.2015.1122048}.

Maintained by H. D. Vinod. Last updated 1 years ago.

9.7 match 2 stars 4.48 score 63 scripts 1 dependents

bioc

seqsetvis:Set Based Visualizations for Next-Gen Sequencing Data

seqsetvis enables the visualization and analysis of sets of genomic sites in next gen sequencing data. Although seqsetvis was designed for the comparison of mulitple ChIP-seq samples, this package is domain-agnostic and allows the processing of multiple genomic coordinate files (bed-like files) and signal files (bigwig files pileups from bam file). seqsetvis has multiple functions for fetching data from regions into a tidy format for analysis in data.table or tidyverse and visualization via ggplot2.

Maintained by Joseph R Boyd. Last updated 3 months ago.

software chipseq multiplecomparison sequencing visualization

7.3 match 5.82 score 82 scripts

christopherkenny

name:Tools for Working with Names

A system for organizing column names in data. Aimed at supporting a prefix-based and suffix-based column naming scheme. Extends 'dplyr' functionality to add ordering by function and more explicit renaming.

Maintained by Christopher T. Kenny. Last updated 3 years ago.

6.8 match 2 stars 6.28 score 19k scripts

martin3141

spant:MR Spectroscopy Analysis Tools

Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.

Maintained by Martin Wilson. Last updated 1 months ago.

brain mri mrs mrshub spectroscopy fortran

4.9 match 25 stars 8.52 score 81 scripts

data-cleaning

validate:Data Validation Infrastructure

Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.

Maintained by Mark van der Loo. Last updated 14 days ago.

data-cleaning validation

3.3 match 418 stars 12.50 score 448 scripts 9 dependents

opengeos

whitebox:'WhiteboxTools' R Frontend

An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.

Maintained by Andrew Brown. Last updated 5 months ago.

geomorphometry geoprocessing geospatial gis hydrology remote-sensing rstudio

4.3 match 173 stars 9.65 score 203 scripts 2 dependents

winvector

wrapr:Wrap R Tools for Debugging and Parametric Programming

Tools for writing and debugging R code. Provides: '%.>%' dot-pipe (an 'S3' configurable pipe), unpack/to (R style multiple assignment/return), 'build_frame()'/'draw_frame()' ('data.frame' example tools), 'qc()' (quoting concatenate), ':=' (named map builder), 'let()' (converts non-standard evaluation interfaces to parametric standard evaluation interfaces, inspired by 'gtools::strmacro()' and 'base::bquote()'), and more.

Maintained by John Mount. Last updated 2 years ago.

3.7 match 137 stars 11.11 score 390 scripts 12 dependents

ms609

TreeTools:Create, Modify and Analyse Phylogenetic Trees

Efficient implementations of functions for the creation, modification and analysis of phylogenetic trees. Applications include: generation of trees with specified shapes; tree rearrangement; analysis of tree shape; rooting of trees and extraction of subtrees; calculation and depiction of split support; plotting the position of rogue taxa (Klopfstein & Spasojevic 2019) <doi:10.1371/journal.pone.0212942>; calculation of ancestor-descendant relationships, of 'stemwardness' (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>, and of tree balance (Mir et al. 2013, Lemant et al. 2022) <doi:10.1016/j.mbs.2012.10.005>, <doi:10.1093/sysbio/syac027>; artificial extinction (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>; import and export of trees from Newick, Nexus (Maddison et al. 1997) <doi:10.1093/sysbio/46.4.590>, and TNT <https://www.lillo.org.ar/phylogeny/tnt/> formats; and analysis of splits and cladistic information.

Maintained by Martin R. Smith. Last updated 1 months ago.

evolutionary-biology phylogenetic-trees phylogenetics cpp

4.1 match 21 stars 9.92 score 124 scripts 10 dependents

eriqande

rubias:Bayesian Inference from the Conditional Genetic Stock Identification Model

Implements Bayesian inference for the conditional genetic stock identification model. It allows inference of mixed fisheries and also simulation of mixtures to predict accuracy. A full description of the underlying methods is available in a recently published article in the Canadian Journal of Fisheries and Aquatic Sciences: <doi:10.1139/cjfas-2018-0016>.

Maintained by Eric C. Anderson. Last updated 1 years ago.

noaa-omics-software cpp

6.9 match 3 stars 5.90 score 89 scripts

tidy-finance

tidyfinance:Tidy Finance Helper Functions

Helper functions for empirical research in financial economics, addressing a variety of topics covered in Scheuch, Voigt, and Weiss (2023) <doi:10.1201/b23237>. The package is designed to provide shortcuts for issues extensively discussed in the book, facilitating easier application of its concepts. For more information and resources related to the book, visit <https://www.tidy-finance.org/r/index.html>.

Maintained by Christoph Scheuch. Last updated 3 months ago.

finance

5.4 match 15 stars 7.48 score 24 scripts

jakobbossek

ecr:Evolutionary Computation in R

Framework for building evolutionary algorithms for both single- and multi-objective continuous or discrete optimization problems. A set of predefined evolutionary building blocks and operators is included. Moreover, the user can easily set up custom objective functions, operators, building blocks and representations sticking to few conventions. The package allows both a black-box approach for standard tasks (plug-and-play style) and a much more flexible white-box approach where the evolutionary cycle is written by hand.

Maintained by Jakob Bossek. Last updated 1 years ago.

combinatorial-optimization evolutionary-algorithm evolutionary-algorithms evolutionary-strategy genetic-algorithm-framework metaheuristics multi-objective-optimization optimization optimization-framework cpp

5.5 match 43 stars 7.36 score 89 scripts 2 dependents

bioc

scifer:Scifer: Single-Cell Immunoglobulin Filtering of Sanger Sequences

Have you ever index sorted cells in a 96 or 384-well plate and then sequenced using Sanger sequencing? If so, you probably had some struggles to either check the electropherogram of each cell sequenced manually, or when you tried to identify which cell was sorted where after sequencing the plate. Scifer was developed to solve this issue by performing basic quality control of Sanger sequences and merging flow cytometry data from probed single-cell sorted B cells with sequencing data. scifer can export summary tables, 'fasta' files, electropherograms for visual inspection, and generate reports.

Maintained by Rodrigo Arcoverde Cerveira. Last updated 3 months ago.

preprocessing qualitycontrol sangerseq sequencing software flowcytometry singlecell

7.1 match 5 stars 5.54 score 9 scripts

nourmarzouka

multiclassPairs:Build MultiClass Pair-Based Classifiers using TSPs or RF

A toolbox to train a single sample classifier that uses in-sample feature relationships. The relationships are represented as feature1 < feature2 (e.g. gene1 < gene2). We provide two options to go with. First is based on 'switchBox' package which uses Top-score pairs algorithm. Second is a novel implementation based on random forest algorithm. For simple problems we recommend to use one-vs-rest using TSP option due to its simplicity and for being easy to interpret. For complex problems RF performs better. Both lines filter the features first then combine the filtered features to make the list of all the possible rules (i.e. rule1: feature1 < feature2, rule2: feature1 < feature3, etc...). Then the list of rules will be filtered and the most important and informative rules will be kept. The informative rules will be assembled in an one-vs-rest model or in an RF model. We provide a detailed description with each function in this package to explain the filtration and training methodology in each line. Reference: Marzouka & Eriksson (2021) <doi:10.1093/bioinformatics/btab088>.

Maintained by Nour-al-dain Marzouka. Last updated 2 years ago.

classification

8.2 match 12 stars 4.82 score 11 scripts

larmarange

ggstats:Extension to 'ggplot2' for Plotting Stats

Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots.

Maintained by Joseph Larmarange. Last updated 8 days ago.

3.0 match 37 stars 13.08 score 190 scripts 156 dependents

biometry

bipartite:Visualising Bipartite Networks and Calculating Some (Ecological) Indices

Functions to visualise webs and calculate a series of indices commonly used to describe pattern in (ecological) webs. It focuses on webs consisting of only two levels (bipartite), e.g. pollination webs or predator-prey-webs. Visualisation is important to get an idea of what we are actually looking at, while the indices summarise different aspects of the web's topology.

Maintained by Carsten F. Dormann. Last updated 8 days ago.

cpp

3.6 match 37 stars 10.93 score 592 scripts 15 dependents

lrberge

stringmagic:Character String Operations and Interpolation, Magic Edition

Performs complex string operations compactly and efficiently. Supports string interpolation jointly with over 50 string operations. Also enhances regular string functions (like grep() and co). See an introduction at <https://lrberge.github.io/stringmagic/>.

Maintained by Laurent R Berge. Last updated 7 months ago.

interpolation string cpp

3.7 match 15 stars 10.56 score 37 scripts 33 dependents

shackett

romic:R for High-Dimensional Omic Data

Represents high-dimensional data as tables of features, samples and measurements, and a design list for tracking the meaning of individual variables. Using this format, filtering, normalization, and other transformations of a dataset can be carried out in a flexible manner. 'romic' takes advantage of these transformations to create interactive 'shiny' apps for exploratory data analysis such as an interactive heatmap.

Maintained by Sean Hackett. Last updated 1 years ago.

14.4 match 1 stars 2.70 score 10 scripts

bioc

InteractionSet:Base Classes for Storing Genomic Interaction Data

Provides the GInteractions, InteractionSet and ContactMatrix objects and associated methods for storing and manipulating genomic interaction data from Hi-C and ChIA-PET experiments.

Maintained by Aaron Lun. Last updated 5 months ago.

infrastructure datarepresentation software hic cpp

4.8 match 7.97 score 250 scripts 36 dependents

therneau

survival:Survival Analysis

Contains the core survival analysis routines, including definition of Surv objects, Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models, and parametric accelerated failure time models.

Maintained by Terry M Therneau. Last updated 3 months ago.

1.9 match 400 stars 20.43 score 29k scripts 3.9k dependents

msberends

plot2:A Plotting Assistant for Fast 'ggplot2' Visualisations

A streamlined extension of 'ggplot2' designed to simplify and accelerate the creation of data visualisations. 'plot2' automates common tasks such as axis handling, plot type selection, and data transformation, allowing users to create complex, publication-ready plots with minimal code. It integrates seamlessly with the tidyverse and retains full compatibility with 'ggplot2', while offering additional conveniences like enhanced sorting, faceting, and custom theming.

Maintained by Matthijs S. Berends. Last updated 9 days ago.

ggplot2 helper plotting tidyverse

8.8 match 1 stars 4.32 score 8 scripts 1 dependents

jmw86069

jamba:Just Analysis Methods Base

Just analysis methods ('jam') base functions focused on bioinformatics. Version- and gene-centric alphanumeric sort, unique name and version assignment, colorized console and 'HTML' output, color ramp and palette manipulation, 'Rmarkdown' cache import, styled 'Excel' worksheet import and export, interpolated raster output from smooth scatter and image plots, list to delimited vector, efficient list tools.

Maintained by James M. Ward. Last updated 11 days ago.

bioinformatics

7.0 match 6 stars 5.43 score

pecanproject

PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Istem Fer. Last updated 4 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

3.8 match 216 stars 9.94 score 20 scripts 2 dependents

evolutionary-optimization-laboratory

rmoo:Multi-Objective Optimization in R

The 'rmoo' package is a framework for multi- and many-objective optimization, which allows researchers and users versatility in parameter configuration, as well as tools for analysis, replication and visualization of results. The 'rmoo' package was built as a fork of the 'GA' package by Luca Scrucca(2017) <DOI:10.32614/RJ-2017-008> and implementing the Non-Dominated Sorting Genetic Algorithms proposed by K. Deb's.

Maintained by Francisco Benitez. Last updated 5 months ago.

metaheuristics multiobjective multiobjective-optimization nsga nsga2 nsga3 optimization pareto-front

7.5 match 30 stars 5.01 score 23 scripts

sbgraves237

sos:Search Contributed R Packages, Sort by Package

Search contributed R packages, sort by package.

Maintained by Spencer Graves. Last updated 9 months ago.

5.5 match 2 stars 6.82 score 241 scripts 3 dependents

guokai8

fctutils:Advanced Factor Manipulation Utilities

Provides a collection of utility functions for manipulating and analyzing factor vectors in R. It offers tools for filtering, splitting, combining, and reordering factor levels based on various criteria. The package is designed to simplify common tasks in categorical data analysis, making it easier to work with factors in a flexible and efficient manner.

Maintained by Kai Guo. Last updated 5 months ago.

8.1 match 2 stars 4.60 score 4 scripts

leonawicz

tabr:Music Notation Syntax, Manipulation, Analysis and Transcription in R

Provides a music notation syntax and a collection of music programming functions for generating, manipulating, organizing, and analyzing musical information in R. Music syntax can be entered directly in character strings, for example to quickly transcribe short pieces of music. The package contains functions for directly performing various mathematical, logical and organizational operations and musical transformations on special object classes that facilitate working with music data and notation. The same music data can be organized in tidy data frames for a familiar and powerful approach to the analysis of large amounts of structured music data. Functions are available for mapping seamlessly between these formats and their representations of musical information. The package also provides an API to 'LilyPond' (<https://lilypond.org/>) for transcribing musical representations in R into tablature ("tabs") and sheet music. 'LilyPond' is open source music engraving software for generating high quality sheet music based on markup syntax. The package generates 'LilyPond' files from R code and can pass them to the 'LilyPond' command line interface to be rendered into sheet music PDF files or inserted into R markdown documents. The package offers nominal MIDI file output support in conjunction with rendering sheet music. The package can read MIDI files and attempts to structure the MIDI data to integrate as best as possible with the data structures and functionality found throughout the package.

Maintained by Matthew Leonawicz. Last updated 6 months ago.

guitar-tablature lilypond lilypond-api music-analysis music-data music-notation music-programming music-syntax music-transcription sheet-music

4.8 match 132 stars 7.87 score 94 scripts

bioc

VariantFiltering:Filtering of coding and non-coding genetic variants

Filter genetic variants using different criteria such as inheritance model, amino acid change consequence, minor allele frequencies across human populations, splice site strength, conservation, etc.

Maintained by Robert Castelo. Last updated 2 months ago.

genetics homo_sapiens annotation snp sequencing highthroughputsequencing

6.0 match 4 stars 6.23 score 21 scripts

aiorazabala

qmethod:Analysis of Subjective Perspectives Using Q Methodology

Analysis of Q methodology, used to identify distinct perspectives existing within a group. This methodology is used across social, health and environmental sciences to understand diversity of attitudes, discourses, or decision-making styles (for more information, see <https://qmethod.org/>). A single function runs the full analysis. Each step can be run separately using the corresponding functions: for automatic flagging of Q-sorts (manual flagging is optional), for statement scores, for distinguishing and consensus statements, and for general characteristics of the factors. The package allows to choose either principal components or centroid factor extraction, manual or automatic flagging, a number of mathematical methods for rotation (or none), and a number of correlation coefficients for the initial correlation matrix, among many other options. Additional functions are available to import and export data (from raw *.CSV, 'HTMLQ' and 'FlashQ' *.CSV, 'PQMethod' *.DAT and 'easy-htmlq' *.JSON files), to print and plot, to import raw data from individual *.CSV files, and to make printable cards. The package also offers functions to print Q cards and to generate Q distributions for study administration. See further details in the package documentation, and in the web pages below, which include a cookbook, guidelines for more advanced analysis (how to perform manual flagging or change the sign of factors), data management, and a graphical user interface (GUI) for online and offline use.

Maintained by Aiora Zabala. Last updated 1 years ago.

6.0 match 38 stars 6.03 score 47 scripts

lbb220

GWmodel:Geographically-Weighted Models

Techniques from a particular branch of spatial statistics,termed geographically-weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localised calibration provides a better description. 'GWmodel' includes functions to calibrate: GW summary statistics (Brunsdon et al., 2002)<doi: 10.1016/s0198-9715(01)00009-6>, GW principal components analysis (Harris et al., 2011)<doi: 10.1080/13658816.2011.554838>, GW discriminant analysis (Brunsdon et al., 2007)<doi: 10.1111/j.1538-4632.2007.00709.x> and various forms of GW regression (Brunsdon et al., 1996)<doi: 10.1111/j.1538-4632.1996.tb00936.x>; some of which are provided in basic and robust (outlier resistant) forms.

Maintained by Binbin Lu. Last updated 6 months ago.

openblas cpp openmp

5.7 match 18 stars 6.38 score 266 scripts 4 dependents

calvagone

campsismod:Generic Implementation of a PK/PD Model

A generic, easy-to-use and expandable implementation of a pharmacokinetic (PK) / pharmacodynamic (PD) model based on the S4 class system. This package allows the user to read/write a pharmacometric model from/to files and adapt it further on the fly in the R environment. For this purpose, this package provides an intuitive API to add, modify or delete equations, ordinary differential equations (ODE's), model parameters or compartment properties (like infusion duration or rate, bioavailability and initial values). Finally, this package also provides a useful export of the model for use with simulation packages 'rxode2' and 'mrgsolve'. This package is designed and intended to be used with package 'campsis', a PK/PD simulation platform built on top of 'rxode2' and 'mrgsolve'.

Maintained by Nicolas Luyckx. Last updated 1 days ago.

5.3 match 5 stars 6.82 score 42 scripts 1 dependents

ajrgodfrey

BrailleR:Improved Access for Blind Users

Blind users do not have access to the graphical output from R without printing the content of graphics windows to an embosser of some kind. This is not as immediate as is required for efficient access to statistical output. The functions here are created so that blind people can make even better use of R. This includes the text descriptions of graphs, convenience functions to replace the functionality offered in many GUI front ends, and experimental functionality for optimising graphical content to prepare it for embossing as tactile images.

Maintained by A. Jonathan R. Godfrey. Last updated 11 months ago.

4.0 match 123 stars 8.90 score 143 scripts

tguillerme

dispRity:Measuring Disparity

A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.

Maintained by Thomas Guillerme. Last updated 4 days ago.

disparity ecology multidimensionality palaeobiology

4.1 match 26 stars 8.69 score 220 scripts 1 dependents

zarquon42b

Morpho:Calculations and Visualisations Related to Geometric Morphometrics

A toolset for Geometric Morphometrics and mesh processing. This includes (among other stuff) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.

Maintained by Stefan Schlager. Last updated 5 months ago.

openblas cpp openmp

3.5 match 51 stars 10.00 score 218 scripts 13 dependents

bioc

MACSQuantifyR:Fast treatment of MACSQuantify FACS data

Automatically process the metadata of MACSQuantify FACS sorter. It runs multiple modules: i) imports of raw file and graphical selection of duplicates in well plate, ii) computes statistics on data and iii) can compute combination index.

Maintained by Raphaël Bonnet. Last updated 5 months ago.

dataimport preprocessing normalization flowcytometry datarepresentation gui

7.8 match 4.48 score 3 scripts

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps provides preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 8 days ago.

1.9 match 584 stars 18.71 score 7.2k scripts 380 dependents

zdebruine

RcppML:Rcpp Machine Learning Library

Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.

Maintained by Zach DeBruine. Last updated 2 years ago.

clustering matrix-factorization nmf rcpp rcppeigen sparse-matrix cpp openmp

3.3 match 104 stars 10.53 score 125 scripts 46 dependents

grunwaldlab

metacoder:Tools for Parsing, Manipulating, and Graphing Taxonomic Abundance Data

Reads, plots, and manipulates large taxonomic data sets, like those generated from modern high-throughput sequencing, such as metabarcoding (i.e. amplification metagenomics, 16S metagenomics, etc). It provides a tree-based visualization called "heat trees" used to depict statistics for every taxon in a taxonomy using color and size. It also provides various functions to do common tasks in microbiome bioinformatics on data in the 'taxmap' format defined by the 'taxa' package. The 'metacoder' package is described in the publication by Foster et al. (2017) <doi:10.1371/journal.pcbi.1005404>.

Maintained by Zachary Foster. Last updated 1 months ago.

community-diversity hierarchical metabarcoding pcr taxonomy trees cpp

3.6 match 140 stars 9.64 score 328 scripts

dankelley

oce:Analysis of Oceanographic Data

Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.

Maintained by Dan Kelley. Last updated 3 days ago.

oceanography fortran cpp

2.3 match 146 stars 15.42 score 4.2k scripts 18 dependents

hrbrmstr

vegalite:Tools to Encode Visualizations with the 'Grammar of Graphics'-Like 'Vega-Lite' 'Spec'

The 'Vega-Lite' 'JavaScript' framework provides a higher-level grammar for visual analysis, akin to 'ggplot' or 'Tableau', that generates complete 'Vega' specifications. Functions exist which enable building a valid 'spec' from scratch or importing a previously created 'spec' file. Functions also exist to export 'spec' files and to generate code which will enable plots to be embedded in properly configured web pages. The default behavior is to generate an 'htmlwidget'.

Maintained by Bob Rudis. Last updated 7 years ago.

data-visualization datavisualization vega-lite vega-lite-spec visualization widget

4.5 match 158 stars 7.60 score 84 scripts

bioc

methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results

methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.

Maintained by Altuna Akalin. Last updated 18 days ago.

dnamethylation sequencing methylseq genome-biology methylation statistical-analysis visualization curl bzip2 xz-utils zlib cpp

2.9 match 220 stars 11.80 score 578 scripts 3 dependents

tingtingzhan

fmx:Finite Mixture Parametrization

A parametrization framework for finite mixture distribution using S4 objects. Density, cumulative density, quantile and simulation functions are defined. Currently normal, Tukey g-&-h, skew-normal and skew-t distributions are well tested. The gamma, negative binomial distributions are being tested.

Maintained by Tingting Zhan. Last updated 2 days ago.

10.8 match 3.18 score 7 scripts 1 dependents

kos59125

naturalsort:Natural Ordering

Provides functions related to human natural ordering. It handles adjacent digits in a character sequence as a number so that natural sort function arranges a character vector by their numbers, not digit characters. It is typically seen when operating systems lists file names. For example, a sequence a-1.png, a-2.png, a-10.png looks naturally ordered because 1 < 2 < 10 and natural sort algorithm arranges so whereas general sort algorithms arrange it into a-1.png, a-10.png, a-2.png owing to their third and fourth characters.

Maintained by Kosei Abe. Last updated 9 years ago.

4.9 match 9 stars 6.86 score 201 scripts 19 dependents

david-barnett

microViz:Microbiome Data Analysis and Visualization

Microbiome data visualization and statistics tools built upon phyloseq.

Maintained by David Barnett. Last updated 3 months ago.

microbiome microbiome-analysis microbiota

5.3 match 114 stars 6.22 score 480 scripts

tidyverse

forcats:Tools for Working with Categorical Variables (Factors)

Helpers for reordering factor levels (including moving specified levels to front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into other, 'anonymising', and manually 'recoding').

Maintained by Hadley Wickham. Last updated 1 years ago.

factor tidyverse

1.7 match 555 stars 18.77 score 21k scripts 1.2k dependents

ddsjoberg

gtsummary:Presentation-Ready Data Summary and Analytic Result Tables

Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.

Maintained by Daniel D. Sjoberg. Last updated 4 days ago.

easy-to-use gt html5 regression-models reproducibility reproducible-research statistics summary-statistics summary-tables table1 tableone

1.9 match 1.1k stars 17.00 score 8.2k scripts 15 dependents

r-gregmisc

gtools:Various R Programming Tools

Functions to assist in R programming, including: - assist in developing, updating, and maintaining R and R packages ('ask', 'checkRVersion', 'getDependencies', 'keywords', 'scat'), - calculate the logit and inverse logit transformations ('logit', 'inv.logit'), - test if a value is missing, empty or contains only NA and NULL values ('invalid'), - manipulate R's .Last function ('addLast'), - define macros ('defmacro'), - detect odd and even integers ('odd', 'even'), - convert strings containing non-ASCII characters (like single quotes) to plain ASCII ('ASCIIfy'), - perform a binary search ('binsearch'), - sort strings containing both numeric and character components ('mixedsort'), - create a factor variable from the quantiles of a continuous variable ('quantcut'), - enumerate permutations and combinations ('combinations', 'permutation'), - calculate and convert between fold-change and log-ratio ('foldchange', 'logratio2foldchange', 'foldchange2logratio'), - calculate probabilities and generate random numbers from Dirichlet distributions ('rdirichlet', 'ddirichlet'), - apply a function over adjacent subsets of a vector ('running'), - modify the TCP_NODELAY ('de-Nagle') flag for socket objects, - efficient 'rbind' of data frames, even if the column names don't match ('smartbind'), - generate significance stars from p-values ('stars.pval'), - convert characters to/from ASCII codes ('asc', 'chr'), - convert character vector to ASCII representation ('ASCIIfy'), - apply title capitalization rules to a character vector ('capwords').

Maintained by Ben Bolker. Last updated 9 months ago.

2.2 match 25 stars 14.47 score 11k scripts 1.1k dependents

harrelfe

Hmisc:Harrell Miscellaneous

Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.

Maintained by Frank E Harrell Jr. Last updated 2 days ago.

fortran

1.8 match 210 stars 17.61 score 17k scripts 750 dependents

statnet

statnet.common:Common R Scripts and Utilities Used by the Statnet Project Software

Non-statistical utilities used by the software developed by the Statnet Project. They may also be of use to others.

Maintained by Pavel N. Krivitsky. Last updated 28 days ago.

2.7 match 8 stars 11.42 score 197 scripts 148 dependents

bioc

GenomicTuples:Representation and Manipulation of Genomic Tuples

GenomicTuples defines general purpose containers for storing genomic tuples. It aims to provide functionality for tuples of genomic co-ordinates that are analogous to those available for genomic ranges in the GenomicRanges Bioconductor package.

Maintained by Peter Hickey. Last updated 4 months ago.

infrastructure datarepresentation sequencing cpp

5.6 match 4 stars 5.48 score 7 scripts

ropensci

RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management

Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.

Maintained by Mathew W. McLean. Last updated 4 months ago.

peer-reviewed

2.5 match 115 stars 12.06 score 2.3k scripts 16 dependents

ropensci

git2rdata:Store and Retrieve Data.frames in a Git Repository

The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information. 1) Storing metadata allows to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors. The optimization makes the data less human readable. The user can turn this off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette("plain_text", package = "git2rdata"). 2) Storing metadata also allows smaller row based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like subversion or mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example. 4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size and speed for writing and reading.

Maintained by Thierry Onkelinx. Last updated 2 months ago.

reproducible-research version-control

3.0 match 99 stars 10.03 score 216 scripts 4 dependents

r-lib

lintr:A 'Linter' for R Code

Checks adherence to a given style, syntax errors and possible semantic issues. Supports on the fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.

Maintained by Michael Chirico. Last updated 13 hours ago.

linter

1.8 match 1.2k stars 16.99 score 916 scripts 33 dependents

mayoverse

arsenal:An Arsenal of 'R' Functions for Large-Scale Statistical Summaries

An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.

Maintained by Ethan Heinzen. Last updated 7 months ago.

baseline-characteristics descriptive-statistics modeling paired-comparisons reporting statistics tableone

2.2 match 225 stars 13.45 score 1.2k scripts 16 dependents

easystats

parameters:Processing of Model Parameters

Utilities for processing the parameters of various statistical models. Beyond computing p values, CIs, and other indices for a wide variety of models (see list of supported models using the function 'insight::supported_models()'), this package implements features like bootstrapping or simulating of parameters and models, feature reduction (feature extraction and variable selection) as well as functions to describe data and variable characteristics (e.g. skewness, kurtosis, smoothness or distribution).

Maintained by Daniel Lüdecke. Last updated 12 hours ago.

beta bootstrap ci confidence-intervals data-reduction easystats fa feature-extraction feature-reduction hacktoberfest parameters pca pvalues regression-models robust-statistics standardize standardized-estimates statistical-models

1.9 match 454 stars 15.65 score 1.8k scripts 56 dependents

davidnsousa

qsort:Scoring Q-Sort Data

Computes scores from Q-sort data, using criteria sorts and derived scales from subsets of items. The 'qsort' package includes descriptions and scoring procedures for four different Q-sets commonly used in developmental psychology research: Attachment Q-set (version 3.0) (Waters, 1995, <doi:10.1111/j.1540-5834.1995.tb00214.x>); California Child Q-set (Block and Block, 1969, <doi:10.1037/0012-1649.21.3.508>); Maternal Behaviour Q-set (version 3.1) (Pederson et al., 1999, <https://ir.lib.uwo.ca/cgi/viewcontent.cgi?article=1000&context=psychologypub>); Preschool Q-set (Baumrind, 1968 revised by Wanda Bronson, <doi:10.1111/j.1540-5834.1995.tb00214.x>).

Maintained by David N Sousa. Last updated 6 years ago.

10.6 match 2.70 score 8 scripts

alan9956

nsga2R:Elitist Non-Dominated Sorting Genetic Algorithm

Box-constrained multiobjective optimization using the elitist non-dominated sorting genetic algorithm - NSGA-II. Fast non-dominated sorting, crowding distance, tournament selection, simulated binary crossover, and polynomial mutation are called in the main program. The methods are described in Deb et al. (2002) <doi:10.1109/4235.996017>.

Maintained by Ming-Chang (Alan) Lee. Last updated 3 years ago.

9.0 match 3.12 score 29 scripts 5 dependents

bioc

STRINGdb:STRINGdb - Protein-Protein Interaction Networks and Functional Enrichment Analysis

The STRINGdb package provides a R interface to the STRING protein-protein interactions database (https://string-db.org).

Maintained by Damian Szklarczyk. Last updated 5 months ago.

network

3.4 match 8.08 score 344 scripts 9 dependents

kcuilla

reactablefmtr:Streamlined Table Styling and Formatting for Reactable

Provides various features to streamline and enhance the styling of interactive reactable tables with easy-to-use and highly-customizable functions and themes. Apply conditional formatting to cells with data bars, color scales, color tiles, and icon sets. Utilize custom table themes inspired by popular websites such and bootstrap themes. Apply sparkline line & bar charts (note this feature requires the 'dataui' package which can be downloaded from <https://github.com/timelyportfolio/dataui>). Increase the portability and reproducibility of reactable tables by embedding images from the web directly into cells. Save the final table output as a static image or interactive file.

Maintained by Kyle Cuilla. Last updated 2 years ago.

customization data-visualization easy-to-use reproducible tables

3.1 match 209 stars 8.79 score 460 scripts 4 dependents

spkaluzny

splusTimeDate:Times and Dates from 'S-PLUS'

A collection of classes and methods for working with times and dates. The code was originally available in 'S-PLUS'.

Maintained by Stephen Kaluzny. Last updated 2 months ago.

datetime

5.6 match 4.94 score 58 scripts 2 dependents

dbosak01

common:Solutions for Common Problems in Base R

Contains functions for solving commonly encountered problems while programming in R. This package is intended to provide a lightweight supplement to Base R, and will be useful for almost any R user.

Maintained by David Bosak. Last updated 12 months ago.

3.3 match 6 stars 8.21 score 193 scripts 11 dependents

spatstat

spatstat.geom:Geometrical Functionality of the 'spatstat' Family

Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)

Maintained by Adrian Baddeley. Last updated 9 hours ago.

classes-and-objects distance-calculation geometry geometry-processing images mensuration plotting point-patterns spatial-data spatial-data-analysis

2.3 match 7 stars 12.10 score 241 scripts 227 dependents

hzhanghenry

RCircos:Circos 2D Track Plot

A simple and flexible way to generate Circos 2D track plot images for genomic data visualization is implemented in this package. The types of plots include: heatmap, histogram, lines, scatterplot, tiles and plot items for further decorations include connector, link (lines and ribbons), and text (gene) label. All functions require only R graphics package that comes with R base installation.

Maintained by Hongen Zhang. Last updated 3 years ago.

3.8 match 6 stars 7.21 score 298 scripts 3 dependents

hojsgaard

doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities

Utility package containing: 1) Facilities for working with grouped data: 'do' something to data stratified 'by' some variables. 2) LSmeans (least-squares means), general linear estimates. 3) Restrict functions to a smaller domain. 4) Miscellaneous other utilities.

Maintained by Søren Højsgaard. Last updated 6 days ago.

1.8 match 1 stars 14.94 score 3.2k scripts 939 dependents

poissonconsulting

chk:Check User-Supplied Function Arguments

For developers to check user-supplied function arguments. It is designed to be simple, fast and customizable. Error messages follow the tidyverse style guide.

Maintained by Joe Thorley. Last updated 2 months ago.

chk

2.3 match 48 stars 11.89 score 22 scripts 95 dependents

skoestlmeier

monotonicity:Test for Monotonicity in Expected Asset Returns, Sorted by Portfolios

Test for monotonicity in financial variables sorted by portfolios. It is conventional practice in empirical research to form portfolios of assets ranked by a certain sort variable. A t-test is then used to consider the mean return spread between the portfolios with the highest and lowest values of the sort variable. Yet comparing only the average returns on the top and bottom portfolios does not provide a sufficient way to test for a monotonic relation between expected returns and the sort variable. This package provides nonparametric tests for the full set of monotonic patterns by Patton, A. and Timmermann, A. (2010) <doi:10.1016/j.jfineco.2010.06.006> and compares the proposed results with extant alternatives such as t-tests, Bonferroni bounds, and multivariate inequality tests through empirical applications and simulations.

Maintained by Siegfried Köstlmeier. Last updated 3 years ago.

monotonicity portfolio-analysis

7.1 match 10 stars 3.70 score 10 scripts

polmine

polmineR:Verbs and Nouns for Corpus Analysis

Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.

Maintained by Andreas Blaette. Last updated 1 years ago.

3.3 match 49 stars 7.96 score 311 scripts

impaug

UpAndDownPlots:Displays Percentage and Absolute Changes

Displays percentage changes by height and absolute changes by area for up to three nested or non-nested levels. The plots visualise changes in indices and markets, showing how the changes for sectors or for individual components contribute to the overall change. Data can be classified by up to three levels of grouping variables in a layered, hierarchical plot. Each level can be ordered in several ways including by baseline, by percentage change, and by absolute change. The vignettes give examples.

Maintained by Antony Unwin. Last updated 12 months ago.

8.0 match 3.30 score 6 scripts

ludvigolsen

rearrr:Rearranging Data

Arrange data by a set of methods. Use rearrangers to reorder data points and mutators to change their values. From basic utilities, to centering the greatest value, to swirling in 3-dimensional space, 'rearrr' enables creativity when plotting and experimenting with data.

Maintained by Ludvig Renbo Olsen. Last updated 12 days ago.

arrange cluster expand forming generate ggplot2 order plotting-in-r roll rotate shaping swirl transformations

3.6 match 24 stars 7.26 score 128 scripts 8 dependents

dmurdoch

rgl:3D Visualization Using OpenGL

Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.). Output may be on screen using OpenGL, or to various standard 3D file formats including WebGL, PLY, OBJ, STL as well as 2D image formats, including PNG, Postscript, SVG, PGF.

Maintained by Duncan Murdoch. Last updated 2 months ago.

graphics opengl rgl webgl libglu libglvnd libpng libx11 freetype cpp

1.5 match 91 stars 17.35 score 7.3k scripts 303 dependents

branchlab

metasnf:Meta Clustering with Similarity Network Fusion

Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.

Maintained by Prashanth S Velayudhan. Last updated 7 days ago.

bioinformatics clustering metaclustering snf

3.2 match 8 stars 8.21 score 30 scripts

jmbarbone

mark:Miscellaneous, Analytic R Kernels

Miscellaneous functions and wrappers for development in other packages created, maintained by Jordan Mark Barbone.

Maintained by Jordan Mark Barbone. Last updated 1 months ago.

5.3 match 6 stars 4.95 score 9 scripts

bioc

podkat:Position-Dependent Kernel Association Test

This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.

Maintained by Ulrich Bodenhofer. Last updated 5 months ago.

genetics wholegenome annotation variantannotation sequencing dataimport curl bzip2 xz-utils zlib cpp

5.2 match 5.02 score 6 scripts

clobatofern95

GPUmatrix:Basic Linear Algebra with GPU

GPUs are great resources for data analysis, especially in statistics and linear algebra. Unfortunately, very few packages connect R to the GPU, and none of them are transparent enough to run the computations on the GPU without substantial changes to the code. The maintenance of these packages is cumbersome: several of the earlier attempts have been removed from their respective repositories. It would be desirable to have a properly maintained R package that takes advantage of the GPU with minimal changes to the existing code. We have developed the GPUmatrix package (available on CRAN). GPUmatrix mimics the behavior of the Matrix package and extends R to use the GPU for computations. It includes single(FP32) and double(FP64) precision data types, and provides support for sparse matrices. It is easy to learn, and requires very few code changes to perform the operations on the GPU. GPUmatrix relies on either the Torch or Tensorflow R packages to perform the GPU operations. We have demonstrated its usefulness for several statistical applications and machine learning applications: non-negative matrix factorization, logistic regression and general linear models. We have also included a comparison of GPU and CPU performance on different matrix operations.

Maintained by Cesar Lobato-Fernandez. Last updated 1 years ago.

6.6 match 3.93 score 57 scripts 1 dependents

mb4511

DTwrappers:Simplified Data Analysis with Wrapper Functions for the 'Data.Table' Package

Provides functionality for users who are learning R or the techniques of data analysis. Written as a collection of wrapper functions, the 'DTwrapper' package facilitates many core operations of data processing. This is achieved with relatively few requirements about the order of the processing steps or knowledge of specialized syntax. 'DTwrappers' creates coding results along with translations to data.table's code. This enables users to benefit from the speed and efficiency of data.table's calculations. Furthermore, the package also provides the translated code for educational purposes so that users can review working examples of coding syntax and calculations.

Maintained by Mayur Bansal. Last updated 4 years ago.

9.2 match 2.78 score 9 scripts 2 dependents

ips-lmu

emuR:Main Package of the EMU Speech Database Management System

Provide the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.

Maintained by Markus Jochim. Last updated 1 years ago.

3.7 match 24 stars 6.89 score 135 scripts 1 dependents

sherrillmix

taxonomizr:Functions to Work with NCBI Accessions and Taxonomy

Functions for assigning taxonomy to NCBI accession numbers and taxon IDs based on NCBI's accession2taxid and taxdump files. This package allows the user to download NCBI data dumps and create a local database for fast and local taxonomic assignment.

Maintained by Scott Sherrill-Mix. Last updated 6 days ago.

2.9 match 72 stars 8.85 score 255 scripts 2 dependents

bioc

Rsamtools:Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import

This package provides an interface to the 'samtools', 'bcftools', and 'tabix' utilities for manipulating SAM (Sequence Alignment / Map), FASTA, binary variant call (BCF) and compressed indexed tab-delimited (tabix) files.

Maintained by Bioconductor Package Maintainer. Last updated 4 months ago.

dataimport sequencing coverage alignment qualitycontrol bioconductor-package core-package curl bzip2 xz-utils zlib cpp

1.7 match 28 stars 15.42 score 3.2k scripts 566 dependents

josherrickson

rlemon:R Access to LEMON Graph Algorithms

Allows easy access to the LEMON Graph Library set of algorithms, written in C++. See the LEMON project page at <https://lemon.cs.elte.hu/trac/lemon>. Current LEMON version is 1.3.1.

Maintained by Josh Errickson. Last updated 2 months ago.

cpp

3.6 match 8 stars 7.04 score 1 scripts 15 dependents

bioc

plyranges:A fluent interface for manipulating GenomicRanges

A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes their accessiblity for new Bioconductor users is hopefully increased.

Maintained by Michael Love. Last updated 5 months ago.

infrastructure datarepresentation workflowstep coverage bioconductor data-analysis dplyr genomic-ranges genomics tidy-data

2.0 match 143 stars 12.60 score 1.9k scripts 20 dependents

dmpe

urlshorteneR:R Wrapper for the 'Bit.ly' and 'Is.gd'/'v.gd' URL Shortening Services

Allows using two URL shortening services, which also provide expanding and analytic functions. Specifically developed for 'Bit.ly' (which requires OAuth 2.0) and 'is.gd' (no API key).

Maintained by John Malc. Last updated 30 days ago.

bitly isgd shorten-urls shortener shorturl url

3.8 match 21 stars 6.70 score 53 scripts 1 dependents

pik-piam

magclass:Data Class and Tools for Handling Spatial-Temporal Data

Data class for increased interoperability working with spatial-temporal data together with corresponding functions and methods (conversions, basic calculations and basic data manipulation). The class distinguishes between spatial, temporal and other dimensions to facilitate the development and interoperability of tools build for it. Additional features are name-based addressing of data and internal consistency checks (e.g. checking for the right data order in calculations).

Maintained by Jan Philipp Dietrich. Last updated 12 days ago.

2.3 match 5 stars 11.16 score 412 scripts 56 dependents

jrowen

rhandsontable:Interface to the 'Handsontable.js' Library

An R interface to the 'Handsontable' JavaScript library, which is a minimalist Excel-like data grid editor. See <https://handsontable.com/> for details.

Maintained by Jonathan Owen. Last updated 3 years ago.

handsontable htmlwidgets javascript shiny sparkline

2.0 match 389 stars 12.31 score 1.0k scripts 46 dependents

bioc

pcaMethods:A collection of PCA methods

Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.

Maintained by Henning Redestig. Last updated 5 months ago.

bayesian cpp

1.9 match 49 stars 13.10 score 538 scripts 73 dependents

easystats

correlation:Methods for Correlation Analysis

Lightweight package for computing different kinds of correlations, such as partial correlations, Bayesian correlations, multilevel correlations, polychoric correlations, biweight correlations, distance correlations and more. Part of the 'easystats' ecosystem. References: Makowski et al. (2020) <doi:10.21105/joss.02306>.

Maintained by Brenton M. Wiernik. Last updated 14 days ago.

bayesian bayesian-correlations biserial cor correlation correlation-analysis correlations easystats gamma gaussian-graphical-models hacktoberfest matrix multilevel-correlations outliers partial partial-correlations regression robust spearman

1.7 match 439 stars 14.23 score 672 scripts 10 dependents

pharmaverse

metatools:Enable the Use of 'metacore' to Help Create and Check Dataset

Uses the metadata information stored in 'metacore' objects to check and build metadata associated columns.

Maintained by Christina Fillmore. Last updated 2 months ago.

3.9 match 10 stars 6.26 score 120 scripts

bioc

HarmonizR:Handles missing values and makes more data available

An implementation, which takes input data and makes it available for proper batch effect removal by ComBat or Limma. The implementation appropriately handles missing values by dissecting the input matrix into smaller matrices with sufficient data to feed the ComBat or limma algorithm. The adjusted data is returned to the user as a rebuild matrix. The implementation is meant to make as much data available as possible with minimal data loss.

Maintained by Simon Schlumbohm. Last updated 5 months ago.

batcheffect

5.8 match 4.20 score 16 scripts

bioc

ORFik:Open Reading Frames in Genomics

R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed, as the name hints to it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as it's primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. Package allows to reassign starts of the transcripts with the use of CAGE-Seq data, automatic shifting of RiboSeq reads, finding of Open Reading Frames for whole genomes and much more.

Maintained by Haakon Tjeldnes. Last updated 29 days ago.

immunooncology software sequencing riboseq rnaseq functionalgenomics coverage alignment dataimport cpp

2.3 match 33 stars 10.63 score 115 scripts 2 dependents

cran

timeSeries:Financial Time Series Objects (Rmetrics)

'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.

Maintained by Georgi N. Boshnakov. Last updated 6 months ago.

2.4 match 2 stars 9.90 score 1.3k scripts 145 dependents

rspatial

dismo:Species Distribution Modeling

Methods for species distribution modeling, that is, predicting the environmental similarity of any site to that of the locations of known occurrences of a species.

Maintained by Robert J. Hijmans. Last updated 4 months ago.

cpp

2.0 match 26 stars 11.88 score 2.8k scripts 21 dependents

billdenney

PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis

Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.

Maintained by Bill Denney. Last updated 18 days ago.

nca noncompartmental-analysis pharmacokinetics

1.9 match 73 stars 12.61 score 214 scripts 4 dependents

mablab

sftrack:Modern Classes for Tracking and Movement Data

Modern classes for tracking and movement data, building on 'sf' spatial infrastructure, and early theoretical work from Turchin (1998, ISBN: 9780878938476), and Calenge et al. (2009) <doi:10.1016/j.ecoinf.2008.10.002>. Tracking data are series of locations with at least 2-dimensional spatial coordinates (x,y), a time index (t), and individual identification (id) of the object being monitored; movement data are made of trajectories, i.e. the line representation of the path, composed by steps (the straight-line segments connecting successive locations). 'sftrack' is designed to handle movement of both living organisms and inanimate objects.

Maintained by Mathieu Basille. Last updated 2 years ago.

movement tracking-data trajectory

3.7 match 53 stars 6.42 score 5 scripts

rfastofficial

Rfast2:A Collection of Efficient and Extremely Fast R Functions II

A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.

Maintained by Manos Papadakis. Last updated 1 years ago.

openblas cpp openmp

2.9 match 38 stars 8.09 score 75 scripts 26 dependents

robinhankin

freealg:The Free Algebra

The free algebra in R with non-commuting indeterminates. Uses 'disordR' discipline (Hankin, 2022, <doi:10.48550/ARXIV.2210.03856>). To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2211.04002>.

Maintained by Robin K. S. Hankin. Last updated 7 days ago.

cpp

3.3 match 3 stars 7.07 score 10 dependents

bioc

globaltest:Testing Groups of Covariates/Features for Association with a Response Variable, with Applications to Gene Set Testing

The global test tests groups of covariates (or features) for association with a response variable. This package implements the test with diagnostic plots and multiple testing utilities, along with several functions to facilitate the use of this test for gene set testing of GO and KEGG terms.

Maintained by Jelle Goeman. Last updated 5 months ago.

microarray onechannel bioinformatics differentialexpression go pathways

3.3 match 6.96 score 79 scripts 7 dependents

markmfredrickson

optmatch:Functions for Optimal Matching

Distance based bipartite matching using minimum cost flow, oriented to matching of treatment and control groups in observational studies ('Hansen' and 'Klopfer' 2006 <doi:10.1198/106186006X137047>). Routines are provided to generate distances from generalised linear models (propensity score matching), formulas giving variables on which to limit matched distances, stratified or exact matching directives, or calipers, alone or in combination.

Maintained by Josh Errickson. Last updated 3 months ago.

matching openblas cpp

1.9 match 47 stars 12.22 score 588 scripts 5 dependents

doi-usgs

nhdplusTools:NHDPlus Tools

Tools for traversing and working with National Hydrography Dataset Plus (NHDPlus) data. All methods implemented in 'nhdplusTools' are available in the NHDPlus documentation available from the US Environmental Protection Agency <https://www.epa.gov/waterdata/basic-information>.

Maintained by David Blodgett. Last updated 27 days ago.

2.0 match 87 stars 11.38 score 348 scripts 5 dependents

ropensci

jqr:Client for 'jq', a 'JSON' Processor

Client for 'jq', a 'JSON' processor (<https://jqlang.github.io/jq/>), written in C. 'jq' allows the following with 'JSON' data: index into, parse, do calculations, cut up and filter, change key names and values, perform conditionals and comparisons, and more.

Maintained by Jeroen Ooms. Last updated 3 months ago.

jq json

2.3 match 144 stars 10.04 score 95 scripts 28 dependents

robinhankin

mvp:Fast Symbolic Multivariate Polynomials

Fast manipulation of symbolic multivariate polynomials using the 'Map' class of the Standard Template Library. The package uses print and coercion methods from the 'mpoly' package but offers speed improvements. It is comparable in speed to the 'spray' package for sparse arrays, but retains the symbolic benefits of 'mpoly'. To cite the package in publications, use Hankin 2022 <doi:10.48550/ARXIV.2210.15991>. Uses 'disordR' discipline.

Maintained by Robin K. S. Hankin. Last updated 5 days ago.

cpp

3.3 match 9 stars 6.83 score 36 scripts 2 dependents

vochr

TapeS:Tree Taper Curves and Sorting Based on 'TapeR'

Providing new german-wide 'TapeR' Models and functions for their evaluation. Included are the most common tree species in Germany (Norway spruce, Scots pine, European larch, Douglas fir, Silver fir as well as European beech, Common/Sessile oak and Red oak). Many other species are mapped to them so that 36 tree species / groups can be processed. Single trees are defined by species code, one or multiple diameters in arbitrary measuring height and tree height. The functions then provide information on diameters along the stem, bark thickness, height of diameters, volume of the total or parts of the trunk and total and component above-ground biomass. It is also possible to calculate assortments from the taper curves. Uncertainty information is provided for diameter, volume and component biomass estimation.

Maintained by Christian Vonderach. Last updated 1 months ago.

openblas cpp

5.7 match 3.90 score 1 scripts

robinhankin

clifford:Arbitrary Dimensional Clifford Algebras

A suite of routines for Clifford algebras, using the 'Map' class of the Standard Template Library. Canonical reference: Hestenes (1987, ISBN 90-277-1673-0, "Clifford algebra to geometric calculus"). Special cases including Lorentz transforms, quaternion multiplication, and Grassmann algebra, are discussed. Vignettes presenting conformal geometric algebra, quaternions and split quaternions, dual numbers, and Lorentz transforms are included. The package follows 'disordR' discipline.

Maintained by Robin K. S. Hankin. Last updated 1 months ago.

cpp

3.3 match 5 stars 6.71 score 4 scripts

ubod

apcluster:Affinity Propagation Clustering

Implements Affinity Propagation clustering introduced by Frey and Dueck (2007) <DOI:10.1126/science.1136800>. The algorithms are largely analogous to the 'Matlab' code published by Frey and Dueck. The package further provides leveraged affinity propagation and an algorithm for exemplar-based agglomerative clustering that can also be used to join clusters obtained from affinity propagation. Various plotting functions are available for analyzing clustering results.

Maintained by Ulrich Bodenhofer. Last updated 11 months ago.

cpp

2.3 match 10 stars 9.82 score 270 scripts 25 dependents

spkaluzny

splusTimeSeries:Time Series from 'S-PLUS'

A collection of classes and methods for working with indexed rectangular data. The index values can be calendar (timeSeries class) or numeric (signalSeries class). Methods are included for aggregation, alignment, merging, and summaries. The code was originally available in 'S-PLUS'.

Maintained by Stephen Kaluzny. Last updated 6 months ago.

data-structures time-series

5.6 match 3.95 score 20 scripts 1 dependents

bioc

variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments

Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.

Maintained by Gabriel E. Hoffman. Last updated 2 months ago.

rnaseq geneexpression genesetenrichment differentialexpression batcheffect qualitycontrol regression epigenetics functionalgenomics transcriptomics normalization preprocessing microarray immunooncology software

1.9 match 7 stars 11.69 score 1.1k scripts 3 dependents

robinhankin

stokes:The Exterior Calculus

Provides functionality for working with tensors, alternating forms, wedge products, Stokes's theorem, and related concepts from the exterior calculus. Uses 'disordR' discipline (Hankin, 2022, <doi:10.48550/arXiv.2210.03856>). The canonical reference would be M. Spivak (1965, ISBN:0-8053-9021-9) "Calculus on Manifolds". To cite the package in publications please use Hankin (2022) <doi:10.48550/arXiv.2210.17008>.

Maintained by Robin K. S. Hankin. Last updated 2 months ago.

3.3 match 3 stars 6.62 score

inrae

airGRiwrm:'airGR' Integrated Water Resource Management

Semi-distributed Precipitation-Runoff Modeling based on 'airGR' package models integrating human infrastructures and their managements.

Maintained by David Dorchies. Last updated 6 months ago.

3.4 match 6.34 score 45 scripts

mathewchamberlain

SignacX:Cell Type Identification and Discovery from Single Cell Gene Expression Data

An implementation of neural networks trained with flow-sorted gene expression data to classify cellular phenotypes in single cell RNA-sequencing data. See Chamberlain M et al. (2021) <doi:10.1101/2021.02.01.429207> for more details.

Maintained by Mathew Chamberlain. Last updated 2 years ago.

cellular-phenotypes seurat single-cell-rna-seq

3.4 match 24 stars 6.46 score 34 scripts

robinhankin

disordR:Non-Ordered Vectors

Functionality for manipulating values of associative maps. The package is a dependency for mvp-type packages that use the STL map class: it traps plausible idiom that is ill-defined (implementation-specific) and returns an informative error, rather than returning a possibly incorrect result. To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2210.03856>.

Maintained by Robin K. S. Hankin. Last updated 4 months ago.

3.3 match 1 stars 6.59 score 20 dependents

toxpi

toxpiR:Create ToxPi Prioritization Models

Enables users to build 'ToxPi' prioritization models and provides functionality within the grid framework for plotting ToxPi graphs. 'toxpiR' allows for more customization than the 'ToxPi GUI' (<https://toxpi.org>) and integration into existing workflows for greater ease-of-use, reproducibility, and transparency. toxpiR package behaves nearly identically to the GUI; the package documentation includes notes about all differences. The vignettes download example files from <https://github.com/ToxPi/ToxPi-example-files>.

Maintained by Jonathon F Fleming. Last updated 7 months ago.

data-science modeling toxicology

3.3 match 11 stars 6.58 score 19 scripts

spkaluzny

splus2R:Supplemental S-PLUS Functionality in R

Currently there are many functions in S-PLUS that are missing in R. To facilitate the conversion of S-PLUS packages to R packages, this package provides some missing S-PLUS functionality in R.

Maintained by Stephen Kaluzny. Last updated 1 years ago.

3.3 match 1 stars 6.56 score 82 scripts 30 dependents

r-forge

Polychrome:Qualitative Palettes with Many Colors

Tools for creating, viewing, and assessing qualitative palettes with many (20-30 or more) colors. See Coombes and colleagues (2019) <doi:10.18637/jss.v090.c01>.

Maintained by Kevin R. Coombes. Last updated 1 months ago.

2.3 match 9.56 score 1.0k scripts 27 dependents

ishidamgm

TreeRingShape:Recording Tree-Ring Shapes of Tree Disks with Manual Digitizing and Interpolating Model

Record all tree-ring Shapefile of tree disk with GIS soft ('Qgis'<https://www.qgis.org/en/site/>) and interpolating model from high resolution tree disk image.

Maintained by Megumi ISHIDA. Last updated 4 months ago.

4.8 match 4.48 score

bernd-mueller

epos:Epilepsy Ontologies' Similarities

Analysis and visualization of similarities between epilepsy ontologies based on text mining results by comparing ranked lists of co-occurring drug terms in the BioASQ corpus. The ranked result lists of neurological drug terms co-occurring with terms from the epilepsy ontologies EpSO, ESSO, EPILONT, EPISEM and FENICS undergo further analysis. The source data to create the ranked lists of drug names is produced using the text mining workflows described in Mueller, Bernd and Hagelstein, Alexandra (2016) <doi:10.4126/FRL01-006408558>, Mueller, Bernd et al. (2017) <doi:10.1007/978-3-319-58694-6_22>, Mueller, Bernd and Rebholz-Schuhmann, Dietrich (2020) <doi:10.1007/978-3-030-43887-6_52>, and Mueller, Bernd et al. (2022) <doi:10.1186/s13326-021-00258-w>.

Maintained by Bernd Mueller. Last updated 1 years ago.

5.3 match 4.03 score 53 scripts

gagolews

stringx:Replacements for Base String Functions Powered by 'stringi'

English is the native language for only 5% of the World population. Also, only 17% of us can understand this text. Moreover, the Latin alphabet is the main one for merely 36% of the total. The early computer era, now a very long time ago, was dominated by the US. Due to the proliferation of the internet, smartphones, social media, and other technologies and communication platforms, this is no longer the case. This package replaces base R string functions (such as grep(), tolower(), sprintf(), and strptime()) with ones that fully support the Unicode standards related to natural language and date-time processing. It also fixes some long-standing inconsistencies, and introduces some new, useful features. Thanks to 'ICU' (International Components for Unicode) and 'stringi', they are fast, reliable, and portable across different platforms.

Maintained by Marek Gagolewski. Last updated 2 months ago.

icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi text text-processing unicode

4.5 match 28 stars 4.75 score 1 scripts

bioc

unifiedWMWqPCR:Unified Wilcoxon-Mann Whitney Test for testing differential expression in qPCR data

This packages implements the unified Wilcoxon-Mann-Whitney Test for qPCR data. This modified test allows for testing differential expression in qPCR data.

Maintained by Joris Meys. Last updated 5 months ago.

differentialexpression geneexpression microtitreplateassay multiplecomparison qualitycontrol software visualization qpcr

5.1 match 4.18 score

dmurdoch

plotrix:Various Plotting Functions

Lots of plots, various labeling, axis and color scaling functions. The author/maintainer died in September 2023.

Maintained by Duncan Murdoch. Last updated 1 years ago.

1.9 match 5 stars 11.31 score 9.2k scripts 361 dependents

pharmaverse

admiral:ADaM in R Asset Library

A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).

Maintained by Ben Straub. Last updated 18 hours ago.

cdisc clinical-trials open-source

1.5 match 238 stars 13.92 score 486 scripts 4 dependents

sciviews

pastecs:Package for Analysis of Space-Time Ecological Series

Regularisation, decomposition and analysis of space-time series. The pastecs R package is a PNEC-Art4 and IFREMER (Benoit Beliaeff <Benoit.Beliaeff@ifremer.fr>) initiative to bring PASSTEC 2000 functionalities to R.

Maintained by Philippe Grosjean. Last updated 1 years ago.

ecology time-series

2.0 match 4 stars 10.34 score 2.1k scripts 13 dependents

stuart-lab

Signac:Analysis of Single-Cell Chromatin Data

A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.

Maintained by Tim Stuart. Last updated 7 months ago.

atac bioinformatics single-cell zlib cpp

1.7 match 349 stars 12.19 score 3.7k scripts 1 dependents

crsh

papaja:Prepare American Psychological Association Journal Articles with R Markdown

Tools to create dynamic, submission-ready manuscripts, which conform to American Psychological Association manuscript guidelines. We provide R Markdown document formats for manuscripts (PDF and Word) and revision letters (PDF). Helper functions facilitate reporting statistical analyses or create publication-ready tables and plots.

Maintained by Frederik Aust. Last updated 19 days ago.

apa apa-guidelines journal manuscript psychology reproducible-paper reproducible-research rmarkdown

1.8 match 662 stars 11.74 score 1.7k scripts 1 dependents

bioc

MANOR:CGH Micro-Array NORmalization

Importation, normalization, visualization, and quality control functions to correct identified sources of variability in array-CGH experiments.

Maintained by Pierre Neuvial. Last updated 5 days ago.

microarray twochannel dataimport qualitycontrol preprocessing copynumbervariation normalization

4.1 match 4.98 score 1 scripts

bioc

twoddpcr:Classify 2-d Droplet Digital PCR (ddPCR) data and quantify the number of starting molecules

The twoddpcr package takes Droplet Digital PCR (ddPCR) droplet amplitude data from Bio-Rad's QuantaSoft and can classify the droplets. A summary of the positive/negative droplet counts can be generated, which can then be used to estimate the number of molecules using the Poisson distribution. This is the first open source package that facilitates the automatic classification of general two channel ddPCR data. Previous work includes 'definetherain' (Jones et al., 2014) and 'ddpcRquant' (Trypsteen et al., 2015) which both handle one channel ddPCR experiments only. The 'ddpcr' package available on CRAN (Attali et al., 2016) supports automatic gating of a specific class of two channel ddPCR experiments only.

Maintained by Anthony Chiu. Last updated 5 months ago.

ddpcr software classification

3.5 match 10 stars 5.78 score 4 scripts

2005m

kit:Data Manipulation Functions Implemented in C

Basic functions, implemented in C, for large data manipulation. Fast vectorised ifelse()/nested if()/switch() functions, psum()/pprod() functions equivalent to pmin()/pmax() plus others which are missing from base R. Most of these functions are callable at C level.

Maintained by Morgan Jacob. Last updated 6 months ago.

openmp

2.3 match 58 stars 9.11 score 92 scripts 5 dependents

david-cortes

MatrixExtra:Extra Methods for Sparse Matrices

Extends sparse matrix and vector classes from the 'Matrix' package by providing: (a) Methods and operators that work natively on CSR formats (compressed sparse row, a.k.a. 'RsparseMatrix') such as slicing/sub-setting, assignment, rbind(), mathematical operators for CSR and COO such as addition ("+") or sqrt(), and methods such as diag(); (b) Multi-threaded matrix multiplication and cross-product for many <sparse, dense> types, including the 'float32' type from 'float'; (c) Coercion methods between pairs of classes which are not present in 'Matrix', such as 'dgCMatrix' -> 'ngRMatrix', as well as convenience conversion functions; (d) Utility functions for sparse matrices such as sorting the indices or removing zero-valued entries; (e) Fast transposes that work by outputting in the opposite storage format; (f) Faster replacements for many 'Matrix' methods for all sparse types, such as slicing and elementwise multiplication. (g) Convenience functions for sparse objects, such as 'mapSparse' or a shorter 'show' method.

Maintained by David Cortes. Last updated 9 months ago.

csr sparse-matrix openblas cpp openmp

2.3 match 20 stars 9.08 score 84 scripts 29 dependents

tripartio

ale:Interpretable Machine Learning and Statistical Inference with Accumulated Local Effects (ALE)

Accumulated Local Effects (ALE) were initially developed as a model-agnostic approach for global explanations of the results of black-box machine learning algorithms. ALE has a key advantage over other approaches like partial dependency plots (PDP) and SHapley Additive exPlanations (SHAP): its values represent a clean functional decomposition of the model. As such, ALE values are not affected by the presence or absence of interactions among variables in a mode. Moreover, its computation is relatively rapid. This package reimplements the algorithms for calculating ALE data and develops highly interpretable visualizations for plotting these ALE values. It also extends the original ALE concept to add bootstrap-based confidence intervals and ALE-based statistics that can be used for statistical inference. For more details, see Okoli, Chitu. 2023. “Statistical Inference Using Machine Learning and Classical Techniques Based on Accumulated Local Effects (ALE).” arXiv. <arXiv:2310.09877>. <doi:10.48550/arXiv.2310.09877>.

Maintained by Chitu Okoli. Last updated 27 days ago.

3.2 match 4 stars 6.41 score 27 scripts

marce10

warbleR:Streamline Bioacoustic Analysis

Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools for explore and quantify acoustic signal structure. The package allows to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.

Maintained by Marcelo Araya-Salas. Last updated 2 months ago.

animal-acoustic-signals audio-processing bioacoustics spectrogram streamline-analysis cpp

1.9 match 56 stars 10.86 score 270 scripts 4 dependents

lvclark

polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids

Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.

Maintained by Lindsay V. Clark. Last updated 10 days ago.

bioinformatics dna-sequencing genotype-likelihoods genotyping-by-sequencing hacktoberfest rad-seq rad-sequencing snp-genotyping cpp

2.9 match 28 stars 6.98 score 85 scripts

xfim

ggmcmc:Tools for Analyzing MCMC Simulations from Bayesian Inference

Tools for assessing and diagnosing convergence of Markov Chain Monte Carlo simulations, as well as for graphically display results from full MCMC analysis. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables, and functions to work with hierarchical/multilevel batches of parameters (Fernández-i-Marín, 2016 <doi:10.18637/jss.v070.i09>).

Maintained by Xavier Fernández i Marín. Last updated 2 years ago.

bayesian-data-analysis ggplot2 graphical jags mcmc stan

1.7 match 112 stars 12.02 score 1.6k scripts 8 dependents

mariarizzo

energy:E-Statistics: Multivariate Inference via the Energy of Data

E-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation, one-sample, two-sample, and multi-sample tests for comparing multivariate distributions, are implemented. Measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, k-groups and hierarchical clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods are implemented.

Maintained by Maria Rizzo. Last updated 7 months ago.

distance-correlation energy multivariate-analysis statistics cpp

1.9 match 45 stars 10.60 score 634 scripts 45 dependents

bioc

QSutils:Quasispecies Diversity

Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and exploration—functions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indices—functions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulation—functions useful for generating random viral quasispecies data.

Maintained by Mercedes Guerrero-Murillo. Last updated 5 months ago.

software genetics dnaseq geneticvariability sequencing alignment sequencematching dataimport

3.6 match 5.56 score 8 scripts 1 dependents

hadley

reshape:Flexibly Reshape Data

Flexibly restructure and aggregate data using just two functions: melt and cast.

Maintained by Hadley Wickham. Last updated 3 years ago.

2.0 match 9.83 score 21k scripts 231 dependents