Showing 200 of 3905 results
tidyverse
dplyr: A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 26 days ago.
4.8k stars 24.68 score 659k scripts 7.8k dependents
tidyverse
tidyr: Tidy Messy Data
Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).
Maintained by Hadley Wickham. Last updated 26 days ago.
1.4k stars 22.88 score 168k scripts 5.5k dependents
rcppcore
Rcpp: Seamless R and C++ Integration
The 'Rcpp' package provides R functions as well as C++ classes which offer a seamless integration of R and C++. Many R data types and objects can be mapped back and forth to C++ equivalents which facilitates both writing of new code as well as easier integration of third-party libraries. Documentation about 'Rcpp' is provided by several vignettes included in this package, via the 'Rcpp Gallery' site at <https://gallery.rcpp.org>, the paper by Eddelbuettel and Francois (2011, <doi:10.18637/jss.v040.i08>), the book by Eddelbuettel (2013, <doi:10.1007/978-1-4614-6868-4>) and the paper by Eddelbuettel and Balamuta (2018, <doi:10.1080/00031305.2017.1375990>); see 'citation("Rcpp")' for details.
Maintained by Dirk Eddelbuettel. Last updated 11 hours ago.
c-plus-plus, c-plus-plus-11, c-plus-plus-14, c-plus-plus-17, c-plus-plus-20, rcpp, cpp
755 stars 22.63 score 11k scripts 13k dependents
r-spatial
sf: Simple Features for R
Support for simple feature access, a standardized way to encode and analyze spatial vector data. Binds to 'GDAL' <doi:10.5281/zenodo.5884351> for reading and writing data, to 'GEOS' <doi:10.5281/zenodo.11396894> for geometrical operations, and to 'PROJ' <doi:10.5281/zenodo.5884394> for projection conversions and datum transformations. Uses by default the 's2' package for geometry operations on geodetic (long/lat degree) coordinates.
Maintained by Edzer Pebesma. Last updated 3 days ago.
1.4k stars 22.44 score 117k scripts 1.2k dependents
igraph
igraph: Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 5 days ago.
complex-networks, graph-algorithms, graph-theory, mathematics, network-analysis, network-graph, fortran, libxml2, glpk, openblas, cpp
584 stars 21.13 score 31k scripts 1.9k dependents
rstudio
reticulate: Interface to 'Python'
Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.
Maintained by Tomasz Kalinowski. Last updated 4 days ago.
1.7k stars 21.02 score 18k scripts 434 dependents
r-lib
testthat: Unit Testing for R
Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.
Maintained by Hadley Wickham. Last updated 29 days ago.
900 stars 20.97 score 74k scripts 465 dependents
tidyverse
readxl: Read Excel Files
Import excel files into R. Supports '.xls' via the embedded 'libxls' C library <https://github.com/libxls/libxls> and '.xlsx' via the embedded 'RapidXML' C++ library <https://rapidxml.sourceforge.net/>. Works on Windows, Mac and Linux without external dependencies.
Maintained by Jennifer Bryan. Last updated 22 days ago.
734 stars 20.85 score 160k scripts 815 dependents
lme4
lme4: Linear Mixed-Effects Models using 'Eigen' and S4
Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
Maintained by Ben Bolker. Last updated 4 days ago.
647 stars 20.68 score 35k scripts 1.5k dependents
r-lib
fs: Cross-Platform File System Operations Based on 'libuv'
A cross-platform interface to file system operations, built on top of the 'libuv' C library.
Maintained by Gábor Csárdi. Last updated 5 months ago.
370 stars 20.26 score 8.1k scripts 5.2k dependents
tidyverse
readr: Read Rectangular Text Data
The goal of 'readr' is to provide a fast and friendly way to read rectangular data (like 'csv', 'tsv', and 'fwf'). It is designed to flexibly parse many types of data found in the wild, while still cleanly failing when data unexpectedly changes.
Maintained by Jennifer Bryan. Last updated 8 months ago.
1.0k stars 20.06 score 132k scripts 2.1k dependents
apache
arrow: Integration to 'Apache' 'Arrow'
'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. This package provides an interface to the 'Arrow C++' library.
Maintained by Jonathan Keane. Last updated 2 months ago.
15k stars 19.25 score 10k scripts 82 dependents
slowkow
ggrepel: Automatically Position Non-Overlapping Text Labels with 'ggplot2'
Provides text and label geoms for 'ggplot2' that help to avoid overlapping text labels. Labels repel away from each other and away from the data points.
Maintained by Kamil Slowikowski. Last updated 5 months ago.
1.2k stars 19.20 score 37k scripts 1.2k dependents
ycphs
openxlsx: Read, Write and Edit xlsx Files
Simplifies the creation of Excel .xlsx files by providing a high level interface to writing, styling and editing worksheets. Through the use of 'Rcpp', read/write times are comparable to the 'xlsx' and 'XLConnect' packages with the added benefit of removing the dependency on Java.
Maintained by Jan Marvin Garbuszus. Last updated 2 months ago.
232 stars 19.09 score 20k scripts 277 dependents
stan-dev
rstan: R Interface to Stan
User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.
Maintained by Ben Goodrich. Last updated 12 hours ago.
bayesian-data-analysis, bayesian-inference, bayesian-statistics, mcmc, stan, cpp
1.1k stars 18.86 score 14k scripts 281 dependents
rcppcore
RcppArmadillo: 'Rcpp' Integration for the 'Armadillo' Templated Linear Algebra Library
'Armadillo' is a templated C++ linear algebra library (by Conrad Sanderson) that aims towards a good balance between speed and ease of use. Integer, floating point and complex numbers are supported, as well as a subset of trigonometric and statistics functions. Various matrix decompositions are provided through optional integration with LAPACK and ATLAS libraries. The 'RcppArmadillo' package includes the header files from the templated 'Armadillo' library. Thus users do not need to install 'Armadillo' itself in order to use 'RcppArmadillo'. From release 7.800.0 on, 'Armadillo' is licensed under Apache License 2; previous releases were licensed under MPL 2.0 from version 3.800.0 onwards and LGPL-3 prior to that; 'RcppArmadillo' (the 'Rcpp' bindings/bridge to Armadillo) is licensed under the GNU GPL version 2 or later, as is the rest of 'Rcpp'.
Maintained by Dirk Eddelbuettel. Last updated 23 hours ago.
armadillo, c-plus-plus, rcpp, rcpparmadillo, openblas, cpp, openmp
200 stars 18.85 score 1.9k scripts 3.4k dependents
r-dbi
RSQLite: SQLite Interface for R
Embeds the SQLite database engine in R and provides an interface compliant with the DBI package. The source for the SQLite engine and for various extensions in a recent version is included. System libraries will never be consulted because this package relies on static linking for the plugins it includes; this also ensures a consistent experience across all installations.
Maintained by Kirill Müller. Last updated 1 hour ago.
331 stars 18.78 score 8.1k scripts 1.1k dependents
tidyverse
haven: Import and Export 'SPSS', 'Stata' and 'SAS' Files
Import foreign statistical formats into R via the embedded 'ReadStat' C library, <https://github.com/WizardMac/ReadStat>.
Maintained by Hadley Wickham. Last updated 6 months ago.
427 stars 18.63 score 18k scripts 682 dependents
r-lib
xml2: Parse XML
Bindings to 'libxml2' for working with XML data using a simple, consistent interface based on 'XPath' expressions. Also supports XML schema validation; for 'XSLT' transformations see the 'xslt' package.
Maintained by Jeroen Ooms. Last updated 16 days ago.
220 stars 18.52 score 6.3k scripts 2.3k dependents
r-lib
roxygen2: In-Line Documentation for R
Generate your Rd documentation, 'NAMESPACE' file, and collation field using specially formatted comments. Writing documentation in-line with code makes it easier to keep your documentation up-to-date as your requirements change. 'roxygen2' is inspired by the 'Doxygen' system for C++.
Maintained by Hadley Wickham. Last updated 8 months ago.
606 stars 18.51 score 2.3k scripts 219 dependents
gagolews
stringi: Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 2 days ago.
icu, icu4c, natural-language-processing, nlp, regex, regexp, string-manipulation, stringi, stringr, text, text-processing, tidy-data, unicode, cpp
307 stars 18.42 score 10k scripts 8.7k dependents
hadley
plyr: Tools for Splitting, Applying and Combining Data
A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.
Maintained by Hadley Wickham. Last updated 5 months ago.
500 stars 18.16 score 83k scripts 3.3k dependents
tidyverse
vroom: Read and Write Rectangular Text Data Quickly
The goal of 'vroom' is to read and write data (like 'csv', 'tsv' and 'fwf') quickly. When reading it uses a quick initial indexing step, then reads the values lazily, so only the data you actually use needs to be read. The writer formats the data in parallel and writes to disk asynchronously from formatting.
Maintained by Jennifer Bryan. Last updated 7 months ago.
csv, csv-parser, fixed-width-text, tsv, tsv-parser, cpp
625 stars 17.82 score 4.5k scripts 2.1k dependents
r-lib
cpp11: A C++11 Interface for R's C Interface
Provides a header only, C++11 interface to R's C interface. Compared to other approaches 'cpp11' strives to be safe against long jumps from the C API as well as C++ exceptions, conform to normal R function semantics and supports interaction with 'ALTREP' vectors.
Maintained by Davis Vaughan. Last updated 25 days ago.
212 stars 17.69 score 104 scripts 8.6k dependents
rspatial
terra: Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 7 hours ago.
geospatial, raster, spatial, vector, onetbb, proj, gdal, geos, cpp
560 stars 17.65 score 17k scripts 856 dependents
robjhyndman
forecast: Forecasting Functions for Time Series and Linear Models
Methods and tools for displaying and analysing univariate time series forecasts including exponential smoothing via state space models and automatic ARIMA modelling.
Maintained by Rob Hyndman. Last updated 7 months ago.
forecast, forecasting, openblas, cpp
1.1k stars 17.46 score 16k scripts 240 dependents
dmlc
xgboost: Extreme Gradient Boosting
Extreme Gradient Boosting, which is an efficient implementation of the gradient boosting framework from Chen & Guestrin (2016) <doi:10.1145/2939672.2939785>. This package is its R interface. The package includes efficient linear model solver and tree learning algorithms. The package can automatically do parallel computation on a single machine which could be more than 10 times faster than existing gradient boosting packages. It supports various objective functions, including regression, classification and ranking. The package is made to be extensible, so that users are also allowed to define their own objectives easily.
Maintained by Jiaming Yuan. Last updated 2 days ago.
distributed-systems, gbdt, gbm, gbrt, machine-learning, xgboost, cpp, openmp
27k stars 17.45 score 115 dependents
dmurdoch
rgl: 3D Visualization Using OpenGL
Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.). Output may be on screen using OpenGL, or to various standard 3D file formats including WebGL, PLY, OBJ, STL as well as 2D image formats, including PNG, Postscript, SVG, PGF.
Maintained by Duncan Murdoch. Last updated 17 hours ago.
graphics, opengl, rgl, webgl, libglu, libglvnd, libpng, libx11, freetype, cpp
91 stars 17.40 score 7.3k scripts 303 dependents
ropensci
magick: Advanced Graphics and Image-Processing in R
Bindings to 'ImageMagick': the most comprehensive open-source image processing library available. Supports many common formats (png, jpeg, tiff, pdf, etc) and manipulations (rotate, scale, crop, trim, flip, blur, etc). All operations are vectorized via the Magick++ STL meaning they operate either on a single frame or a series of frames for working with layers, collages, or animation. In RStudio images are automatically previewed when printed to the console, resulting in an interactive editing environment. The latest version of the package includes a native graphics device for creating in-memory graphics or drawing onto images using pixel coordinates.
Maintained by Jeroen Ooms. Last updated 6 days ago.
image-manipulation, image-processing, imagemagick, cpp
467 stars 17.38 score 9.0k scripts 258 dependents
bioc
BiocParallel: Bioconductor facilities for parallel evaluation
This package provides modified versions and novel implementation of functions for parallel evaluation, tailored to use with Bioconductor objects.
Maintained by Martin Morgan. Last updated 1 month ago.
infrastructure, bioconductor-package, core-package, u24ca289073, cpp
67 stars 17.31 score 7.3k scripts 1.1k dependents
r-quantities
units: Measurement Units for R Vectors
Support for measurement units in R vectors, matrices and arrays: automatic propagation, conversion, derivation and simplification of units; raising errors in case of unit incompatibility. Compatible with the POSIXct, Date and difftime classes. Uses the UNIDATA udunits library and unit database for unit compatibility checking and conversion. Documentation about 'units' is provided in the paper by Pebesma, Mailund & Hiebert (2016, <doi:10.32614/RJ-2016-061>), included in this package as a vignette; see 'citation("units")' for details.
Maintained by Edzer Pebesma. Last updated 16 days ago.
181 stars 17.28 score 3.3k scripts 1.2k dependents
emmanuelparadis
ape: Analyses of Phylogenetics and Evolution
Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.
Maintained by Emmanuel Paradis. Last updated 5 days ago.
64 stars 17.27 score 13k scripts 599 dependents
rspatial
raster: Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.
Maintained by Robert J. Hijmans. Last updated 1 day ago.
163 stars 17.23 score 58k scripts 562 dependents
hadley
reshape2: Flexibly Reshape Data: A Reboot of the Reshape Package
Flexibly restructure and aggregate data using just two functions: melt and 'dcast' (or 'acast').
Maintained by Hadley Wickham. Last updated 4 years ago.
210 stars 17.19 score 94k scripts 2.0k dependents
astamm
nloptr: R Interface to NLopt
Solve optimization problems using an R interface to NLopt. NLopt is a free/open-source library for nonlinear optimization, providing a common interface for a number of different free optimization routines available online as well as original implementations of various other algorithms. See <https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/> for more information on the available algorithms. Building from included sources requires 'CMake'. On Linux and 'macOS', if a suitable system build of NLopt (2.7.0 or later) is found, it is used; otherwise, it is built from included sources via 'CMake'. On Windows, NLopt is obtained through 'rwinlib' for 'R <= 4.1.x' or grabbed from the appropriate toolchain for 'R >= 4.2.0'.
Maintained by Aymeric Stamm. Last updated 12 days ago.
107 stars 17.17 score 1.1k scripts 1.8k dependents
rstudio
promises: Abstractions for Promise-Based Asynchronous Programming
Provides fundamental abstractions for doing asynchronous programming in R using promises. Asynchronous programming is useful for allowing a single R process to orchestrate multiple tasks in the background while also attending to something else. Semantics are similar to 'JavaScript' promises, but with a syntax that is idiomatic R.
Maintained by Joe Cheng. Last updated 2 months ago.
204 stars 17.10 score 688 scripts 2.6k dependents
thomasp85
ggraph: An Implementation of Grammar of Graphics for Graphs and Networks
The grammar of graphics as implemented in ggplot2 is a poor fit for graph and network visualizations due to its reliance on tabular data input. ggraph is an extension of the ggplot2 API tailored to graph visualizations and provides the same flexible approach to building up plots layer by layer.
Maintained by Thomas Lin Pedersen. Last updated 1 year ago.
ggplot-extension, ggplot2, graph-visualization, network-visualization, visualization, cpp
1.1k stars 16.96 score 9.2k scripts 111 dependents
satijalab
Seurat: Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 year ago.
human-cell-atlas, single-cell-genomics, single-cell-rna-seq, cpp
2.4k stars 16.86 score 50k scripts 73 dependents
glmmtmb
glmmTMB: Generalized Linear Mixed Models using Template Model Builder
Fit linear and generalized linear mixed models with various extensions, including zero-inflation. The models are fitted using maximum likelihood estimation via 'TMB' (Template Model Builder). Random effects are assumed to be Gaussian on the scale of the linear predictor and are integrated out using the Laplace approximation. Gradients are calculated using automatic differentiation.
Maintained by Mollie Brooks. Last updated 9 hours ago.
314 stars 16.85 score 3.7k scripts 25 dependents
klausvigo
phangorn: Phylogenetic Reconstruction and Analysis
Allows for estimation of phylogenetic trees and networks using Maximum Likelihood, Maximum Parsimony, distance methods and Hadamard conjugation (Schliep 2011). Offers methods for tree comparison, model selection and visualization of phylogenetic networks as described in Schliep et al. (2017).
Maintained by Klaus Schliep. Last updated 20 hours ago.
software, technology, qualitycontrol, phylogenetic-analysis, phylogenetics, openblas, cpp
206 stars 16.70 score 2.5k scripts 135 dependents
sebkrantz
collapse: Advanced and Fast Data Transformation
A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
Maintained by Sebastian Krantz. Last updated 7 days ago.
data-aggregation, data-analysis, data-manipulation, data-processing, data-science, data-transformation, econometrics, high-performance, panel-data, scientific-computing, statistics, time-series, weighted, weights, cpp, openmp
672 stars 16.68 score 708 scripts 99 dependents
quanteda
quanteda: Quantitative Analysis of Textual Data
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Maintained by Kenneth Benoit. Last updated 3 months ago.
corpus, natural-language-processing, quanteda, text-analytics, onetbb, cpp
851 stars 16.65 score 5.4k scripts 52 dependents
amices
mice: Multivariate Imputation by Chained Equations
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
Maintained by Stef van Buuren. Last updated 19 hours ago.
chained-equations, fcs, imputation, mice, missing-data, missing-values, multiple-imputation, multivariate-data, cpp
462 stars 16.64 score 10k scripts 154 dependents
rapporter
pander: An R 'Pandoc' Writer
Contains some functions catching all messages, 'stdout' and other useful information while evaluating R code and other helpers to return user specified text elements (like: header, paragraph, table, image, lists etc.) in 'pandoc' markdown or several type of R objects similarly automatically transformed to markdown format. Also capable of exporting/converting (the resulting) complex 'pandoc' documents to e.g. HTML, 'PDF', 'docx' or 'odt'. This latter reporting feature is supported in brew syntax or with a custom reference class with a smarty caching 'backend'.
Maintained by Gergely Daróczi. Last updated 28 days ago.
literate-programming, markdown, pandoc, pandoc-markdown, reproducible-research, rmarkdown, cpp
297 stars 16.60 score 7.6k scripts 108 dependents
mlverse
torch: Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 3 days ago.
521 stars 16.50 score 1.4k scripts 39 dependents
bioc
fgsea: Fast Gene Set Enrichment Analysis
The package implements an algorithm for fast gene set enrichment analysis. The fast algorithm makes it possible to run more permutations and obtain finer-grained p-values, which in turn allows accurate standard approaches to multiple hypothesis correction to be used.
Maintained by Alexey Sergushichev. Last updated 9 days ago.
geneexpression, differentialexpression, genesetenrichment, pathways, cpp
392 stars 16.31 score 3.9k scripts 101 dependents
r-dbi
odbc: Connect to ODBC Compatible Databases (using the DBI Interface)
A DBI-compatible interface to ODBC databases.
Maintained by Hadley Wickham. Last updated 2 days ago.
396 stars 16.31 score 2.9k scripts 23 dependents
imbs-hl
ranger: A Fast Implementation of Random Forests
A fast implementation of Random Forests, particularly suited for high dimensional data. Ensembles of classification, regression, survival and probability prediction trees are supported. Data from genome-wide association studies can be analyzed efficiently. In addition to data frames, datasets of class 'gwaa.data' (R package 'GenABEL') and 'dgCMatrix' (R package 'Matrix') can be directly analyzed.
Maintained by Marvin N. Wright. Last updated 5 months ago.
783 stars 16.22 score 9.2k scripts 189 dependents
bioc
DESeq2: Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 24 days ago.
sequencing, rnaseq, chipseq, geneexpression, transcription, normalization, differentialexpression, bayesian, regression, principalcomponent, clustering, immunooncology, openblas, cpp
375 stars 16.11 score 17k scripts 115 dependents
jlmelville
uwot: The Uniform Manifold Approximation and Projection (UMAP) Method for Dimensionality Reduction
An implementation of the Uniform Manifold Approximation and Projection dimensionality reduction by McInnes et al. (2018) <doi:10.48550/arXiv.1802.03426>. It also provides means to transform new data and to carry out supervised dimensionality reduction. An implementation of the related LargeVis method of Tang et al. (2016) <doi:10.48550/arXiv.1602.00370> is also provided. This is a complete re-implementation in R (and C++, via the 'Rcpp' package): no Python installation is required. See the uwot website (<https://github.com/jlmelville/uwot>) for more documentation and examples.
Maintained by James Melville. Last updated 3 days ago.
dimensionality-reduction, umap, cpp
329 stars 16.08 score 2.0k scripts 145 dependents
thomasp85
ggforce: Accelerating 'ggplot2'
The aim of 'ggplot2' is to aid in visual data investigations. This focus has led to a lack of facilities for composing specialised plots. 'ggforce' aims to be a collection of mainly new stats and geoms that fills this gap. All additional functionality is aimed to come through the official extension system so using 'ggforce' should be a stable experience.
Maintained by Thomas Lin Pedersen. Last updated 4 days ago.
ggplot-extension, ggplot2, visualization, cpp
929 stars 15.98 score 9.3k scripts 298 dependents
bioc
rhdf5: R Interface to HDF5
This package provides an interface between HDF5 and R. HDF5's main features are the ability to store and access very large and/or complex datasets and a wide variety of metadata on mass storage (disk) through a completely portable file format. The rhdf5 package is thus suited for the exchange of large and/or complex datasets between R and other software packages, and for letting R applications work on datasets that are larger than the available RAM.
Maintained by Mike Smith. Last updated 5 days ago.
infrastructure, dataimport, hdf5, rhdf5, openssl, curl, zlib, cpp
62 stars 15.87 score 4.2k scripts 232 dependents
r-lib
later: Utilities for Scheduling Functions to Execute Later with Event Loops
Executes arbitrary R or C functions some time after the current time, after the R execution stack has emptied. The functions are scheduled in an event loop.
Maintained by Winston Chang. Last updated 2 months ago.
143 stars 15.86 score 234 scripts 2.6k dependents
jeroen
V8: Embedded JavaScript and WebAssembly Engine for R
An R interface to V8 <https://v8.dev>: Google's open source JavaScript and WebAssembly engine. This package can be compiled either with V8 version 6 and up or NodeJS when built as a shared library.
Maintained by Jeroen Ooms. Last updated 6 days ago.
201 stars 15.81 score 508 scripts 337 dependents
stan-dev
rstanarm: Bayesian Applied Regression Modeling via Stan
Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.
Maintained by Ben Goodrich. Last updated 10 days ago.
bayesianbayesian-data-analysisbayesian-inferencebayesian-methodsbayesian-statisticsmultilevel-modelsrstanrstanarmstanstatistical-modelingcpp
393 stars 15.70 score 5.0k scripts 13 dependents
rcppcore
RcppEigen:'Rcpp' Integration for the 'Eigen' Templated Linear Algebra Library
R and 'Eigen' integration using 'Rcpp'. 'Eigen' is a C++ template library for linear algebra: matrices, vectors, numerical solvers and related algorithms. It supports dense and sparse matrices on integer, floating point and complex numbers, decompositions of such matrices, and solutions of linear systems. Its performance on many algorithms is comparable with some of the best implementations based on 'Lapack' and level-3 'BLAS'. The 'RcppEigen' package includes the header files from the 'Eigen' C++ template library. Thus users do not need to install 'Eigen' itself in order to use 'RcppEigen'. Since version 3.1.1, 'Eigen' is licensed under the Mozilla Public License (version 2); earlier versions were licensed under the GNU LGPL version 3 or later. 'RcppEigen' (the 'Rcpp' bindings/bridge to 'Eigen') is licensed under the GNU GPL version 2 or later, as is the rest of 'Rcpp'.
Maintained by Dirk Eddelbuettel. Last updated 7 months ago.
algorithmc-plus-pluseigeneigen-libraryopenblascpp
114 stars 15.66 score 356 scripts 3.8k dependents
r-lib
systemfonts:System Native Font Finding
Provides system native access to the font catalogue. As font handling varies between systems it is difficult to correctly locate installed fonts across different operating systems. The 'systemfonts' package provides bindings to the native libraries on Windows, macOS and Linux for finding font files that can then be used further by e.g. graphic devices. The main use is intended to be from compiled code but 'systemfonts' also provides access from R.
Maintained by Thomas Lin Pedersen. Last updated 2 months ago.
95 stars 15.62 score 384 scripts 990 dependents
mhahsler
dbscan:Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms
A fast reimplementation of several density-based algorithms of the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise) and HDBSCAN (hierarchical DBSCAN), the ordering algorithm OPTICS (ordering points to identify the clustering structure), shared nearest neighbor clustering, and the outlier detection algorithms LOF (local outlier factor) and GLOSH (global-local outlier score from hierarchies). The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided. Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>.
Maintained by Michael Hahsler. Last updated 2 months ago.
clusteringdbscandensity-based-clusteringhdbscanlofopticscpp
324 stars 15.60 score 1.6k scripts 85 dependents
prophet:Automatic Forecasting Procedure
Implements a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
Maintained by Sean Taylor. Last updated 5 months ago.
19k stars 15.59 score 976 scripts 13 dependents
r-lib
svglite:An 'SVG' Graphics Device
A graphics device for R that produces 'Scalable Vector Graphics'. 'svglite' is a fork of the older 'RSvgDevice' package.
Maintained by Thomas Lin Pedersen. Last updated 5 months ago.
181 stars 15.57 score 4.7k scripts 228 dependents
rstudio
httpuv:HTTP and WebSocket Server Library
Provides low-level socket and protocol support for handling HTTP and WebSocket requests directly from within R. It is primarily intended as a building block for other packages, rather than making it particularly easy to create complete web applications using httpuv alone. httpuv is built on top of the libuv and http-parser C libraries, both of which were developed by Joyent, Inc. (See LICENSE file for libuv and http-parser license information.)
Maintained by Winston Chang. Last updated 10 days ago.
236 stars 15.42 score 708 scripts 2.1k dependents
bioc
Rsamtools:Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import
This package provides an interface to the 'samtools', 'bcftools', and 'tabix' utilities for manipulating SAM (Sequence Alignment / Map), FASTA, binary variant call (BCF) and compressed indexed tab-delimited (tabix) files.
Maintained by Bioconductor Package Maintainer. Last updated 4 months ago.
dataimportsequencingcoveragealignmentqualitycontrolbioconductor-packagecore-packagecurlbzip2xz-utilszlibcpp
28 stars 15.34 score 3.2k scripts 569 dependents
dankelley
oce:Analysis of Oceanographic Data
Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.
Maintained by Dan Kelley. Last updated 2 days ago.
146 stars 15.34 score 4.2k scripts 18 dependents
rstudio
sass:Syntactically Awesome Style Sheets ('Sass')
An 'SCSS' compiler, powered by the 'LibSass' library. With this, R developers can use variables, inheritance, and functions to generate dynamic style sheets. The package uses the 'Sass CSS' extension language, which is stable, powerful, and CSS compatible.
Maintained by Carson Sievert. Last updated 11 months ago.
100 stars 15.30 score 252 scripts 4.3k dependents
xrobin
pROC:Display and Analyze ROC Curves
Tools for visualizing, smoothing and comparing receiver operating characteristic (ROC) curves. (Partial) area under the curve (AUC) can be compared with statistical tests based on U-statistics or bootstrap. Confidence intervals can be computed for (p)AUC or ROC curves.
Maintained by Xavier Robin. Last updated 5 months ago.
bootstrappingcovariancehypothesis-testingmachine-learningplotplottingrocroc-curvevariancecpp
125 stars 15.18 score 16k scripts 445 dependents
r-lib
ragg:Graphic Devices Based on AGG
Anti-Grain Geometry (AGG) is a high-quality and high-performance 2D drawing library. The 'ragg' package provides a set of graphic devices based on AGG to use as an alternative to the raster devices provided through the 'grDevices' package.
Maintained by Thomas Lin Pedersen. Last updated 11 days ago.
drawinggraphicsvector-graphicsfreetypelibpngtifflibjpeg-turbocpp
175 stars 15.18 score 1.8k scripts 485 dependents
trevorhastie
glmnet:Lasso and Elastic-Net Regularized Generalized Linear Models
Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.
Maintained by Trevor Hastie. Last updated 2 years ago.
82 stars 15.15 score 22k scripts 736 dependents
adeverse
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 9 days ago.
40 stars 15.10 score 2.2k scripts 257 dependents
kosukeimai
MatchIt:Nonparametric Preprocessing for Parametric Causal Inference
Selects matched samples of the original treated and control groups with similar covariate distributions -- can be used to match exactly on covariates, to match on propensity scores, or perform a variety of other matching procedures. The package also implements a series of recommendations offered in Ho, Imai, King, and Stuart (2007) <DOI:10.1093/pan/mpl013>. (The 'gurobi' package, which is not on CRAN, is optional and comes with an installation of the Gurobi Optimizer, available at <https://www.gurobi.com>.)
Maintained by Noah Greifer. Last updated 14 days ago.
220 stars 15.03 score 2.4k scripts 21 dependents
r-lib
isoband:Generate Isolines and Isobands from Regularly Spaced Elevation Grids
A fast C++ implementation to generate contour lines (isolines) and contour polygons (isobands) from regularly spaced grids containing elevation data.
Maintained by Hadley Wickham. Last updated 2 years ago.
132 stars 15.01 score 75 scripts 7.6k dependents
mjskay
ggdist:Visualizations of Distributions and Uncertainty
Provides primitives for visualizing distributions using 'ggplot2' that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized. Visualization primitives include but are not limited to: points with multiple uncertainty intervals, eye plots (Spiegelhalter D., 1999) <https://ideas.repec.org/a/bla/jorssa/v162y1999i1p45-58.html>, density plots, gradient plots, dot plots (Wilkinson L., 1999) <doi:10.1080/00031305.1999.10474474>, quantile dot plots (Kay M., Kola T., Hullman J., Munson S., 2016) <doi:10.1145/2858036.2858558>, complementary cumulative distribution function barplots (Fernandes M., Walls L., Munson S., Hullman J., Kay M., 2018) <doi:10.1145/3173574.3173718>, and fit curves with multiple uncertainty ribbons.
Maintained by Matthew Kay. Last updated 4 months ago.
ggplot2uncertaintyuncertainty-visualizationvisualizationcpp
859 stars 14.95 score 3.1k scripts 62 dependents
philchalmers
mirt:Multidimensional Item Response Theory
Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.
Maintained by Phil Chalmers. Last updated 2 days ago.
212 stars 14.93 score 2.5k scripts 40 dependents
rcppcore
RcppParallel:Parallel Programming Tools for 'Rcpp'
High level functions for parallel programming with 'Rcpp'. For example, the 'parallelFor()' function can be used to convert the work of a standard serial "for" loop into a parallel one and the 'parallelReduce()' function can be used for accumulating aggregate or other values.
Maintained by Kevin Ushey. Last updated 10 days ago.
174 stars 14.89 score 215 scripts 800 dependents
r-dbi
RPostgres:C++ Interface to PostgreSQL
Fully DBI-compliant C++-backed interface to PostgreSQL <https://www.postgresql.org/>, an open-source relational database.
Maintained by Kirill Müller. Last updated 1 month ago.
338 stars 14.78 score 1.6k scripts 31 dependents
thomasp85
tidygraph:A Tidy API for Graph Manipulation
A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, and also provides tidy interfaces to many common graph algorithms.
Maintained by Thomas Lin Pedersen. Last updated 2 months ago.
graph-algorithmsgraph-manipulationigraphnetwork-analysistidyversecpp
553 stars 14.74 score 4.6k scripts 136 dependents
lrberge
fixest:Fast Fixed-Effects Estimations
Fast and user-friendly estimation of econometric models with multiple fixed-effects. Includes ordinary least squares (OLS), generalized linear models (GLM) and the negative binomial. The core of the package is based on optimized parallel C++ code, scaling especially well for large data sets. The method to obtain the fixed-effects coefficients is based on Berge (2018) <https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf>. Further provides tools to export and view the results of several estimations with intuitive design to cluster the standard-errors.
Maintained by Laurent Berge. Last updated 7 months ago.
394 stars 14.69 score 3.8k scripts 26 dependents
vincentarelbundock
marginaleffects:Predictions, Comparisons, Slopes, Marginal Means, and Hypothesis Tests
Compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds, etc.) for over 100 classes of statistical and machine learning models in R. Conduct linear and non-linear hypothesis tests, or equivalence tests. Calculate uncertainty estimates using the delta method, bootstrapping, or simulation-based inference. Details can be found in Arel-Bundock, Greifer, and Heiss (2024) <doi:10.18637/jss.v111.i09>.
Maintained by Vincent Arel-Bundock. Last updated 3 days ago.
509 stars 14.56 score 1.8k scripts 10 dependents
alexcb
rjson:JSON for R
Converts R objects into JSON objects and vice versa.
Maintained by Alex Couture-Beil. Last updated 6 months ago.
21 stars 14.54 score 7.4k scripts 852 dependents
r-lib
clock:Date-Time Types and Tools
Provides a comprehensive library for date-time manipulations using a new family of orthogonal date-time classes (durations, time points, zoned-times, and calendars) that partition responsibilities so that the complexities of time zones are only considered when they are really needed. Capabilities include: date-time parsing, formatting, arithmetic, extraction and updating of components, and rounding.
Maintained by Davis Vaughan. Last updated 12 days ago.
106 stars 14.53 score 296 scripts 407 dependents
ropensci
osmdata:Import 'OpenStreetMap' Data as Simple Features or Spatial Objects
Download and import of 'OpenStreetMap' ('OSM') data as 'sf' or 'sp' objects. 'OSM' data are extracted from the 'Overpass' web server (<https://overpass-api.de/>) and processed with very fast 'C++' routines for return to 'R'.
Maintained by Mark Padgham. Last updated 1 month ago.
open0street0mapopenstreetmapoverpass0apiosmcpposm-dataoverpass-apipeer-reviewedcpp
322 stars 14.53 score 2.8k scripts 14 dependents
r-lidar
lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications
Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.
Maintained by Jean-Romain Roussel. Last updated 2 months ago.
alsforestrylaslazlidarpoint-cloudremote-sensingopenblascppopenmp
623 stars 14.47 score 844 scripts 8 dependents
statistikat
VIM:Visualization and Imputation of Missing Values
New tools for the visualization of missing and/or imputed values are introduced, which can be used for exploring the data and the structure of the missing and/or imputed values. Depending on the structure of the missing values, the corresponding methods may help to identify the mechanism generating the missing values and allow exploration of the data including missing values. In addition, the quality of imputation can be visually explored using various univariate, bivariate, multiple and multivariate plot methods. A graphical user interface available in the separate package VIMGUI allows easy handling of the implemented plot methods.
Maintained by Matthias Templ. Last updated 8 months ago.
hotdeckimputation-methodsmodel-predictionsvisualizationcpp
85 stars 14.44 score 2.6k scripts 19 dependents
davidgohel
ggiraph:Make 'ggplot2' Graphics Interactive
Create interactive 'ggplot2' graphics using 'htmlwidgets'.
Maintained by David Gohel. Last updated 1 day ago.
822 stars 14.37 score 4.1k scripts 35 dependents
bioc
xcms:LC-MS and GC-MS Data Analysis
Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.
Maintained by Steffen Neumann. Last updated 15 days ago.
immunooncologymassspectrometrymetabolomicsbioconductorfeature-detectionmass-spectrometrypeak-detectioncpp
196 stars 14.31 score 984 scripts 11 dependents
thomasp85
farver:High Performance Colour Space Manipulation
The encoding of colour can be handled in many different ways, using different colour spaces. As different colour spaces have different uses, efficient conversion between these representations is important. The 'farver' package provides a set of functions that gives access to very fast colour space conversion and comparisons implemented in C++, and offers speed improvements over the 'convertColor' function in the 'grDevices' package.
Maintained by Thomas Lin Pedersen. Last updated 11 months ago.
136 stars 14.22 score 164 scripts 8.0k dependents
bioc
GOSemSim:GO-terms Semantic Similarity Measures
The semantic comparisons of Gene Ontology (GO) annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products and gene clusters. GOSemSim implements five methods, proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationgoclusteringpathwaysnetworksoftwarebioinformaticsgene-ontologysemantic-similaritycpp
63 stars 14.12 score 708 scripts 68 dependents
qsbase
qs:Quick Serialization of R Objects
Provides functions for quickly writing and reading any R object to and from disk.
Maintained by Travers Ching. Last updated 8 days ago.
compressiondata-storageencodingserializationlibzstdlz4cpp
417 stars 14.05 score 2.5k scripts 51 dependents
r-lib
fastmap:Fast Data Structures
Fast implementation of data structures, including a key-value store, stack, and queue. Environments are commonly used as key-value stores in R, but every time a new key is used, it is added to R's global symbol table, causing a small amount of memory leakage. This can be problematic in cases where many different keys are used. Fastmap avoids this memory leak issue by implementing the map using data structures in C++.
Maintained by Winston Chang. Last updated 11 months ago.
134 stars 14.04 score 102 scripts 5.5k dependents
bertcarnell
lhs:Latin Hypercube Samples
Provides a number of methods for creating and augmenting Latin Hypercube Samples and Orthogonal Array Latin Hypercube Samples.
Maintained by Rob Carnell. Last updated 9 months ago.
latin-hypercubelatin-hypercube-samplelatin-hypercube-samplinglhsorthogonal-arrayscpp
45 stars 14.04 score 1.5k scripts 110 dependents
jkrijthe
Rtsne:T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation
An R wrapper around the fast T-distributed Stochastic Neighbor Embedding implementation by Van der Maaten (see <https://github.com/lvdmaaten/bhtsne/> for more information on the original implementation).
Maintained by Jesse Krijthe. Last updated 10 months ago.
256 stars 14.01 score 4.4k scripts 233 dependents
r-forge
survey:Analysis of Complex Survey Samples
Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.
Maintained by Thomas Lumley. Last updated 1 day ago.
1 star 13.94 score 13k scripts 234 dependents
eddelbuettel
anytime:Anything to 'POSIXct' or 'Date' Converter
Convert input in any one of character, integer, numeric, factor, or ordered type into 'POSIXct' (or 'Date') objects, using one of a number of predefined formats, and relying on Boost facilities for date and time parsing.
Maintained by Dirk Eddelbuettel. Last updated 17 days ago.
boostc-plus-plus-11conversionscpp11datedatetimeposixctrcppcpp
165 stars 13.91 score 1.4k scripts 99 dependents
gbm-developers
gbm:Generalized Boosted Regression Models
An implementation of extensions to Freund and Schapire's AdaBoost algorithm and Friedman's gradient boosting machine. Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart). Originally developed by Greg Ridgeway. Newer version available at github.com/gbm-developers/gbm3.
Maintained by Greg Ridgeway. Last updated 9 months ago.
52 stars 13.85 score 6.8k scripts 91 dependents
config-i1
smooth:Forecasting Using State Space Models
Functions implementing Single Source of Error state space models for purposes of time series analysis and forecasting. The package includes ADAM (Svetunkov, 2023, <https://openforecast.org/adam/>), Exponential Smoothing (Hyndman et al., 2008, <doi:10.1007/978-3-540-71918-2>), SARIMA (Svetunkov & Boylan, 2019, <doi:10.1080/00207543.2019.1600764>), Complex Exponential Smoothing (Svetunkov & Kourentzes, 2018, <doi:10.13140/RG.2.2.24986.29123>), Simple Moving Average (Svetunkov & Petropoulos, 2018, <doi:10.1080/00207543.2017.1380326>) and several simulation functions. It also allows dealing with intermittent demand based on the iETS framework (Svetunkov & Boylan, 2019, <doi:10.13140/RG.2.2.35897.06242>).
Maintained by Ivan Svetunkov. Last updated 12 days ago.
arimaarima-forecastingcesetsexponential-smoothingforecaststate-spacetime-seriesopenblascpp
90 stars 13.83 score 412 scripts 25 dependents
duckdb
duckdb:DBI Package for the DuckDB Database Management System
The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and an R Database Interface (DBI) connector.
Maintained by Kirill Müller. Last updated 10 days ago.
159 stars 13.80 score 1.7k scripts 46 dependents
rspatial
geosphere:Spherical Trigonometry
Spherical trigonometry for geographic applications. That is, compute distances and related measures for angular (longitude/latitude) locations.
Maintained by Robert J. Hijmans. Last updated 6 months ago.
36 stars 13.79 score 5.7k scripts 116 dependents
r-spatial
s2:Spherical Geometry Operators Using the S2 Geometry Library
Provides R bindings for Google's s2 library for geometric calculations on the sphere. High-performance constructors and exporters provide high compatibility with existing spatial packages, transformers construct new geometries from existing geometries, predicates provide a means to select geometries based on spatial relationships, and accessors extract information about geometries.
Maintained by Edzer Pebesma. Last updated 12 days ago.
74 stars 13.76 score 207 scripts 1.2k dependents
immunogenomics
harmony:Fast, Sensitive, and Accurate Integration of Single Cell Data
Implementation of the Harmony algorithm for single cell integration, described in Korsunsky et al. <doi:10.1038/s41592-019-0619-0>. The package includes a standalone Harmony function and interfaces to external frameworks.
Maintained by Ilya Korsunsky. Last updated 5 months ago.
algorithmdata-integrationscrna-seqopenblascpp
554 stars 13.74 score 5.5k scripts 8 dependents
richarddmorey
BayesFactor:Computation of Bayes Factors for Common Designs
A suite of functions for computing various Bayes factors for simple designs, including contingency tables, one- and two-sample designs, one-way designs, general ANOVA designs, and linear regression.
Maintained by Richard D. Morey. Last updated 1 year ago.
132 stars 13.71 score 1.7k scripts 21 dependents
knausb
vcfR:Manipulate and Visualize VCF Data
Facilitates easy manipulation of variant call format (VCF) data. Functions are provided to rapidly read from and write to VCF files. Once VCF data is read into R a parser function extracts matrices of data. This information can then be used for quality control or other purposes. Additional functions provide visualization of genomic data. Once processing is complete data may be written to a VCF file (*.vcf.gz). It also may be converted into other popular R objects (e.g., genlight, DNAbin). VcfR provides a link between VCF data and familiar R software.
Maintained by Brian J. Knaus. Last updated 1 month ago.
genomicspopulation-geneticspopulation-genomicsrcppvcf-datavisualizationzlibcpp
256 stars 13.66 score 3.1k scripts 19 dependents
janmarvin
openxlsx2:Read, Write and Edit 'xlsx' Files
Simplifies the creation of 'xlsx' files by providing a high level interface to writing, styling and editing worksheets.
Maintained by Jan Marvin Garbuszus. Last updated 11 hours ago.
139 stars 13.64 score 194 scripts 11 dependents
r-lib
textshaping:Bindings to the 'HarfBuzz' and 'Fribidi' Libraries for Text Shaping
Provides access to the text shaping functionality in the 'HarfBuzz' library and the bidirectional algorithm in the 'Fribidi' library. 'textshaping' is a low-level utility package mainly for graphic devices that expands upon the font tool-set provided by the 'systemfonts' package.
Maintained by Thomas Lin Pedersen. Last updated 2 months ago.
19 stars 13.58 score 66 scripts 484 dependents
tidyverts
fable:Forecasting Models for Tidy Time Series
Provides a collection of commonly used univariate and multivariate time series forecasting models including automatically selected exponential smoothing (ETS) and autoregressive integrated moving average (ARIMA) models. These models work within the 'fable' framework provided by the 'fabletools' package, which provides the tools to evaluate, visualise, and combine models in a workflow consistent with the tidyverse.
Maintained by Mitchell OHara-Wild. Last updated 4 months ago.
569 stars 13.54 score 2.1k scripts 6 dependents
asgr
imager:Image Processing Library Based on 'CImg'
Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.
Maintained by Aaron Robotham. Last updated 5 days ago.
17 stars 13.53 score 2.4k scripts 44 dependents
dselivanov
text2vec:Modern Text Mining Framework for R
Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities. This package provides a source-agnostic streaming API, which allows researchers to perform analysis of collections of documents which are larger than available RAM. All core functions are parallelized to benefit from multicore machines.
Maintained by Dmitriy Selivanov. Last updated 8 months ago.
glovelatent-dirichlet-allocationnatural-language-processingtext-miningtopic-modelingvectorizationword-embeddingsword2veccpp
860 stars 13.48 score 1.3k scripts 23 dependents
kkholst
mets:Analysis of Multivariate Event Times
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>, including multivariate cumulative incidence models <doi:10.1002/sim.6016> and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
Maintained by Klaus K. Holst. Last updated 19 hours ago.
multivariate-time-to-eventsurvival-analysistime-to-eventfortranopenblascpp
14 stars 13.46 score 236 scripts 42 dependents
ironholds
urltools:Vectorised Tools for URL Handling and Parsing
A toolkit for all URL-handling needs, including encoding and decoding, parsing, parameter extraction and modification. All functions are designed to be both fast and entirely vectorised. It is intended to be useful for people dealing with web-related datasets, such as server-side logs, although it may be useful for other situations involving large sets of URLs.
Maintained by Os Keyes. Last updated 4 years ago.
131 stars 13.43 score 968 scripts 264 dependents
ropensci
tokenizers:Fast, Consistent Tokenization of Natural Language Text
Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, Penn Treebank, regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the 'stringi' and 'Rcpp' packages for fast yet correct tokenization in 'UTF-8'.
Maintained by Thomas Charlon. Last updated 1 year ago.
nlppeer-reviewedtext-miningtokenizercpp
186 stars 13.33 score 1.1k scripts 81 dependents
chjackson
flexsurv:Flexible Parametric Survival and Multi-State Models
Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.
Maintained by Christopher Jackson. Last updated 2 months ago.
57 stars 13.31 score 632 scripts 43 dependents
ropensci
hunspell:High-Performance Stemmer, Tokenizer, and Spell Checker
Low level spell checker and morphological analyzer based on the famous 'hunspell' library <https://hunspell.github.io>. The package can analyze or check individual words as well as parse text, latex, html or xml documents. For a more user-friendly interface use the 'spelling' package which builds on this package to automate checking of files, documentation and vignettes in all common formats.
Maintained by Jeroen Ooms. Last updated 7 days ago.
hunspellspell-checkspellcheckerstemmertokenizercpp
112 stars 13.23 score 422 scripts 30 dependents
bioc
dada2:Accurate, high-resolution sample inference from amplicon sequencing data
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
Maintained by Benjamin Callahan. Last updated 5 months ago.
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioconductorbioinformaticsmetabarcodingtaxonomycpp
487 stars 13.17 score 3.0k scripts 4 dependentsfstpackage
fst:Lightning Fast Serialization of Data Frames
Multithreaded serialization of compressed data frames using the 'fst' format. The 'fst' format allows for full random access of stored data and a wide range of compression settings using the LZ4 and ZSTD compressors.
Maintained by Mark Klik. Last updated 6 months ago.
compressiondata-framedata-storagecpp
624 stars 13.16 score 1.9k scripts 56 dependentsdaqana
dqrng:Fast Pseudo Random Number Generators
Several fast random number generators are provided as C++ header only libraries: The PCG family by O'Neill (2014 <https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf>) as well as the Xoroshiro / Xoshiro family by Blackman and Vigna (2021 <doi:10.1145/3460772>). In addition fast functions for generating random numbers according to a uniform, normal and exponential distribution are included. The latter two use the Ziggurat algorithm originally proposed by Marsaglia and Tsang (2000, <doi:10.18637/jss.v005.i08>). The fast sampling methods support unweighted sampling both with and without replacement. These functions are exported to R and as a C++ interface and are enabled for use with the default 64 bit generator from the PCG family, Xoroshiro128+/++/** and Xoshiro256+/++/** as well as the 64 bit version of the 20 rounds Threefry engine (Salmon et al., 2011, <doi:10.1145/2063384.2063405>) as provided by the package 'sitmo'.
Maintained by Ralf Stubner. Last updated 7 months ago.
randomrandom-distributionsrandom-generationrandom-samplingrngcpp
42 stars 13.12 score 188 scripts 183 dependentsbioc
pcaMethods:A collection of PCA methods
Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.
Maintained by Henning Redestig. Last updated 5 months ago.
49 stars 13.10 score 538 scripts 73 dependentsropensci
pdftools:Text Extraction, Rendering and Converting of PDF Documents
Utilities based on 'libpoppler' <https://poppler.freedesktop.org> for extracting text, fonts, attachments and metadata from a PDF file. Also supports high quality rendering of PDF documents into PNG, JPEG, TIFF format, or into raw bitmap vectors for further processing in R.
Maintained by Jeroen Ooms. Last updated 26 days ago.
pdf-filespdf-formatpdftoolspopplerpoppler-librarytext-extractioncpp
529 stars 13.10 score 3.3k scripts 47 dependentsbioc
scran:Methods for Single-Cell RNA-Seq Data Analysis
Implements miscellaneous functions for interpretation of single-cell RNA-seq data. Methods are provided for assignment of cell cycle phase, detection of highly variable and significantly correlated genes, identification of marker genes, and other common tasks in routine single-cell analysis workflows.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncologynormalizationsequencingrnaseqsoftwaregeneexpressiontranscriptomicssinglecellclusteringbioconductor-packagehuman-cell-atlassingle-cell-rna-seqopenblascpp
41 stars 13.05 score 7.6k scripts 37 dependentsbiodiverse
unmarked:Models for Data from Unmarked Animals
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 10 days ago.
4 stars 13.02 score 652 scripts 12 dependentsr-forge
tm:Text Mining Package
A framework for text mining applications within R.
Maintained by Kurt Hornik. Last updated 1 month ago.
13.00 score 14k scripts 100 dependentsdavidcsterratt
geometry:Mesh Generation and Surface Tessellation
Makes the 'Qhull' library <http://www.qhull.org> available in R, in a similar manner as in Octave and MATLAB. Qhull computes convex hulls, Delaunay triangulations, halfspace intersections about a point, Voronoi diagrams, furthest-site Delaunay triangulations, and furthest-site Voronoi diagrams. It runs in 2D, 3D, 4D, and higher dimensions. It implements the Quickhull algorithm for computing the convex hull. Qhull does not support constrained Delaunay triangulations, or mesh generation of non-convex objects, but the package does include some R functions that allow for this.
Maintained by David C. Sterratt. Last updated 2 months ago.
16 stars 12.98 score 776 scripts 139 dependentsnimble-dev
nimble:MCMC, Particle Filtering, and Programmable Hierarchical Modeling
A system for writing hierarchical statistical models largely compatible with 'BUGS' and 'JAGS', writing nimbleFunctions to operate models and do basic R-style math, and compiling both models and nimbleFunctions via custom-generated C++. 'NIMBLE' includes default methods for MCMC, Laplace Approximation, Monte Carlo Expectation Maximization, and some other tools. The nimbleFunction system makes it easy to do things like implement new MCMC samplers from R, customize the assignment of samplers to different parts of a model from R, and compile the new samplers automatically via C++ alongside the samplers 'NIMBLE' provides. 'NIMBLE' extends the 'BUGS'/'JAGS' language by making it extensible: New distributions and functions can be added, including as calls to external compiled code. Although most people think of MCMC as the main goal of the 'BUGS'/'JAGS' language for writing models, one can use 'NIMBLE' for writing arbitrary other kinds of model-generic algorithms as well. A full User Manual is available at <https://r-nimble.org>.
Maintained by Christopher Paciorek. Last updated 17 days ago.
bayesian-inferencebayesian-methodshierarchical-modelsmcmcprobabilistic-programmingopenblascpp
169 stars 12.97 score 2.6k scripts 19 dependentsr-spatial
lwgeom:Bindings to Selected 'liblwgeom' Functions for Simple Features
Access to selected functions found in 'liblwgeom' <https://github.com/postgis/postgis/tree/master/liblwgeom>, the light-weight geometry library used by 'PostGIS' <http://postgis.net/>.
Maintained by Edzer Pebesma. Last updated 2 months ago.
61 stars 12.95 score 1.7k scripts 66 dependentsopenair-project
openair:Tools for the Analysis of Air Pollution Data
Tools to analyse, interpret and understand air pollution data. Data are typically regular time series and air quality measurement, meteorological data and dispersion model output can be analysed. The package is described in Carslaw and Ropkins (2012, <doi:10.1016/j.envsoft.2011.09.008>) and subsequent papers.
Maintained by David Carslaw. Last updated 1 day ago.
air-qualityair-quality-datameteorologyopenaircpp
316 stars 12.94 score 1.2k scripts 12 dependentsthomasp85
tweenr:Interpolate Data for Smooth Animations
In order to create smooth animation between states of data, tweening is necessary. This package provides a range of functions for creating tweened data that can be used as a basis for animation. Furthermore, it adds a number of vectorized interpolators for common R data types such as numeric, date and colour.
Maintained by Thomas Lin Pedersen. Last updated 1 year ago.
animationplottingtransitiontweeningcpp
399 stars 12.93 score 440 scripts 324 dependentscvxgrp
CVXR:Disciplined Convex Optimization
An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.
Maintained by Anqi Fu. Last updated 5 months ago.
207 stars 12.89 score 768 scripts 51 dependentspaleolimbot
wk:Lightweight Well-Known Geometry Parsing
Provides a minimal R and C++ API for parsing well-known binary and well-known text representation of geometries to and from R-native formats. Well-known binary is compact and fast to parse; well-known text is human-readable and is useful for writing tests. These formats are useful in R only if the information they contain can be accessed in R, for which high-performance functions are provided here.
Maintained by Dewey Dunnington. Last updated 6 months ago.
47 stars 12.85 score 89 scripts 1.2k dependentsbioc
SingleR:Reference-Based Single-Cell RNA-Seq Annotation
Performs unbiased cell type recognition from single-cell RNA sequencing data, by leveraging reference transcriptomic datasets of pure cell types to infer the cell of origin of each single cell independently.
Maintained by Aaron Lun. Last updated 1 month ago.
softwaresinglecellgeneexpressiontranscriptomicsclassificationclusteringannotationbioconductorsinglercpp
184 stars 12.83 score 2.1k scripts 2 dependentstkonopka
umap:Uniform Manifold Approximation and Projection
Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).
Maintained by Tomasz Konopka. Last updated 11 months ago.
dimensionality-reductionumapcpp
132 stars 12.82 score 3.6k scripts 45 dependentsdavidgohel
gdtools:Utilities for Graphical Rendering and Fonts Management
Tools are provided to compute metrics of formatted strings and to check the availability of a font. Another set of functions is provided to support the collection of fonts from 'Google Fonts' in a cache. Their use is simple within 'R Markdown' documents and 'shiny' applications but also with graphic productions generated with the 'ggiraph', 'ragg' and 'svglite' packages or with tabular productions from the 'flextable' package.
Maintained by David Gohel. Last updated 4 days ago.
26 stars 12.80 score 234 scripts 152 dependentsspedygiorgio
markovchain:Easy Handling Discrete Time Markov Chains
Functions and S4 methods to create and manage discrete time Markov chains more easily. In addition, functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural properties) analysis are provided. See Spedicato (2017) <doi:10.32614/RJ-2017-036>. Some functions for continuous-time Markov chains depend on the suggested ctmcd package.
Maintained by Giorgio Alfredo Spedicato. Last updated 5 months ago.
ctmcdtmcmarkov-chainmarkov-modelr-programmingrcppopenblascpp
104 stars 12.78 score 712 scripts 4 dependentsbioc
mzR:parser for netCDF, mzXML and mzML and mzIdentML files (mass spectrometry data)
mzR provides a unified API to the common file formats and parsers available for mass spectrometry data. It comes with a subset of the proteowizard library for mzXML, mzML and mzIdentML. The netCDF reading code has previously been used in XCMS.
Maintained by Steffen Neumann. Last updated 2 months ago.
immunooncologyinfrastructuredataimportproteomicsmetabolomicsmassspectrometryzlibcpp
45 stars 12.77 score 204 scripts 44 dependentsbioc
EBImage:Image processing and analysis toolbox for R
EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.
Maintained by Andrzej Oleś. Last updated 5 months ago.
visualizationbioinformaticsimage-analysisimage-processingcpp
71 stars 12.77 score 1.5k scripts 33 dependentsbioc
MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Maintained by Laurent Gatto. Last updated 15 days ago.
immunooncologyinfrastructureproteomicsmassspectrometryqualitycontroldataimportbioconductorbioinformaticsmass-spectrometryproteomics-datavisualisationcpp
131 stars 12.76 score 772 scripts 36 dependentsropensci
readODS:Read and Write ODS Files
Read ODS (OpenDocument Spreadsheet) files into R as data frames. Also supports writing data frames to ODS files.
Maintained by Chung-hong Chan. Last updated 3 months ago.
55 stars 12.74 score 808 scripts 26 dependentscovaruber
sommer:Solving Mixed Model Equations in R
Structural multivariate-univariate linear mixed model solver for estimation of multiple random effects with unknown variance-covariance structures (e.g., heterogeneous and unstructured) and known covariance among levels of random effects (e.g., pedigree and genomic relationship matrices) (Covarrubias-Pazaran, 2016 <doi:10.1371/journal.pone.0156744>; Maier et al., 2015 <doi:10.1016/j.ajhg.2014.12.006>; Jensen et al., 1997). REML estimates can be obtained using the Direct-Inversion Newton-Raphson and Direct-Inversion Average Information algorithms for the problems r x r (r being the number of records) or using the Henderson-based average information algorithm for the problem c x c (c being the number of coefficients to estimate). Spatial models can also be fitted using the two-dimensional spline functionality available.
Maintained by Giovanny Covarrubias-Pazaran. Last updated 2 days ago.
average-informationmixed-modelsrcpparmadilloopenblascppopenmp
44 stars 12.63 score 300 scripts 10 dependentsbstewart
stm:Estimation of the Structural Topic Model
The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions. Methods developed in Roberts et al. (2014) <doi:10.1111/ajps.12103> and Roberts et al. (2016) <doi:10.1080/01621459.2016.1141684>. Vignette is Roberts et al. (2019) <doi:10.18637/jss.v091.i02>.
Maintained by Brandon Stewart. Last updated 1 year ago.
404 stars 12.63 score 1.6k scripts 6 dependentsbioc
SNPRelate:Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data
Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package SNPRelate to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers efficient operations specifically designed for integers with two bits, since a SNP could occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphism (indel) and structural variation calls in whole-genome and whole-exome variant data.
Maintained by Xiuwen Zheng. Last updated 5 months ago.
infrastructuregeneticsstatisticalmethodprincipalcomponentbioinformaticsgds-formatpcasimdsnpopenblascpp
105 stars 12.57 score 1.6k scripts 19 dependentsr-dbi
bigrquery:An Interface to Google's 'BigQuery' 'API'
Easily talk to Google's 'BigQuery' database from R.
Maintained by Hadley Wickham. Last updated 1 month ago.
520 stars 12.47 score 1.8k scripts 4 dependentsr-spatialecology
landscapemetrics:Landscape Metrics for Categorical Map Patterns
Calculates landscape metrics for categorical landscape patterns in a tidy workflow. 'landscapemetrics' reimplements the most common metrics from 'FRAGSTATS' (<https://www.fragstats.org/>) and new ones from the current literature on landscape metrics. This package supports 'terra' SpatRaster objects as input arguments. It further provides utility functions to visualize patches, select metrics and building blocks to develop new metrics.
Maintained by Maximilian H.K. Hesselbarth. Last updated 2 months ago.
landscape-ecologylandscape-metricsrasterspatialcpp
240 stars 12.47 score 584 scripts 4 dependentsdrostlab
philentropy:Similarity and Distance Quantification Between Probability Functions
Computes 46 optimized distance and similarity measures for comparing probability functions (Drost (2018) <doi:10.21105/joss.00765>). These comparisons between probability functions have their foundations in a broad range of scientific disciplines from mathematics to ecology. The aim of this package is to provide a core framework for clustering, classification, statistical inference, goodness-of-fit, non-parametric statistics, information theory, and machine learning tasks that are based on comparing univariate or multivariate probability functions.
Maintained by Hajk-Georg Drost. Last updated 4 months ago.
distance-measuresdistance-quantificationinformation-theoryjensen-shannon-divergenceparametric-distributionssimilarity-measuresstatisticscpp
137 stars 12.44 score 484 scripts 24 dependentsyixuan
RSpectra:Solvers for Large-Scale Eigenvalue and SVD Problems
R interface to the 'Spectra' library <https://spectralib.org/> for large-scale eigenvalue and SVD problems. It is typically used to compute a few eigenvalues/vectors of an n by n matrix, e.g., the k largest eigenvalues, which is usually more efficient than eigen() if k << n. This package provides the 'eigs()' function, which does a job similar to its counterparts in 'Matlab', 'Octave', 'Python SciPy' and 'Julia'. It also provides the 'svds()' function to calculate the largest k singular values and corresponding singular vectors of a real matrix. The matrix to be computed on can be dense, sparse, or in the form of an operator defined by the user.
Maintained by Yixuan Qiu. Last updated 8 months ago.
eigenvaluesspectrasvdopenblascpp
81 stars 12.40 score 394 scripts 433 dependentsschochastics
graphlayouts:Additional Layout Algorithms for Network Visualizations
Several new layout algorithms to visualize networks are provided which are not part of 'igraph'. Most are based on the concept of stress majorization by Gansner et al. (2004) <doi:10.1007/978-3-540-31843-9_25>. Some more specific algorithms allow the user to emphasize hidden group structures in networks or focus on specific nodes.
Maintained by David Schoch. Last updated 2 months ago.
ggraphgraph-algorithmsnetwork-analysisnetwork-visualizationcpp
277 stars 12.38 score 322 scripts 115 dependentsasardaes
dtwclust:Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance
Time series clustering along with optimized techniques related to the Dynamic Time Warping distance and its corresponding lower bounds. Implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.
Maintained by Alexis Sarda. Last updated 8 months ago.
clusteringdtwtime-seriesopenblascpp
262 stars 12.35 score 406 scripts 14 dependentsjuliainterop
JuliaCall:Seamless Integration Between R and 'Julia'
Provides an R interface to 'Julia', which is a high-level, high-performance dynamic programming language for numerical computing, see <https://julialang.org/> for more information. It provides a high-level interface as well as a low-level interface. Using the high level interface, you could call any 'Julia' function just like any R function with automatic type conversion. Using the low level interface, you could deal with C-level SEXP directly while enjoying the convenience of using a high-level programming language like 'Julia'.
Maintained by Changcheng Li. Last updated 4 months ago.
270 stars 12.33 score 380 scripts 8 dependentseddelbuettel
RcppTOML:'Rcpp' Bindings to Parser for "Tom's Obvious Markup Language"
The configuration format defined by 'TOML' (which expands to "Tom's Obvious Markup Language") specifies an excellent format (described at <https://toml.io/en/>) suitable for both human editing as well as the common uses of a machine-readable format. This package uses 'Rcpp' to connect to the 'toml++' parser written by Mark Gillard to R.
Maintained by Dirk Eddelbuettel. Last updated 21 days ago.
c-plus-plus-11tomltoml-parsertoml-parsingcpp
36 stars 12.32 score 124 scripts 433 dependentsjefferislab
RANN:Fast Nearest Neighbour Search (Wraps ANN Library) Using L2 Metric
Finds the k nearest neighbours for every point in a given dataset in O(N log N) time using Arya and Mount's ANN library (v1.1.3). There is support for approximate as well as exact searches, fixed radius searches and 'bd' as well as 'kd' trees. The distance is computed using the L2 (Euclidean) metric. Please see package 'RANN.L1' for the same functionality using the L1 (Manhattan, taxicab) metric.
Maintained by Gregory Jefferis. Last updated 7 months ago.
ann-librarynearest-neighborsnearest-neighbourscpp
58 stars 12.31 score 1.3k scripts 193 dependentsalexkz
kernlab:Kernel-Based Machine Learning Lab
Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods 'kernlab' includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.
Maintained by Alexandros Karatzoglou. Last updated 8 months ago.
21 stars 12.26 score 7.8k scripts 487 dependentsbioc
bsseq:Analyze, manage and store whole-genome methylation data
A collection of tools for analyzing and visualizing whole-genome methylation data from sequencing. This includes whole-genome bisulfite sequencing and Oxford nanopore data.
Maintained by Kasper Daniel Hansen. Last updated 3 months ago.
37 stars 12.26 score 676 scripts 15 dependentsalexiosg
rugarch:Univariate GARCH Models
ARFIMA, in-mean, external regressors and various GARCH flavors, with methods for fit, forecast, simulation, inference and plotting.
Maintained by Alexios Galanos. Last updated 3 months ago.
26 stars 12.25 score 1.3k scripts 16 dependentsmarkmfredrickson
optmatch:Functions for Optimal Matching
Distance based bipartite matching using minimum cost flow, oriented to matching of treatment and control groups in observational studies ('Hansen' and 'Klopfer' 2006 <doi:10.1198/106186006X137047>). Routines are provided to generate distances from generalised linear models (propensity score matching), formulas giving variables on which to limit matched distances, stratified or exact matching directives, or calipers, alone or in combination.
Maintained by Josh Errickson. Last updated 4 months ago.
47 stars 12.22 score 588 scripts 5 dependentsr-dbi
RMariaDB:Database Interface and MariaDB Driver
Implements a DBI-compliant interface to MariaDB (<https://mariadb.org/>) and MySQL (<https://www.mysql.com/>) databases.
Maintained by Kirill Müller. Last updated 1 month ago.
133 stars 12.20 score 792 scripts 10 dependentssteffenmoritz
imputeTS:Time Series Missing Value Imputation
Imputation (replacement) of missing values in univariate time series. Offers several imputation functions and missing data plots. Available imputation algorithms include: 'Mean', 'LOCF', 'Interpolation', 'Moving Average', 'Seasonal Decomposition', 'Kalman Smoothing on Structural Time Series models', 'Kalman Smoothing on ARIMA models'. Published in Moritz and Bartz-Beielstein (2017) <doi:10.32614/RJ-2017-009>.
Maintained by Steffen Moritz. Last updated 3 years ago.
data-visualizationimputationimputation-algorithmimputetsmissing-datatime-seriescpp
162 stars 12.18 score 1.9k scripts 27 dependentsstuart-lab
Signac:Analysis of Single-Cell Chromatin Data
A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.
Maintained by Tim Stuart. Last updated 7 months ago.
atacbioinformaticssingle-cellzlibcpp
355 stars 12.18 score 3.7k scripts 1 dependentsbioc
glmGamPoi:Fit a Gamma-Poisson Generalized Linear Model
Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.
Maintained by Constantin Ahlmann-Eltze. Last updated 12 days ago.
regressionrnaseqsoftwaresinglecellgamma-poissonglmnegative-binomial-regressionon-diskopenblascpp
111 stars 12.16 score 1.0k scripts 4 dependentsr-lib
lobstr:Visualize R Data Structures with Trees
A set of tools for inspecting and understanding R data structures inspired by str(). Includes ast() for visualizing abstract syntax trees, ref() for showing shared references, cst() for showing call stack trees, and obj_size() for computing object sizes.
Maintained by Hadley Wickham. Last updated 1 year ago.
305 stars 12.15 score 732 scripts 95 dependentsopenpharma
mmrm:Mixed Models for Repeated Measures
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.
Maintained by Daniel Sabanes Bove. Last updated 22 days ago.
138 stars 12.15 score 113 scripts 4 dependentsrstudio
shinytest2:Testing for Shiny Applications
Automated unit testing of Shiny applications through a headless 'Chromium' browser.
Maintained by Barret Schloerke. Last updated 3 days ago.
108 stars 12.13 score 704 scripts 1 dependentsvspinu
timechange:Efficient Manipulation of Date-Times
Efficient routines for manipulation of date-time objects while accounting for time-zones and daylight saving times. The package includes utilities for updating of date-time components (year, month, day etc.), modification of time-zones, rounding of date-times, period addition and subtraction etc. Parts of the 'CCTZ' source code, released under the Apache 2.0 License, are included in this package. See <https://github.com/google/cctz> for more details.
Maintained by Vitalie Spinu. Last updated 1 year ago.
ceilingdate-timeperiodroundingtimetime-zonesupdatecpp
30 stars 12.12 score 68 scripts 1.9k dependentsbioc
SeqArray:Data management of large-scale whole-genome sequence variant calls using GDS files
Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.
Maintained by Xiuwen Zheng. Last updated 6 days ago.
infrastructuredatarepresentationsequencinggeneticsbioinformaticsgds-formatsnpsnvweswgscpp
45 stars 12.11 score 1.1k scripts 9 dependentsbioc
BiocSingular:Singular Value Decomposition for Bioconductor Packages
Implements exact and approximate methods for singular value decomposition and principal components analysis, in a framework that allows them to be easily switched within Bioconductor packages or workflows. Where possible, parallelization is achieved using the BiocParallel framework.
Maintained by Aaron Lun. Last updated 5 months ago.
softwaredimensionreductionprincipalcomponentbioconductor-packagehuman-cell-atlassingular-value-decompositioncpp
7 stars 12.10 score 1.2k scripts 103 dependentsstephens999
ashr:Methods for Adaptive Shrinkage, using Empirical Bayes
The R package 'ashr' implements an Empirical Bayes approach for large-scale hypothesis testing and false discovery rate (FDR) estimation based on the methods proposed in M. Stephens, 2016, "False discovery rates: a new deal", <DOI:10.1093/biostatistics/kxw041>. These methods can be applied whenever two sets of summary statistics---estimated effects and standard errors---are available, just as 'qvalue' can be applied to previously computed p-values. Two main interfaces are provided: ash(), which is more user-friendly; and ash.workhorse(), which has more options and is geared toward advanced users. Both ash() and ash.workhorse() also provide a flexible modeling interface that can accommodate a variety of likelihoods (e.g., normal, Poisson) and mixture priors (e.g., uniform, normal).
Maintained by Peter Carbonetto. Last updated 11 months ago.
82 stars 12.10 score 780 scripts 15 dependentsbioc
ShortRead:FASTQ input and manipulation
This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
dataimportsequencingqualitycontrolbioconductor-packagecore-packagezlibcpp
8 stars 12.08 score 1.8k scripts 49 dependentsjolars
eulerr:Area-Proportional Euler and Venn Diagrams with Ellipses
Generate area-proportional Euler diagrams using numerical optimization. An Euler diagram is a generalization of a Venn diagram, relaxing the criterion that all interactions need to be represented. Diagrams may be fit with ellipses and circles via a wide range of inputs and can be visualized in numerous ways.
Maintained by Johan Larsson. Last updated 1 year ago.
euler-diagramvenn-diagramopenblascpp
131 stars 12.08 score 1.2k scripts 5 dependentsbioc
sparseMatrixStats:Summary Statistics for Rows and Columns of Sparse Matrices
High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.
Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.
infrastructure software datarepresentation cpp
54 stars 11.98 score 174 scripts 130 dependents
eddelbuettel
RcppAnnoy:'Rcpp' Bindings for 'Annoy', a Library for Approximate Nearest Neighbors
'Annoy' is a small C++ library for Approximate Nearest Neighbors written for efficient memory usage as well as an ability to load from / save to disk. This package provides an R interface by relying on the 'Rcpp' package, exposing the same interface as the original Python wrapper to 'Annoy'. See <https://github.com/spotify/annoy> for more on 'Annoy'. 'Annoy' is released under Version 2.0 of the Apache License. Also included is a small Windows port of 'mmap' which is released under the MIT license.
Maintained by Dirk Eddelbuettel. Last updated 21 days ago.
annoy nearest nearest-neighbors cpp
72 stars 11.97 score 57 scripts 147 dependents
exaexa
scattermore:Scatterplots with More Points
C-based conversion of large scatterplot data to rasters plus other operations such as data blurring or data alpha blending. Speeds up plotting of data with millions of points.
Maintained by Mirek Kratochvil. Last updated 1 year ago.
performance plot scatterplot visualization cpp
244 stars 11.95 score 596 scripts 85 dependents
hadley
pryr:Tools for Computing on the Language
Useful tools to pry back the covers of R and understand the language at a deeper level.
Maintained by Hadley Wickham. Last updated 1 year ago.
204 stars 11.93 score 1.9k scripts 57 dependents
rspatial
dismo:Species Distribution Modeling
Methods for species distribution modeling, that is, predicting the environmental similarity of any site to that of the locations of known occurrences of a species.
Maintained by Robert J. Hijmans. Last updated 4 months ago.
25 stars 11.88 score 2.8k scripts 21 dependents
kaneplusplus
bigmemory:Manage Massive Matrices with Shared Memory and Memory-Mapped Files
Create, store, access, and manipulate massive matrices. Matrices are allocated to shared memory and may use memory-mapped files. Packages 'biganalytics', 'bigtabulate', 'synchronicity', and 'bigalgebra' provide advanced functionality.
Maintained by Michael J. Kane. Last updated 1 year ago.
127 stars 11.87 score 920 scripts 64 dependents
epiforecasts
EpiNow2:Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.
Maintained by Sebastian Funk. Last updated 1 month ago.
backcalculation covid-19 gaussian-processes open-source reproduction-number stan cpp
123 stars 11.86 score 210 scripts
jackstat
ModelMetrics:Rapid Calculation of Model Metrics
Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.
Maintained by Tyler Hunt. Last updated 4 years ago.
auc logloss machine-learning metrics model-evaluation model-metrics cpp
29 stars 11.83 score 1.3k scripts 306 dependents
apache
nanoarrow:Interface to the 'nanoarrow' 'C' Library
Provides an 'R' interface to the 'nanoarrow' 'C' library and the 'Apache Arrow' application binary interface. Functions to import and export 'ArrowArray', 'ArrowSchema', and 'ArrowArrayStream' 'C' structures to and from 'R' objects are provided alongside helpers to facilitate zero-copy data transfer among 'R' bindings to libraries implementing the 'Arrow' 'C' data interface.
Maintained by Dewey Dunnington. Last updated 17 hours ago.
185 stars 11.83 score 37 scripts 27 dependents
bnosac
udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit
This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser. These include functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
Maintained by Jan Wijffels. Last updated 2 years ago.
conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r-pkg rcpp text-mining tokenizer udpipe cpp
215 stars 11.83 score 1.2k scripts 9 dependents
bioc
methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results
methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.
Maintained by Altuna Akalin. Last updated 29 days ago.
dnamethylation sequencing methylseq genome-biology methylation statistical-analysis visualization curl bzip2 xz-utils zlib cpp
220 stars 11.80 score 578 scripts 3 dependents
r-lib
archive:Multi-Format Archive and Compression Support
Bindings to 'libarchive' <http://www.libarchive.org>, the multi-format archive and compression library. Offers R connections and direct extraction for many archive formats including 'tar', 'ZIP', '7-zip', 'RAR', 'CAB' and compression formats including 'gzip', 'bzip2', 'compress', 'lzma' and 'xz'.
Maintained by Gábor Csárdi. Last updated 5 days ago.
compression connections libarchive cpp
144 stars 11.80 score 494 scripts 27 dependents
tiledb-inc
tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays
The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-value stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.
Maintained by Isaiah Norton. Last updated 1 day ago.
array hdfs s3 storage-manager tiledb cpp
108 stars 11.79 score 306 scripts 4 dependents
kevinushey
sourcetools:Tools for Reading, Tokenizing and Parsing R Code
Tools for Reading, Tokenizing and Parsing R Code.
Maintained by Kevin Ushey. Last updated 2 years ago.
78 stars 11.77 score 32 scripts 1.8k dependents
r-forge
minqa:Derivative-Free Optimization Algorithms by Quadratic Approximation
Derivative-free optimization by quadratic approximation based on an interface to Fortran implementations by M. J. D. Powell.
Maintained by Katharine M. Mullen. Last updated 3 months ago.
1 star 11.73 score 227 scripts 1.7k dependents
prioritizr
prioritizr:Systematic Conservation Prioritization in R
Systematic conservation prioritization using mixed integer linear programming (MILP). It provides a flexible interface for building and solving conservation planning problems. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. By using exact algorithm solvers, solutions can be generated that are guaranteed to be optimal (or within a pre-specified optimality gap). Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management actions or zones, meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. To solve large-scale or complex conservation planning problems, users should install the Gurobi optimization software (available from <https://www.gurobi.com/>) and the 'gurobi' R package (see Gurobi Installation Guide vignette for details). Users can also install the IBM CPLEX software (<https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer>) and the 'cplexAPI' R package (available at <https://github.com/cran/cplexAPI>). Additionally, the 'rcbc' R package (available at <https://github.com/dirkschumacher/rcbc>) can be used to generate solutions using the CBC optimization software (<https://github.com/coin-or/Cbc>). For further details, see Hanson et al. (2025) <doi:10.1111/cobi.14376>.
Maintained by Richard Schuster. Last updated 1 day ago.
biodiversity conservation conservation-planner optimization prioritization solver spatial cpp
124 stars 11.71 score 584 scripts 2 dependents
satijalab
SeuratObject:Data Structures for Single Cell Data
Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.
Maintained by Paul Hoffman. Last updated 2 years ago.
25 stars 11.69 score 1.2k scripts 88 dependents
wilkelab
gridtext:Improved Text Rendering Support for 'Grid' Graphics
Provides support for rendering of formatted text using 'grid' graphics. Text can be formatted via a minimal subset of 'Markdown', 'HTML', and inline 'CSS' directives, and it can be rendered both with and without word wrap.
Maintained by Brenton M. Wiernik. Last updated 1 year ago.
96 stars 11.66 score 344 scripts 208 dependents
twolodzko
extraDistr:Additional Univariate and Multivariate Distributions
Density, distribution function, quantile function and random generation for a number of univariate and multivariate distributions. This package implements the following distributions: Bernoulli, beta-binomial, beta-negative binomial, beta prime, Bhattacharjee, Birnbaum-Saunders, bivariate normal, bivariate Poisson, categorical, Dirichlet, Dirichlet-multinomial, discrete gamma, discrete Laplace, discrete normal, discrete uniform, discrete Weibull, Frechet, gamma-Poisson, generalized extreme value, Gompertz, generalized Pareto, Gumbel, half-Cauchy, half-normal, half-t, Huber density, inverse chi-squared, inverse-gamma, Kumaraswamy, Laplace, location-scale t, logarithmic, Lomax, multivariate hypergeometric, multinomial, negative hypergeometric, non-standard beta, normal mixture, Poisson mixture, Pareto, power, reparametrized beta, Rayleigh, shifted Gompertz, Skellam, slash, triangular, truncated binomial, truncated normal, truncated Poisson, Tukey lambda, Wald, zero-inflated binomial, zero-inflated negative binomial, zero-inflated Poisson.
Maintained by Tymoteusz Wolodzko. Last updated 24 days ago.
c-plus-plus c-plus-plus-11 distribution multivariate-distributions probability random-generation rcpp statistics cpp
53 stars 11.60 score 1.5k scripts 107 dependents
luca-scr
GA:Genetic Algorithms
Flexible general-purpose toolbox implementing genetic algorithms (GAs) for stochastic optimisation. Binary, real-valued, and permutation representations are available to optimize a fitness function, i.e. a function provided by users depending on their objective function. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performances. Local search using general-purpose optimisation algorithms can be applied stochastically to exploit interesting regions. GAs can be run sequentially or in parallel, using an explicit master-slave parallelisation or a coarse-grain islands approach. For more details see Scrucca (2013) <doi:10.18637/jss.v053.i04> and Scrucca (2017) <doi:10.32614/RJ-2017-008>.
Maintained by Luca Scrucca. Last updated 7 months ago.
genetic-algorithm optimisation cpp
93 stars 11.58 score 624 scripts 52 dependents
declaredesign
estimatr:Fast Estimators for Design-Based Inference
Fast procedures for a small set of commonly-used, design-appropriate estimators with robust standard errors and confidence intervals. Includes estimators for linear regression, instrumental variables regression, difference-in-means, Horvitz-Thompson estimation, and regression improving precision of experimental estimates by interacting treatment with centered pre-treatment covariates introduced by Lin (2013) <doi:10.1214/12-AOAS583>.
Maintained by Graeme Blair. Last updated 2 months ago.
133 stars 11.58 score 1.7k scripts 11 dependents
edwinth
padr:Quickly Get Datetime Data Ready for Analysis
Transforms datetime data into a format ready for analysis. It offers two core functionalities: aggregating data to a higher level interval (thicken) and imputing records where observations were absent (pad).
Maintained by Edwin Thoen. Last updated 4 months ago.
132 stars 11.55 score 428 scripts 20 dependents
urbananalyst
dodgr:Distances on Directed Graphs
Distances on dual-weighted directed graphs using priority-queue shortest paths (Padgham (2019) <doi:10.32866/6945>). Weighted directed graphs have weights from A to B which may differ from those from B to A. Dual-weighted directed graphs have two sets of such weights. A canonical example is a street network to be used for routing in which routes are calculated by weighting distances according to the type of way and mode of transport, yet lengths of routes must be calculated from direct distances.
Maintained by Mark Padgham. Last updated 19 hours ago.
distance openstreetmap router shortest-paths street-networks cpp
129 stars 11.52 score 229 scripts 4 dependents
tylermorganwall
rayshader:Create Maps and Visualize Data in 2D and 3D
Uses a combination of raytracing and multiple hill shading methods to produce 2D and 3D data visualizations and maps. Includes water detection and layering functions, programmable color palette generation, several built-in textures for hill shading, 2D and 3D plotting options, a built-in path tracer, 'Wavefront' OBJ file export, and the ability to save 3D visualizations to a 3D printable format.
Maintained by Tyler Morgan-Wall. Last updated 2 months ago.
2.1k stars 11.51 score 1.5k scripts 5 dependents
wenjie2wang
splines2:Regression Spline Functions and Classes
Constructs basis functions of B-splines, M-splines, I-splines, convex splines (C-splines), periodic splines, natural cubic splines, generalized Bernstein polynomials, their derivatives, and integrals (except C-splines) by closed-form recursive formulas. It also contains a C++ header-only library integrated with Rcpp. See Wang and Yan (2021) <doi:10.6339/21-JDS1020> for details.
Maintained by Wenjie Wang. Last updated 24 days ago.
derivative integral rcpp splines openblas cpp
43 stars 11.46 score 394 scripts 34 dependents
bioc
msa:Multiple Sequence Alignment
The 'msa' package provides a unified R/Bioconductor interface to the multiple sequence alignment algorithms ClustalW, ClustalOmega, and Muscle. All three algorithms are integrated in the package; therefore, they do not depend on any external software tools and are available for all major platforms. The multiple sequence alignment algorithms are complemented by a function for pretty-printing multiple sequence alignments using the LaTeX package TeXshade.
Maintained by Ulrich Bodenhofer. Last updated 1 month ago.
multiplesequencealignment alignment multiplecomparison sequencing cpp
17 stars 11.46 score 744 scripts 6 dependents
privefl
bigsnpr:Analysis of Massive SNP Arrays
Easy-to-use, efficient, flexible and scalable tools for analyzing massive SNP arrays. Privé et al. (2018) <doi:10.1093/bioinformatics/bty185>.
Maintained by Florian Privé. Last updated 23 days ago.
big-data bioinformatics memory-mapped-file parallel-computing polygenic-scores population-structure-inference snp-data statistical-methods openblas zlib cpp openmp
200 stars 11.44 score 1.5k scripts 3 dependents
eddelbuettel
RProtoBuf:R Interface to the 'Protocol Buffers' 'API' (Version 2 or 3)
Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal 'RPC' protocols and file formats. Additional documentation is available in two included vignettes, one of which corresponds to our 'JSS' paper (2016, <doi:10.18637/jss.v071.i02>). A sufficiently recent version of the 'Protocol Buffers' library is required; currently version 3.3.0 from 2017 is the stated minimum.
Maintained by Dirk Eddelbuettel. Last updated 13 days ago.
c-plus-plus protocol-buffers protobuf cpp
73 stars 11.44 score 126 scripts 21 dependents
bioc
destiny:Creates diffusion maps
Create and plot diffusion maps.
Maintained by Philipp Angerer. Last updated 4 months ago.
cellbiology cellbasedassays clustering software visualization diffusion-maps dimensionality-reduction cpp
82 stars 11.44 score 792 scripts 1 dependents
r-simmer
simmer:Discrete-Event Simulation for R
A process-oriented and trajectory-based Discrete-Event Simulation (DES) package for R. It is designed as a generic yet powerful framework. The architecture encloses a robust and fast simulation core written in 'C++' with automatic monitoring capabilities. It provides a rich and flexible R API that revolves around the concept of trajectory, a common path in the simulation model for entities of the same type. Documentation about 'simmer' is provided by several vignettes included in this package, via the paper by Ucar, Smeets & Azcorra (2019, <doi:10.18637/jss.v090.i02>), and the paper by Ucar, Hernández, Serrano & Azcorra (2018, <doi:10.1109/MCOM.2018.1700960>); see 'citation("simmer")' for details.
Maintained by Iñaki Ucar. Last updated 6 months ago.
223 stars 11.43 score 440 scripts 6 dependents
sachaepskamp
qgraph:Graph Plotting Methods, Psychometric Data Visualization and Graphical Model Estimation
Fork of qgraph - Weighted network visualization and analysis, as well as Gaussian graphical model computation. See Epskamp et al. (2012) <doi:10.18637/jss.v048.i04>.
Maintained by Sacha Epskamp. Last updated 1 year ago.
69 stars 11.43 score 1.2k scripts 63 dependents
epimodel
EpiModel:Mathematical Modeling of Infectious Disease Dynamics
Tools for simulating mathematical models of infectious disease dynamics. Epidemic model classes include deterministic compartmental models, stochastic individual-contact models, and stochastic network models. Network models use the robust statistical methods of exponential-family random graph models (ERGMs) from the Statnet suite of software packages in R. Standard templates for epidemic modeling include SI, SIR, and SIS disease types. EpiModel features an API for extending these templates to address novel scientific research aims. Full methods for EpiModel are detailed in Jenness et al. (2018, <doi:10.18637/jss.v084.i08>).
Maintained by Samuel Jenness. Last updated 2 months ago.
agent-based-modeling epidemics epidemiology infectious-diseases network-graph cpp
250 stars 11.43 score 315 scripts