Showing 200 of 2319 results

r-lib

scales: Scale Functions for Visualization

Graphical scales map data to aesthetics, and provide methods for automatically determining breaks and labels for axes and legends.

Maintained by Thomas Lin Pedersen. Last updated 5 months ago.

ggplot2

41.0 match 419 stars 19.88 score 88k scripts 7.9k dependents

r-forge

car: Companion to Applied Regression

Functions to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage, 2019.

Maintained by John Fox. Last updated 5 months ago.

20.4 match 15.29 score 43k scripts 901 dependents

cran

Compositional: Compositional Data Analysis

Regression, classification, contour plots, hypothesis testing and fitting of distributions for compositional data are some of the functions included. We further include functions for percentages (or proportions). The standard textbook for such data is John Aitchison's (1986) "The statistical analysis of compositional data". Relevant papers include: a) Tsagris M.T., Preston S. and Wood A.T.A. (2011). "A data-based power transformation for compositional data". Fourth International Workshop on Compositional Data Analysis. <doi:10.48550/arXiv.1106.1451>. b) Tsagris M. (2014). "The k-NN algorithm for compositional data: a revised approach with and without zero values present". Journal of Data Science, 12(3): 519--534. <doi:10.6339/JDS.201407_12(3).0008>. c) Tsagris M. (2015). "A novel, divergence based, regression for compositional data". Proceedings of the 28th Panhellenic Statistics Conference, 15-18 April 2015, Athens, Greece, 430--444. <doi:10.48550/arXiv.1511.07600>. d) Tsagris M. (2015). "Regression analysis with compositional data containing zero values". Chilean Journal of Statistics, 6(2): 47--57. <https://soche.cl/chjs/volumes/06/02/Tsagris(2015).pdf>. e) Tsagris M., Preston S. and Wood A.T.A. (2016). "Improved supervised classification for compositional data using the alpha-transformation". Journal of Classification, 33(2): 243--261. <doi:10.1007/s00357-016-9207-5>. f) Tsagris M., Preston S. and Wood A.T.A. (2017). "Nonparametric hypothesis testing for equality of means on the simplex". Journal of Statistical Computation and Simulation, 87(2): 406--422. <doi:10.1080/00949655.2016.1216554>. g) Tsagris M. and Stewart C. (2018). "A Dirichlet regression model for compositional data with zeros". Lobachevskii Journal of Mathematics, 39(3): 398--412. <doi:10.1134/S1995080218030198>. h) Alenazi A. (2019). "Regression for compositional data with compositional data as predictor variables with or without zero values". Journal of Data Science, 17(1): 219--238. <doi:10.6339/JDS.201901_17(1).0010>. i) Tsagris M. and Stewart C. (2020). "A folded model for compositional data analysis". Australian and New Zealand Journal of Statistics, 62(2): 249--277. <doi:10.1111/anzs.12289>. j) Alenazi A.A. (2022). "f-divergence regression models for compositional data". Pakistan Journal of Statistics and Operation Research, 18(4): 867--882. <doi:10.18187/pjsor.v18i4.3969>. k) Tsagris M. and Stewart C. (2022). "A Review of Flexible Transformations for Modeling Compositional Data". In Advances and Innovations in Statistics and Data Science, pp. 225--234. <doi:10.1007/978-3-031-08329-7_10>. l) Alenazi A. (2023). "A review of compositional data analysis and recent advances". Communications in Statistics--Theory and Methods, 52(16): 5535--5567. <doi:10.1080/03610926.2021.2014890>. m) Tsagris M., Alenazi A. and Stewart C. (2023). "Flexible non-parametric regression models for compositional response data with zeros". Statistics and Computing, 33(106). <doi:10.1007/s11222-023-10277-5>. n) Tsagris M. (2025). "Constrained least squares simplicial-simplicial regression". Statistics and Computing, 35(27). <doi:10.1007/s11222-024-10560-z>. o) Sevinc V. and Tsagris M. (2024). "Energy Based Equality of Distributions Testing for Compositional Data". <doi:10.48550/arXiv.2412.05199>.

Maintained by Michail Tsagris. Last updated 2 months ago.

55.7 match 3 stars 3.64 score 4 dependents

afialkowski

SimMultiCorrData: Simulation of Correlated Data with Multiple Variable Types

Generate continuous (normal or non-normal), binary, ordinal, and count (Poisson or Negative Binomial) variables with a specified correlation matrix. It can also produce a single continuous variable. This package can be used to simulate data sets that mimic real-world situations (e.g. clinical or genetic data sets, plasmodes). All variables are generated from standard normal variables with an imposed intermediate correlation matrix. Continuous variables are simulated by specifying mean, variance, skewness, standardized kurtosis, and fifth and sixth standardized cumulants using either Fleishman's third-order (<DOI:10.1007/BF02293811>) or Headrick's fifth-order (<DOI:10.1016/S0167-9473(02)00072-5>) polynomial transformation. Binary and ordinal variables are simulated using a modification of the ordsample() function from 'GenOrd'. Count variables are simulated using the inverse cdf method. There are two simulation pathways which differ primarily according to the calculation of the intermediate correlation matrix. In Correlation Method 1, the intercorrelations involving count variables are determined using a simulation based, logarithmic correlation correction (adapting Yahav and Shmueli's 2012 method, <DOI:10.1002/asmb.901>). In Correlation Method 2, the count variables are treated as ordinal (adapting Barbiero and Ferrari's 2015 modification of GenOrd, <DOI:10.1002/asmb.2072>). There is an optional error loop that corrects the final correlation matrix to be within a user-specified precision value of the target matrix. The package also includes functions to calculate standardized cumulants for theoretical distributions or from real data sets, check if a target correlation matrix is within the possible correlation bounds (given the distributions of the simulated variables), summarize results (numerically or graphically), verify valid power method pdfs, and calculate lower standardized kurtosis bounds.

Maintained by Allison Cynthia Fialkowski. Last updated 7 years ago.

25.9 match 12 stars 7.58 score 44 scripts 6 dependents

juba

obsplot: Create Charts with 'Observable Plot'

Creation of charts with the 'Observable Plot' 'JavaScript' library.

Maintained by Julien Barnier. Last updated 2 years ago.

27.7 match 60 stars 5.08 score 10 scripts

r-forge

tm: Text Mining Package

A framework for text mining applications within R.

Maintained by Kurt Hornik. Last updated 26 days ago.

cpp

9.8 match 12.96 score 14k scripts 101 dependents

briencj

asremlPlus: Augments 'ASReml-R' in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences

Assists in automating the selection of terms to include in mixed models when 'asreml' is used to fit the models. Procedures are available for choosing models that conform to the hierarchy or marginality principle, and for fitting and choosing between two-dimensional spatial models using correlation, natural cubic smoothing spline and P-spline models. A history of the fitting of a sequence of models is kept in a data frame. It can also be used to compute functions and contrasts of predictions, to investigate differences between them, and to plot predictions obtained using any model fitting function. The content falls into the following natural groupings: (i) Data, (ii) Model modification functions, (iii) Model selection and description functions, (iv) Model diagnostics and simulation functions, (v) Prediction production and presentation functions, (vi) Response transformation functions, (vii) Object manipulation functions, and (viii) Miscellaneous functions (for further details see 'asremlPlus-package' in help). The 'asreml' package provides a computationally efficient algorithm for fitting a wide range of linear mixed models using Residual Maximum Likelihood. It is a commercial package and a license for it can be purchased from 'VSNi' <https://vsni.co.uk/> as 'asreml-R', who will supply a zip file for local installation/updating (see <https://asreml.kb.vsni.co.uk/>). It is not needed for functions that are methods for 'alldiffs' and 'data.frame' objects. The package 'asremlPlus' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 28 days ago.

asreml mixed-models

12.3 match 19 stars 9.34 score 200 scripts

tidyverse

dplyr: A Grammar of Data Manipulation

A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

Maintained by Hadley Wickham. Last updated 13 days ago.

data-manipulation grammar cpp

4.3 match 4.8k stars 24.68 score 659k scripts 7.8k dependents

fberding

aifeducation: Artificial Intelligence for Education

In social and educational settings, the use of Artificial Intelligence (AI) is a challenging task. Relevant data is often only available in handwritten forms, or the use of data is restricted by privacy policies. This often leads to small data sets. Furthermore, in the educational and social sciences, data is often unbalanced in terms of frequencies. To support educators as well as educational and social researchers in using the potential of AI for their work, this package provides a unified interface for neural nets in 'PyTorch' to deal with natural language problems. In addition, the package ships with a shiny app, providing a graphical user interface. This makes AI usable by people without skills in writing Python/R scripts. The tools integrate existing mathematical and statistical methods for dealing with small data sets via pseudo-labeling (e.g. Cascante-Bonilla et al. (2020) <doi:10.48550/arXiv.2001.06001>) and imbalanced data via the creation of synthetic cases (e.g. Bunkhumpornpat et al. (2012) <doi:10.1007/s10489-011-0287-y>). Performance evaluation of AI is connected to measures from content analysis which educational and social researchers are generally more familiar with (e.g. Berding & Pargmann (2022) <doi:10.30819/5581>, Gwet (2014) <ISBN:978-0-9708062-8-4>, Krippendorff (2019) <doi:10.4135/9781071878781>). Estimation of energy consumption and CO2 emissions during model training is done with the 'python' library 'codecarbon'. Finally, all objects created with this package allow trained AI models to be shared with other people.

Maintained by Berding Florian. Last updated 1 month ago.

cpp

21.5 match 4.48 score 8 scripts

beckerbenj

eatGADS: Data Management of Large Hierarchical Data

Import 'SPSS' data, handle and change 'SPSS' meta data, store and access large hierarchical data in 'SQLite' data bases.

Maintained by Benjamin Becker. Last updated 23 days ago.

11.1 match 1 star 7.36 score 34 scripts 1 dependent

usdaforestservice

gdalraster: Bindings to the 'Geospatial Data Abstraction Library' Raster API

Interface to the Raster API of the 'Geospatial Data Abstraction Library' ('GDAL', <https://gdal.org>). Bindings are implemented in an exposed C++ class encapsulating a 'GDALDataset' and its raster band objects, along with several stand-alone functions. These support manual creation of uninitialized datasets, creation from existing raster as template, read/set dataset parameters, low level I/O, color tables, raster attribute tables, virtual raster (VRT), and 'gdalwarp' wrapper for reprojection and mosaicing. Includes 'GDAL' algorithms ('dem_proc()', 'polygonize()', 'rasterize()', etc.), and functions for coordinate transformation and spatial reference systems. Calling signatures resemble the native C, C++ and Python APIs provided by the 'GDAL' project. Includes raster 'calc()' to evaluate a given R expression on a layer or stack of layers, with pixel x/y available as variables in the expression; and raster 'combine()' to identify and count unique pixel combinations across multiple input layers, with optional output of the pixel-level combination IDs. Provides raster display using base 'graphics'. Bindings to a subset of the 'OGR' API are also included for managing vector data sources. Bindings to a subset of the Virtual Systems Interface ('VSI') are also included to support operations on 'GDAL' virtual file systems. These are general utility functions that abstract file system operations on URLs, cloud storage services, 'Zip'/'GZip'/'7z'/'RAR' archives, and in-memory files. 'gdalraster' may be useful in applications that need scalable, low-level I/O, or prefer a direct 'GDAL' API.

Maintained by Chris Toney. Last updated 4 hours ago.

gdal geospatial raster vector cpp

8.1 match 42 stars 9.52 score 32 scripts 3 dependents

tidyverse

purrr: Functional Programming Tools

A complete and consistent functional programming toolkit for R.

Maintained by Hadley Wickham. Last updated 1 month ago.

functional-programming

3.3 match 1.3k stars 22.12 score 59k scripts 6.9k dependents

rohelab

invertiforms: Invertible Transforms for Matrices

Provides composable invertible transforms for (sparse) matrices.

Maintained by Alex Hayes. Last updated 2 years ago.

22.2 match 1 star 3.18 score 4 scripts 1 dependent

cran

binhf: Haar-Fisz Functions for Binomial Data

Binomial Haar-Fisz transforms for Gaussianization as in Nunes and Nason (2009).

Maintained by Matt Nunes. Last updated 7 years ago.

16.1 match 3.85 score 3 dependents

venelin

PCMBase: Simulation and Likelihood Calculation of Phylogenetic Comparative Models

Phylogenetic comparative methods represent models of continuous trait data associated with the tips of a phylogenetic tree. Examples of such models are Gaussian continuous time branching stochastic processes such as Brownian motion (BM) and Ornstein-Uhlenbeck (OU) processes, which regard the data at the tips of the tree as an observed (final) state of a Markov process starting from an initial state at the root and evolving along the branches of the tree. The PCMBase R package provides a general framework for manipulating such models. This framework consists of an application programming interface for specifying data and model parameters, and efficient algorithms for simulating trait evolution under a model and calculating the likelihood of model parameters for an assumed model and trait data. The package implements a growing collection of models, which currently includes BM, OU, BM/OU with jumps, two-speed OU as well as mixed Gaussian models, in which different types of the above models can be associated with different branches of the tree. The PCMBase package is limited to trait-simulation and likelihood calculation of (mixed) Gaussian phylogenetic models. The PCMFit package provides functionality for inference of these models to tree and trait data. The package web-site <https://venelin.github.io/PCMBase/> provides access to the documentation and other resources.

Maintained by Venelin Mitov. Last updated 10 months ago.

8.0 match 6 stars 7.56 score 85 scripts 3 dependents