Showing 200 of 2319 results
sparklyr
sparklyr: R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 10 days ago.
apache-spark, distributed, dplyr, ide, livy, machine-learning, remote-clusters, spark, sparklyr
100.9 match 959 stars 15.16 score 4.0k scripts 21 dependents
r-lib
scales: Scale Functions for Visualization
Graphical scales map data to aesthetics, and provide methods for automatically determining breaks and labels for axes and legends.
Maintained by Thomas Lin Pedersen. Last updated 5 months ago.
41.0 match 419 stars 19.88 score 88k scripts 7.9k dependents
sebkrantz
collapse: Advanced and Fast Data Transformation
A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
Maintained by Sebastian Krantz. Last updated 6 days ago.
data-aggregation, data-analysis, data-manipulation, data-processing, data-science, data-transformation, econometrics, high-performance, panel-data, scientific-computing, statistics, time-series, weighted, weights, cpp, openmp
33.6 match 672 stars 16.63 score 708 scripts 97 dependents
davidgohel
flextable: Functions for Tabular Reporting
Use a grammar for creating and customizing pretty tables. The following formats are supported: 'HTML', 'PDF', 'RTF', 'Microsoft Word', 'Microsoft PowerPoint' and R 'Grid Graphics'. 'R Markdown', 'Quarto' and the package 'officer' can be used to produce the result files. The syntax is the same for the user regardless of the type of output to be produced. A set of functions allows the creation, definition of cell arrangement, addition of headers or footers, formatting and definition of cell content with text and/or images. The package also offers a set of high-level functions that allow tabular reporting of statistical models and the creation of complex cross tabulations.
Maintained by David Gohel. Last updated 1 month ago.
docx, html5, ms-office-documents, rmarkdown, table
27.8 match 583 stars 17.04 score 7.3k scripts 119 dependents
tidymodels
recipes: Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 6 days ago.
24.2 match 584 stars 18.71 score 7.2k scripts 380 dependents
insightsengineering
teal.transform: Functions for Extracting and Merging Data in the 'teal' Framework
A standardized user interface for column selection that facilitates dataset merging in the 'teal' framework.
Maintained by Dawid Kaledkowski. Last updated 1 month ago.
51.6 match 3 stars 8.39 score 9 scripts 4 dependents
cran
wavethresh: Wavelets Statistics and Transforms
Performs 1, 2 and 3D real and complex-valued wavelet transforms, nondecimated transforms, wavelet packet transforms, nondecimated wavelet packet transforms, multiple wavelet transforms, complex-valued wavelet transforms, wavelet shrinkage for various kinds of data, locally stationary wavelet time series, nonstationary multiscale transfer function modeling, density estimation.
Maintained by Guy Nason. Last updated 7 months ago.
72.5 match 5.89 score 41 dependents
rvlenth
emmeans: Estimated Marginal Means, aka Least-Squares Means
Obtain estimated marginal means (EMMs) for many linear, generalized linear, and mixed models. Compute contrasts or linear functions of EMMs, trends, and comparisons of slopes. Plots and other displays. Least-squares means are discussed, and the term "estimated marginal means" is suggested, in Searle, Speed, and Milliken (1980) Population marginal means in the linear model: An alternative to least squares means, The American Statistician 34(4), 216-221 <doi:10.1080/00031305.1980.10483031>.
Maintained by Russell V. Lenth. Last updated 3 days ago.
22.2 match 377 stars 19.19 score 13k scripts 187 dependents
bioc
flowCore: flowCore: Basic structures for flow cytometry data
Provides S4 data structures and basic functions to deal with flow cytometry data.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncology, infrastructure, flowcytometry, cellbasedassays, cpp
40.7 match 10.34 score 1.7k scripts 59 dependents
alexkowa
EnvStats: Package for Environmental Statistics, Including US EPA Guidance
Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).
Maintained by Alexander Kowarik. Last updated 17 days ago.
30.8 match 26 stars 12.80 score 2.4k scripts 46 dependents
choonghyunryu
dlookr: Tools for Data Diagnosis, Exploration, Transformation
A collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information and visualization of the descriptive statistics of univariate variables, normality tests and outliers, correlation of two variables, and the relationship between the target variable and predictor. Data transformation supports binning for categorizing continuous variables, imputes missing values and outliers, and resolves skewness. And it creates automated reports that support these three tasks.
Maintained by Choonghyun Ryu. Last updated 9 months ago.
35.7 match 212 stars 11.05 score 748 scripts 2 dependents
gjmvanboxtel
gsignal: Signal Processing
R implementation of the 'Octave' package 'signal', containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.
Maintained by Geert van Boxtel. Last updated 2 months ago.
36.4 match 24 stars 10.03 score 133 scripts 34 dependents
petersonr
bestNormalize: Normalizing Transformation Functions
Estimate a suite of normalizing transformations, including a new adaptation of a technique based on ranks which can guarantee normally distributed transformed data if there are no ties: ordered quantile normalization (ORQ). ORQ normalization combines a rank-mapping approach with a shifted logit approximation that allows the transformation to work on data outside the original domain. It is also able to handle new data within the original domain via linear interpolation. The package is built to estimate the best normalizing transformation for a vector consistently and accurately. It implements the Box-Cox transformation, the Yeo-Johnson transformation, three types of Lambert WxF transformations, and the ordered quantile normalization transformation. It estimates the normalization efficacy of other commonly used transformations, and it allows users to specify custom transformations or normalization statistics. Finally, functionality can be integrated into a machine learning workflow via recipes.
Maintained by Ryan Andrew Peterson. Last updated 1 year ago.
34.5 match 39 stars 10.45 score 510 scripts 5 dependents
r-spatial
sf: Simple Features for R
Support for simple feature access, a standardized way to encode and analyze spatial vector data. Binds to 'GDAL' <doi: 10.5281/zenodo.5884351> for reading and writing data, to 'GEOS' <doi: 10.5281/zenodo.11396894> for geometrical operations, and to 'PROJ' <doi: 10.5281/zenodo.5884394> for projection conversions and datum transformations. Uses by default the 's2' package for geometry operations on geodetic (long/lat degree) coordinates.
Maintained by Edzer Pebesma. Last updated 16 days ago.
14.8 match 1.4k stars 22.42 score 117k scripts 1.2k dependents
tidyverse
glue: Interpreted String Literals
An implementation of interpreted string literals, inspired by Python's Literal String Interpolation <https://www.python.org/dev/peps/pep-0498/> and Docstrings <https://www.python.org/dev/peps/pep-0257/> and Julia's Triple-Quoted String Literals <https://docs.julialang.org/en/v1.3/manual/strings/#Triple-Quoted-String-Literals-1>.
Maintained by Jennifer Bryan. Last updated 5 months ago.
15.2 match 729 stars 21.76 score 57k scripts 14k dependents
utiligize
CNAIM: Common Network Asset Indices Methodology (CNAIM)
Implementation of the CNAIM standard in R. Contains a series of algorithms which determine the probability of failure, consequences of failure and monetary risk associated with electricity distribution companies' assets such as transformers and cables. Results are visualized in an easy-to-understand risk matrix.
Maintained by Mohsin Vindhani. Last updated 3 years ago.
53.0 match 5 stars 6.17 score 85 scripts
business-science
timetk: A Tool Kit for Working with Time Series
Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.
Maintained by Matt Dancho. Last updated 1 year ago.
coercion, coercion-functions, data-mining, dplyr, forecast, forecasting, forecasting-models, machine-learning, series-decomposition, series-signature, tibble, tidy, tidyquant, tidyverse, time, time-series, timeseries
22.4 match 625 stars 14.15 score 4.0k scripts 16 dependents
insightsengineering
teal: Exploratory Web Apps for Analyzing Clinical Trials Data
A 'shiny' based interactive exploration framework for analyzing clinical trials data. 'teal' currently provides a dynamic filtering facility and different data viewers. 'teal' 'shiny' applications are built using standard 'shiny' modules.
Maintained by Dawid Kaledkowski. Last updated 21 days ago.
clinical-trials, nest, shiny, webapp
24.6 match 197 stars 12.68 score 176 scripts 5 dependents
r-forge
car: Companion to Applied Regression
Functions to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage, 2019.
Maintained by John Fox. Last updated 5 months ago.
20.4 match 15.29 score 43k scripts 901 dependents
bquast
transformer: Implementation of Transformer Deep Neural Network with Vignettes
Transformer is a Deep Neural Network Architecture based i.a. on the Attention mechanism (Vaswani et al. (2017) <doi:10.48550/arXiv.1706.03762>).
Maintained by Bastiaan Quast. Last updated 1 year ago.
59.4 match 1 stars 4.73 score 177 scripts 2 dependents
crunch-io
crunch: Crunch.io Data Tools
The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.
Maintained by Greg Freedman Ellis. Last updated 11 days ago.
26.2 match 9 stars 10.53 score 200 scripts 2 dependents
bioc
DESeq2: Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 11 days ago.
sequencing, rnaseq, chipseq, geneexpression, transcription, normalization, differentialexpression, bayesian, regression, principalcomponent, clustering, immunooncology, openblas, cpp
17.1 match 375 stars 16.11 score 17k scripts 115 dependents
oncoray
power.transform: Location and Scale Invariant Power Transformations
Location- and scale-invariant Box-Cox and Yeo-Johnson power transformations allow for transforming variables with distributions distant from 0 to normality. Transformers are implemented as S4 objects. These allow for transforming new instances to normality after optimising fitting parameters on other data. A test for central normality allows for rejecting transformations that fail to produce a suitably normal distribution, independent of sample number.
Maintained by Alex Zwanenburg. Last updated 6 days ago.
67.3 match 3.88 score 1 scripts
poissonconsulting
extras: Helper Functions for Bayesian Analyses
Functions to 'numericise' 'R' objects (coerce to numeric objects), summarise 'MCMC' (Monte Carlo Markov Chain) samples and calculate deviance residuals as well as 'R' translations of some 'BUGS' (Bayesian Using Gibbs Sampling), 'JAGS' (Just Another Gibbs Sampler), 'STAN' and 'TMB' (Template Model Builder) functions.
Maintained by Nicole Hill. Last updated 2 months ago.
29.7 match 9 stars 8.49 score 15 scripts 16 dependents
oscarkjell
text: Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning
Link R with Transformers from Hugging Face to transform text variables to word embeddings; where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions etc. For more information see <https://www.r-text.org>.
Maintained by Oscar Kjell. Last updated 4 days ago.
deep-learning, machine-learning, nlp, transformers, openjdk
19.2 match 146 stars 13.16 score 436 scripts 1 dependents
rstudio
keras3: R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 4 days ago.
18.5 match 845 stars 13.57 score 264 scripts 2 dependents
ajmcneil
tscopula: Time Series Copula Models
Functions for the analysis of time series using copula models. The package is based on methodology described in the following references. McNeil, A.J. (2021) <doi:10.3390/risks9010014>, Bladt, M., & McNeil, A.J. (2021) <doi:10.1016/j.ecosta.2021.07.004>, Bladt, M., & McNeil, A.J. (2022) <doi:10.1515/demo-2022-0105>.
Maintained by Alexander McNeil. Last updated 24 days ago.
44.8 match 2 stars 5.53 score 12 scripts
mlr-org
mlr3pipelines: Preprocessing Operators and Pipelines for 'mlr3'
Dataflow programming toolkit that enriches 'mlr3' with a diverse set of pipelining operators ('PipeOps') that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can therefore be resampled, benchmarked, and tuned.
Maintained by Martin Binder. Last updated 9 days ago.
bagging, data-science, dataflow-programming, ensemble-learning, machine-learning, mlr3, pipelines, preprocessing, stacking
19.7 match 141 stars 12.36 score 448 scripts 7 dependents
bioc
flowWorkspace: Infrastructure for representing and interacting with gated and ungated cytometry data sets.
This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into BioConductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.
Maintained by Greg Finak. Last updated 10 days ago.
immunooncology, flowcytometry, dataimport, preprocessing, datarepresentation, zlib, openblas, cpp
29.2 match 7.89 score 576 scripts 10 dependents
bjw34032
waveslim: Basic Wavelet Routines for One-, Two-, and Three-Dimensional Signal Processing
Basic wavelet routines for time series (1D), image (2D) and array (3D) analysis. The code provided here is based on wavelet methodology developed in Percival and Walden (2000); Gencay, Selcuk and Whitcher (2001); the dual-tree complex wavelet transform (DTCWT) from Kingsbury (1999, 2001) as implemented by Selesnick; and Hilbert wavelet pairs (Selesnick 2001, 2002). All figures in chapters 4-7 of GSW (2001) are reproducible using this package and R code available at the book website(s) below.
Maintained by Brandon Whitcher. Last updated 10 months ago.
28.6 match 3 stars 7.88 score 108 scripts 23 dependents
winvector
cdata: Fluid Data Transformations
Supplies higher-order coordinatized data specification and fluid transform operators that include pivot and anti-pivot as special cases. The methodology is described in 'Zumel', 2018, "Fluid data reshaping with 'cdata'", <https://winvector.github.io/FluidData/FluidDataReshapingWithCdata.html>, <DOI:10.5281/zenodo.1173299>. This package introduces the idea of explicit control table specification of data transforms. Works on in-memory data or on remote data using 'rquery' and 'SQL' database interfaces.
Maintained by John Mount. Last updated 6 months ago.
27.6 match 46 stars 7.82 score 71 scripts 1 dependents
blasbenito
distantia: Advanced Toolset for Efficient Time Series Dissimilarity Analysis
Fast C++ implementation of Dynamic Time Warping for time series dissimilarity analysis, with applications in environmental monitoring and sensor data analysis, climate science, signal processing and pattern recognition, and financial data analysis. Built upon the ideas presented in Benito and Birks (2020) <doi:10.1111/ecog.04895>, provides tools for analyzing time series of varying lengths and structures, including irregular multivariate time series. Key features include individual variable contribution analysis, restricted permutation tests for statistical significance, and imputation of missing data via GAMs. Additionally, the package provides an ample set of tools to prepare and manage time series data.
Maintained by Blas M. Benito. Last updated 25 days ago.
dissimilarity, dynamic-time-warping, lock-step, time-series, cpp
37.2 match 23 stars 5.76 score 11 scripts
dfsp-spirit
freesurferformats: Read and Write 'FreeSurfer' Neuroimaging File Formats
Provides functions to read and write neuroimaging data in various file formats, with a focus on 'FreeSurfer' <http://freesurfer.net/> formats. This includes, but is not limited to, the following file formats: 1) MGH/MGZ format files, which can contain multi-dimensional images or other data. Typically they contain time-series of three-dimensional brain scans acquired by magnetic resonance imaging (MRI). They can also contain vertex-wise measures of surface morphometry data. The MGH format is named after the Massachusetts General Hospital, and the MGZ format is a compressed version of the same format. 2) 'FreeSurfer' morphometry data files in binary 'curv' format. These contain vertex-wise surface measures, i.e., one scalar value for each vertex of a brain surface mesh. These are typically values like the cortical thickness or brain surface area at each vertex. 3) Annotation file format. This contains a brain surface parcellation derived from a cortical atlas. 4) Surface file format. Contains a brain surface mesh, given by a list of vertices and a list of faces.
Maintained by Tim Schäfer. Last updated 6 months ago.
brain, brain-atlas, brain-surfaces, curv, dti, fileformats, freesurfer, label, mesh, mgh, mri, neuroimaging, parcellation, research, surface, voxel
26.2 match 23 stars 8.07 score 25 scripts 8 dependents
jonclayden
RNiftyReg: Image Registration Using the 'NiftyReg' Library
Provides an 'R' interface to the 'NiftyReg' image registration tools <https://github.com/KCL-BMEIS/niftyreg>. Linear and nonlinear registration are supported, in two and three dimensions.
Maintained by Jon Clayden. Last updated 6 months ago.
image-registration, medical-imaging, transformations, cpp, openmp
27.5 match 43 stars 7.64 score 50 scripts 5 dependents
cran
compositions: Compositional Data Analysis
Provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.
Maintained by K. Gerald van den Boogaart. Last updated 1 year ago.
32.6 match 1 stars 6.35 score 36 dependents
r-forge
Rwave: Time-Frequency analysis of 1-D signals
Rwave is a library of R functions which provide an environment for the Time-Frequency analysis of 1-D signals (and especially for the wavelet and Gabor transforms of noisy signals). It was originally written for Splus by Rene Carmona, Bruno Torresani, and Wen L. Hwang, first at the University of California at Irvine and then at Princeton University. Credit should also be given to Andrea Wang whose functions on the dyadic wavelet transform are included. Rwave is based on the book: "Practical Time-Frequency Analysis: Gabor and Wavelet Transforms with an Implementation in S", by Rene Carmona, Wen L. Hwang and Bruno Torresani, Academic Press, 1998. This package is no longer actively maintained. A C++ rewrite of core functionality is in progress. If you'd like to participate, please contact Christian Gunning.
Maintained by Brandon Whitcher. Last updated 13 years ago.
42.0 match 4.82 score 88 scripts 5 dependents
josesamos
rolap: Obtaining Star Databases from Flat Tables
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a ROLAP (Relational On-Line Analytical Processing) star database. The main objective of the package is to allow the definition of these transformations easily. The implementation of the multidimensional database obtained can be exported to work with multidimensional analysis tools on spreadsheets or relational databases.
Maintained by Jose Samos. Last updated 1 year ago.
31.4 match 5 stars 6.12 score 25 scripts 1 dependents
zejiang-unsw
WASP: Wavelet System Prediction
The wavelet-based variance transformation method is used for system modelling and prediction. It refines predictor spectral representation using Wavelet Theory, which leads to improved model specifications and prediction accuracy. Details of methodologies used in the package can be found in Jiang, Z., Sharma, A., & Johnson, F. (2020) <doi:10.1029/2019WR026962>, Jiang, Z., Rashid, M. M., Johnson, F., & Sharma, A. (2020) <doi:10.1016/j.envsoft.2020.104907>, and Jiang, Z., Sharma, A., & Johnson, F. (2021) <doi:10.1016/J.JHYDROL.2021.126816>.
Maintained by Ze Jiang. Last updated 7 months ago.
prediction, transformation, wavelet
28.9 match 9 stars 6.41 score 19 scripts
mlr-org
mlr3torch: Deep Learning with 'mlr3'
Deep Learning library that extends the mlr3 framework by building upon the 'torch' package. It allows users to conveniently build, train, and evaluate deep learning models without having to worry about low-level details. Custom architectures can be created using the graph language defined in 'mlr3pipelines'.
Maintained by Sebastian Fischer. Last updated 1 month ago.
data-science, deep-learning, machine-learning, mlr3, torch
24.3 match 42 stars 7.63 score 78 scripts
palryalen
transform.hazards: Transforms Cumulative Hazards to Parameter Specified by ODE System
Targets parameters that solve Ordinary Differential Equations (ODEs) driven by a vector of cumulative hazard functions. The package provides a method for estimating these parameters using an estimator defined by a corresponding Stochastic Differential Equation (SDE) system driven by cumulative hazard estimates. By providing cumulative hazard estimates as input, the package gives estimates of the parameter as output, along with pointwise (co)variances derived from an asymptotic expression. Examples of parameters that can be targeted in this way include the survival function, the restricted mean survival function, cumulative incidence functions, among others; see Ryalen, Stensrud, and Røysland (2018) <doi:10.1093/biomet/asy035>, and further applications in Stensrud, Røysland, and Ryalen (2019) <doi:10.1111/biom.13102> and Ryalen et al. (2021) <doi:10.1093/biostatistics/kxab009>.
Maintained by Pål Christie Ryalen. Last updated 2 years ago.
44.1 match 3 stars 4.18 score 5 scripts
mlr-org
paradox: Define and Work with Parameter Spaces for Complex Algorithms
Define parameter spaces, constraints and dependencies for arbitrary algorithms, to program on such spaces. Also includes statistical designs and random samplers. Objects are implemented as 'R6' classes.
Maintained by Martin Binder. Last updated 8 months ago.
experimental-design, hyperparameters, mlr3, transformations
15.8 match 29 stars 11.56 score 316 scripts 38 dependents
trevorld
affiner: A Finer Way to Render 3D Illustrated Objects in 'grid' Using Affine Transformations
Dilate, permute, project, reflect, rotate, shear, and translate 2D and 3D points. Supports parallel projections including oblique projections such as the cabinet projection as well as axonometric projections such as the isometric projection. Use 'grid's "affine transformation" feature to render illustrated flat surfaces.
Maintained by Trevor L. Davis. Last updated 3 months ago.
26.1 match 9 stars 6.91 score 1 scripts 5 dependents
eagerai
fastai: Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
audio, collaborative-filtering, darknet, darknet-image-classification, fastai, medical, object-detection, tabular, text, vision
19.1 match 118 stars 9.40 score 76 scripts
mlverse
torchvision: Models, Datasets and Transformations for Images
Provides access to datasets, models and preprocessing facilities for deep learning with images. Integrates seamlessly with the 'torch' package, and its 'API' borrows heavily from the 'PyTorch' vision package.
Maintained by Daniel Falbel. Last updated 6 months ago.
18.3 match 65 stars 9.74 score 313 scripts 6 dependents
ludvigolsen
rearrr: Rearranging Data
Arrange data by a set of methods. Use rearrangers to reorder data points and mutators to change their values. From basic utilities, to centering the greatest value, to swirling in 3-dimensional space, 'rearrr' enables creativity when plotting and experimenting with data.
Maintained by Ludvig Renbo Olsen. Last updated 10 days ago.
arrange, cluster, expand, forming, generate, ggplot2, order, plotting-in-r, roll, rotate, shaping, swirl, transformations
23.2 match 24 stars 7.26 score 128 scripts 8 dependents
kylecaudle
rTensor2: MultiLinear Algebra
A set of tools for basic tensor operators. A tensor in the context of data analysis is a multidimensional array. The tools in this package rely on using any discrete transformation (e.g. Fast Fourier Transform (FFT)). Standard tools included are the Eigenvalue decomposition of a tensor, the QR decomposition and LU decomposition. Other functionality includes the inverse of a tensor and the transpose of a symmetric tensor. Functionality in the package is outlined in Kernfeld et al. (2015) <https://www.sciencedirect.com/science/article/pii/S0024379515004358>.
Maintained by Kyle Caudle. Last updated 12 months ago.
67.3 match 2.48 score 2 scripts 1 dependents
mjskay
ARTool: Aligned Rank Transform
The aligned rank transform for nonparametric factorial ANOVAs as described by Wobbrock, Findlater, Gergle, and Higgins (2011) <doi:10.1145/1978942.1978963>. Also supports aligned rank transform contrasts as described by Elkin, Kay, Higgins, and Wobbrock (2021) <doi:10.1145/3472749.3474784>.
Maintained by Matthew Kay. Last updated 3 years ago.
18.8 match 60 stars 8.64 score 307 scripts
paulnorthrop
rust: Ratio-of-Uniforms Simulation with Transformation
Uses the generalized ratio-of-uniforms (RU) method to simulate from univariate and (low-dimensional) multivariate continuous distributions. The user specifies the log-density, up to an additive constant. The RU algorithm is applied after relocation of the mode of the density to zero, and the user can choose a tuning parameter r. For details see Wakefield, Gelfand and Smith (1991) <DOI:10.1007/BF01889987>, Efficient generation of random variates via the ratio-of-uniforms method, Statistics and Computing (1991) 1, 129-133. A Box-Cox variable transformation can be used to make the input density suitable for the RU method and to improve efficiency. In the multivariate case rotation of axes can also be used to improve efficiency. From version 1.2.0 the 'Rcpp' package <https://cran.r-project.org/package=Rcpp> can be used to improve efficiency.
Maintained by Paul J. Northrop. Last updated 7 months ago.
1977, bayesian-inference, kinderman, monahan, of, posterior-samples, ratio, ratio-of-uniforms, ratio-of-uniforms-method, rcpp, simulation, transformation, uniforms, openblas, cpp
22.0 match 7.13 score 36 scripts 7 dependents
rstudio
tfdatasets: Interface to 'TensorFlow' Datasets
Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/guide> for additional details.
Maintained by Tomasz Kalinowski. Last updated 4 days ago.
16.1 match 34 stars 9.32 score 656 scripts 3 dependents
muschellij2
fslr: Wrapper Functions for 'FSL' ('FMRIB' Software Library) from Functional MRI of the Brain ('FMRIB')
Wrapper functions that interface with 'FSL' <http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/>, a powerful and commonly-used 'neuroimaging' software, using system commands. The goal is to be able to interface with 'FSL' completely in R, where you pass R objects of class 'nifti', implemented by package 'oro.nifti', and the function executes an 'FSL' command and returns an R object of class 'nifti' if desired.
Maintained by John Muschelli. Last updated 1 month ago.
fsl, fslr, neuroimaging, neuroimaging-analysis, neuroimaging-data-science
18.6 match 41 stars 8.01 score 420 scripts
tidyverse
ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics
A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
Maintained by Thomas Lin Pedersen. Last updated 9 days ago.
data-visualisationvisualisation
5.9 match 6.6k stars 25.10 score 645k scripts 7.5k dependentsdkaschek
dMod:Dynamic Modeling and Parameter Estimation in ODE Models
The framework provides functions to generate ODEs of reaction networks, parameter transformations, observation functions, residual functions, etc. The framework follows the paradigm that derivative information should be used for optimization whenever possible. Therefore, all major functions produce and can handle expressions for symbolic derivatives.
Maintained by Daniel Kaschek. Last updated 10 days ago.
17.5 match 20 stars 8.35 score 251 scriptsspatstat
spatstat.geom:Geometrical Functionality of the 'spatstat' Family
Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)
Maintained by Adrian Baddeley. Last updated 2 days ago.
classes-and-objectsdistance-calculationgeometrygeometry-processingimagesmensurationplottingpoint-patternsspatial-dataspatial-data-analysis
11.9 match 7 stars 12.11 score 241 scripts 227 dependentsjuba
obsplot:Create Charts with 'Observable Plot'
Creation of charts with the 'Observable Plot' 'JavaScript' library.
Maintained by Julien Barnier. Last updated 2 years ago.
27.7 match 60 stars 5.08 score 10 scriptsfabnavarro
rwavelet:Wavelet Analysis
Perform wavelet analysis (orthogonal, translation invariant, tensorial, 1-2-3d transforms, thresholding, block thresholding, linear, ...) with applications to data compression, denoising/regression or clustering. The core of the code is a port of the 'MATLAB' Wavelab toolbox written by D. Donoho, A. Maleki and M. Shahram (<https://statweb.stanford.edu/~wavelab/>).
Maintained by Navarro Fabien. Last updated 7 months ago.
machine-learningregressionwavelet
27.8 match 5 stars 5.01 score 41 scriptsdcousin3
superb:Summary Plots with Adjusted Error Bars
Computes standard error and confidence interval of various descriptive statistics under various designs and sampling schemes. The main function, superb(), returns a plot. It can also be used to obtain a dataframe with the statistics and their precision intervals so that other plotting environments (e.g., Excel) can be used. See Cousineau and colleagues (2021) <doi:10.1177/25152459211035109> or Cousineau (2017) <doi:10.5709/acp-0214-z> for a review as well as Cousineau (2005) <doi:10.20982/tqmp.01.1.p042>, Morey (2008) <doi:10.20982/tqmp.04.2.p061>, Baguley (2012) <doi:10.3758/s13428-011-0123-7>, Cousineau & Laurencelle (2016) <doi:10.1037/met0000055>, Cousineau & O'Brien (2014) <doi:10.3758/s13428-013-0441-z>, Calderini & Harding <doi:10.20982/tqmp.15.1.p001> for specific references.
Maintained by Denis Cousineau. Last updated 2 months ago.
error-barsplottingstatisticssummary-plotssummary-statisticsvisualization
14.5 match 19 stars 9.55 score 155 scripts 2 dependentsrickhelmus
patRoon:Workflows for Mass-Spectrometry Based Non-Target Analysis
Provides an easy-to-use interface to a mass spectrometry based non-target analysis workflow. Various (open-source) tools are combined which provide algorithms for extraction and grouping of features, extraction of MS and MS/MS data, automatic formula and compound annotation and grouping related features to components. In addition, various tools are provided for e.g. data preparation and cleanup, plotting results and automatic reporting.
Maintained by Rick Helmus. Last updated 10 days ago.
mass-spectrometrynon-targetcppopenjdk
22.3 match 65 stars 6.22 score 43 scriptsgacarrillor
vec2dtransf:2D Cartesian Coordinate Transformation
Applies affine and similarity transformations on vector spatial data (sp objects). Transformations can be defined from control points or directly from parameters. If redundant control points are provided, least squares is applied, which makes it possible to obtain residuals and RMSE.
Maintained by German Carrillo. Last updated 3 months ago.
2daffineaffine-transformationcoordinatesleast-squaresrmsesimilarity-transformationssp-objectstransformations
34.7 match 5 stars 3.97 score 37 scriptsmomx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses, and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 year ago.
18.3 match 51 stars 7.42 score 346 scriptsasgr
imager:Image Processing Library Based on 'CImg'
Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.
Maintained by Aaron Robotham. Last updated 27 days ago.
9.9 match 17 stars 13.62 score 2.4k scripts 45 dependentsadeverse
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 12 days ago.
8.9 match 39 stars 14.96 score 2.2k scripts 256 dependentsneural-structured-additive-learning
deeptrafo:Fitting Deep Conditional Transformation Models
Allows for the specification of deep conditional transformation models (DCTMs) and ordinal neural network transformation models, as described in Baumann et al (2021) <doi:10.1007/978-3-030-86523-8_1> and Kook et al (2022) <doi:10.1016/j.patcog.2021.108263>. Extensions such as autoregressive DCTMs (Ruegamer et al, 2023, <doi:10.1007/s11222-023-10212-8>) and transformation ensembles (Kook et al, 2022, <doi:10.48550/arXiv.2205.12729>) are implemented. The software package is described in Kook et al (2024, <doi:10.18637/jss.v111.i10>).
Maintained by Lucas Kook. Last updated 2 months ago.
29.4 match 5 stars 4.44 score 11 scriptscanmod
macpan2:Fast and Flexible Compartmental Modelling
Fast and flexible compartmental modelling with Template Model Builder.
Maintained by Steve Walker. Last updated 2 days ago.
compartmental-modelsepidemiologyforecastingmixed-effectsmodel-fittingoptimizationsimulationsimulation-modelingcpp
14.6 match 4 stars 8.89 score 246 scripts 1 dependentsrobjhyndman
forecast:Forecasting Functions for Time Series and Linear Models
Methods and tools for displaying and analysing univariate time series forecasts including exponential smoothing via state space models and automatic ARIMA modelling.
Maintained by Rob Hyndman. Last updated 7 months ago.
forecastforecastingopenblascpp
6.9 match 1.1k stars 18.63 score 16k scripts 239 dependentsrstudio
pointblank:Data Validation and Organization of Metadata for Local and Remote Tables
Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
Maintained by Richard Iannone. Last updated 9 days ago.
data-assertionsdata-checkerdata-dictionariesdata-framesdata-inferencedata-managementdata-profilerdata-qualitydata-validationdata-verificationdatabase-tableseasy-to-understandreporting-toolschema-validationtesting-toolsyaml-configuration
12.2 match 932 stars 10.59 score 284 scriptseasystats
datawizard:Easy Data Wrangling and Statistical Transformations
A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.
Maintained by Etienne Bacher. Last updated 9 days ago.
datadplyrhacktoberfestjanitormanipulationreshapetidyrwrangling
8.8 match 222 stars 14.71 score 436 scripts 119 dependentsealdrich
wavelets:Functions for Computing Wavelet Filters, Wavelet Transforms and Multiresolution Analyses
Contains functions for computing and plotting discrete wavelet transforms (DWT) and maximal overlap discrete wavelet transforms (MODWT), as well as their inverses. Additionally, it contains functionality for computing and plotting wavelet transform filters that are used in the above decompositions as well as multiresolution analyses.
Maintained by Eric Aldrich. Last updated 5 years ago.
25.9 match 4 stars 4.90 score 170 scripts 19 dependentstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
15.4 match 3 stars 8.20 score 7.8k scripts 11 dependentsr-forge
tm:Text Mining Package
A framework for text mining applications within R.
Maintained by Kurt Hornik. Last updated 26 days ago.
9.8 match 12.96 score 14k scripts 101 dependentschiliubio
file2meco:Transform Files to 'microtable' Object with 'microeco' Package
Transform output files of some tools into an object of the 'microtable' class in the 'microeco' package. The 'microtable' class is the basic class in the 'microeco' package and is necessary for the downstream microbial community data analysis.
Maintained by Chi Liu. Last updated 3 months ago.
20.6 match 25 stars 6.12 score 75 scriptsbioc
S4Vectors:Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructuredatarepresentationbioconductor-packagecore-package
7.8 match 18 stars 16.05 score 1.0k scripts 1.9k dependentsmindthegap-erc
admtools:Estimate and Manipulate Age-Depth Models
Estimate age-depth models from stratigraphic and sedimentological data, and transform data between the time and stratigraphic domain.
Maintained by Niklas Hohmann. Last updated 3 months ago.
age-depth-modelgeochronologysedimentologystratigraphy
17.8 match 4 stars 7.01 score 34 scripts 1 dependentsgmgeorg
LambertW:Probabilistic Models to Analyze and Gaussianize Heavy-Tailed, Skewed Data
Lambert W x F distributions are a generalized framework to analyze skewed, heavy-tailed data. It is based on an input/output system, where the output random variable (RV) Y is a non-linearly transformed version of an input RV X ~ F with similar properties as X, but slightly skewed (heavy-tailed). The transformed RV Y has a Lambert W x F distribution. This package contains functions to model and analyze skewed, heavy-tailed data the Lambert Way: simulate random samples, estimate parameters, compute quantiles, and plot/print results nicely. The most useful function is 'Gaussianize', which works similarly to 'scale', but actually makes the data Gaussian. A do-it-yourself toolkit allows users to define their own Lambert W x 'MyFavoriteDistribution' and use it in their analysis right away.
Maintained by Georg M. Goerg. Last updated 1 year ago.
gaussianizegaussianize-dataheavy-tailedheavy-tailed-distributionsleptokurtosisnormal-distributionnormalizationskewed-datastatisticscpp
15.2 match 10 stars 8.17 score 78 scripts 13 dependentskylecaudle
TensorTools:Multilinear Algebra
A set of tools for basic tensor operators. A tensor in the context of data analysis is a multidimensional array. The tools in this package rely on using any discrete transformation (e.g. Fast Fourier Transform (FFT)). Standard tools included are the Eigenvalue decomposition of a tensor, the QR decomposition and LU decomposition. Other functionality includes the inverse of a tensor and the transpose of a symmetric tensor. Functionality in the package is outlined in Kernfeld, E., Kilmer, M., and Aeron, S. (2015) <doi:10.1016/j.laa.2015.07.021>.
Maintained by Kyle Caudle. Last updated 5 months ago.
58.2 match 2.00 scorestatdivlab
corncob:Count Regression for Correlated Observations with the Beta-Binomial
Statistical modeling for correlated count data using the beta-binomial distribution, described in Martin et al. (2020) <doi:10.1214/19-AOAS1283>. It allows for both mean and overdispersion covariates.
Maintained by Amy D Willis. Last updated 6 months ago.
12.0 match 105 stars 9.64 score 248 scripts 1 dependentsbioc
philr:Phylogenetic partitioning based ILR transform for metagenomics data
PhILR is short for Phylogenetic Isometric Log-Ratio Transform. This package provides functions for the analysis of compositional data (e.g., data representing proportions of different variables/parts). Specifically this package allows analysis of compositional data where the parts can be related through a phylogenetic tree (as is common in microbiota survey data) and makes available the Isometric Log Ratio transform built from the phylogenetic tree and utilizing a weighted reference measure.
Maintained by Justin Silverman. Last updated 5 months ago.
immunooncologysequencingmicrobiomemetagenomicssoftware
14.5 match 19 stars 7.99 score 95 scriptscoolbutuseless
insitu:By-Reference Routines for Numeric Vectors
Using by-reference semantics, functions in this package are crafted to modify the input objects in-place and avoid allocating new memory. By avoiding memory allocations (and the associated garbage collection these require), operations performed by-reference can be faster than those performed with R's default copy-on-modify semantics.
Maintained by Mike Cheng. Last updated 23 hours ago.
21.8 match 32 stars 5.28 score 10 scriptstrinker
qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis
Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.
Maintained by Tyler Rinker. Last updated 4 years ago.
qdapquantitative-discourse-analysistext-analysistext-miningtext-plottingopenjdk
11.9 match 176 stars 9.61 score 1.3k scripts 3 dependentsjokergoo
circlize:Circular Visualization
Circular layout is an efficient way for the visualization of huge amounts of information. Here this package provides an implementation of circular layout generation in R as well as an enhancement of available software. The flexibility of the package is based on the usage of low-level graphics functions such that self-defined high-level graphics can be easily implemented by users for specific purposes. Together with the seamless connection between the powerful computational and visual environment in R, it gives users more convenience and freedom to design figures for better understanding complex patterns behind multiple dimensional data. The package is described in Gu et al. 2014 <doi:10.1093/bioinformatics/btu393>.
Maintained by Zuguang Gu. Last updated 1 year ago.
7.3 match 983 stars 15.62 score 10k scripts 213 dependentschemhouse-group
rchemo:Dimension Reduction, Regression and Discrimination for Chemometrics
Data exploration and prediction with focus on high dimensional data and chemometrics. The package was initially designed around partial least squares regression and discrimination models and variants, in particular locally weighted PLS models (LWPLS). Then, it has been expanded to many other methods for analyzing high dimensional data. The name 'rchemo' comes from the fact that the package is orientated to chemometrics, but most of the provided methods are fully generic to other domains. Functions such as transform(), predict(), coef() and summary() are available. Tuning the predictive models is facilitated by generic functions gridscore() (validation dataset) and gridcv() (cross-validation). Faster versions are also available for models based on latent variables (LVs) (gridscorelv() and gridcvlv()) and ridge regularization (gridscorelb() and gridcvlb()).
Maintained by Marion Brandolini-Bunlon. Last updated 6 months ago.
32.0 match 3 stars 3.52 score 11 scriptsalexioannides
pipeliner:Machine Learning Pipelines for R
A framework for defining 'pipelines' of functions for applying data transformations, model estimation and inverse-transformations, resulting in predicted value generation (or model-scoring) functions that automatically apply the entire pipeline of functions required to go from input to predicted output.
Maintained by Alex Ioannides. Last updated 8 years ago.
data-sciencemachine-learningmachine-learning-pipelinespipelinepredictionstatisticstransform-functionsworkflow
18.9 match 67 stars 5.94 score 26 scriptsrstudio
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
12.9 match 54 stars 8.63 score 221 scripts 3 dependentsbioc
IRanges:Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructuredatarepresentationbioconductor-packagecore-package
7.2 match 22 stars 15.09 score 2.1k scripts 1.8k dependentswillgearty
deeptime:Plotting Tools for Anyone Working in Deep Time
Extends the functionality of other plotting packages (notably 'ggplot2') to help facilitate the plotting of data over long time intervals, including, but not limited to, geological, evolutionary, and ecological data. The primary goal of 'deeptime' is to enable users to add highly customizable timescales to their visualizations. Other functions are also included to assist with other areas of deep time visualization.
Maintained by William Gearty. Last updated 3 months ago.
geologyggplot2paleontologyvisualization
10.1 match 92 stars 10.61 score 207 scripts 3 dependentsmsberends
AMR:Antimicrobial Resistance Data Analysis
Functions to simplify and standardise antimicrobial resistance (AMR) data analysis and to work with microbial and antimicrobial properties by using evidence-based methods, as described in <doi:10.18637/jss.v104.i03>.
Maintained by Matthijs S. Berends. Last updated 3 hours ago.
amrantimicrobial-dataepidemiologymicrobiologysoftware
9.0 match 92 stars 11.87 score 182 scripts 6 dependentstidyverse
dplyr:A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 13 days ago.
4.3 match 4.8k stars 24.68 score 659k scripts 7.8k dependentspaleolimbot
wk:Lightweight Well-Known Geometry Parsing
Provides a minimal R and C++ API for parsing well-known binary and well-known text representation of geometries to and from R-native formats. Well-known binary is compact and fast to parse; well-known text is human-readable and is useful for writing tests. These formats are useful in R only if the information they contain can be accessed in R, for which high-performance functions are provided here.
Maintained by Dewey Dunnington. Last updated 5 months ago.
8.1 match 47 stars 12.85 score 89 scripts 1.2k dependentstychelab
CoSMoS:Complete Stochastic Modelling Solution
Makes univariate, multivariate, or random fields simulations precise and simple. Just select the desired time series or random fields’ properties and it will do the rest. CoSMoS is based on the framework described in Papalexiou (2018, <doi:10.1016/j.advwatres.2018.02.013>), extended for random fields in Papalexiou and Serinaldi (2020, <doi:10.1029/2019WR026331>), and further advanced in Papalexiou et al. (2021, <doi:10.1029/2020WR029466>) to allow fine-scale space-time simulation of storms (or even cyclone-mimicking fields).
Maintained by Kevin Shook. Last updated 4 years ago.
14.0 match 11 stars 7.10 score 77 scriptsropensci
pangoling:Access to Large Language Model Predictions
Provides access to word predictability estimates using large language models (LLMs) based on 'transformer' architectures via integration with the 'Hugging Face' ecosystem. The package interfaces with pre-trained neural networks and supports both causal/auto-regressive LLMs (e.g., 'GPT-2'; Radford et al., 2019) and masked/bidirectional LLMs (e.g., 'BERT'; Devlin et al., 2019, <doi:10.48550/arXiv.1810.04805>) to compute the probability of words, phrases, or tokens given their linguistic context. By enabling a straightforward estimation of word predictability, the package facilitates research in psycholinguistics, computational linguistics, and natural language processing (NLP).
Maintained by Bruno Nicenboim. Last updated 5 days ago.
nlppsycholinguisticstransformers
20.3 match 8 stars 4.90 scoregagolews
stringi:Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 1 month ago.
icuicu4cnatural-language-processingnlpregexregexpstring-manipulationstringistringrtexttext-processingtidy-dataunicodecpp
5.4 match 309 stars 18.31 score 10k scripts 8.6k dependentspsychbruce
FMAT:The Fill-Mask Association Test
The Fill-Mask Association Test ('FMAT') <doi:10.1037/pspa0000396> is an integrative and probability-based method using Masked Language Models to measure conceptual associations (e.g., attitudes, biases, stereotypes, social norms, cultural values) as propositions in natural language. Supported language models include 'BERT' <doi:10.48550/arXiv.1810.04805> and its variants available at 'Hugging Face' <https://huggingface.co/models?pipeline_tag=fill-mask>. Methodological references and installation guidance are provided at <https://psychbruce.github.io/FMAT/>.
Maintained by Han-Wu-Shuang Bao. Last updated 5 months ago.
aiartificial-intelligencebertbert-modelbert-modelscontextualized-representationfill-in-the-blankfill-maskhuggingfacelanguage-modellanguage-modelslarge-language-modelsmasked-language-modelsnatural-language-processingnatural-language-understandingnlppretrained-modelstransformertransformers
20.0 match 12 stars 4.82 score 2 scriptsmodeloriented
rSAFE:Surrogate-Assisted Feature Extraction
Provides a model-agnostic tool for a white-box model trained on features extracted from a black-box model. For more information see: Gosiewska et al. (2020) <doi:10.1016/j.dss.2021.113556>.
Maintained by Alicja Gosiewska. Last updated 3 years ago.
feature-engineeringfeature-extractionimlinterpretabilitymachine-learningxai
14.1 match 28 stars 6.79 score 44 scriptsjbkunst
highcharter:A Wrapper for the 'Highcharts' Library
A wrapper for the 'Highcharts' library including shortcut functions to plot R objects. 'Highcharts' <https://www.highcharts.com/> is a charting library offering numerous chart types with a simple configuration syntax.
Maintained by Joshua Kunst. Last updated 1 year ago.
highchartshtmlwidgetsshinyshiny-rvisualizationwrapper
6.8 match 725 stars 13.93 score 4.9k scripts 18 dependentsthomasp85
ggforce:Accelerating 'ggplot2'
The aim of 'ggplot2' is to aid in visual data investigations. This focus has led to a lack of facilities for composing specialised plots. 'ggforce' aims to be a collection of mainly new stats and geoms that fills this gap. All additional functionality is aimed to come through the official extension system so using 'ggforce' should be a stable experience.
Maintained by Thomas Lin Pedersen. Last updated 1 year ago.
ggplot-extensionggplot2visualizationcpp
6.0 match 920 stars 15.83 score 9.3k scripts 293 dependentslbbe-software
fitdistrplus:Help to Fit of a Parametric Distribution to Non-Censored or Censored Data
Extends the fitdistr() function (of the MASS package) with several functions to help the fit of a parametric distribution to non-censored or censored data. Censored data may contain left censored, right censored and interval censored values, with several lower and upper bounds. In addition to maximum likelihood estimation (MLE), the package provides moment matching (MME), quantile matching (QME), maximum goodness-of-fit estimation (MGE) and maximum spacing estimation (MSE) methods (available only for non-censored data). Weighted versions of MLE, MME, QME and MSE are available. See e.g. Casella & Berger (2002), Statistical inference, Pacific Grove, for a general introduction to parametric estimation.
Maintained by Aurélie Siberchicot. Last updated 12 days ago.
5.9 match 54 stars 16.15 score 4.5k scripts 153 dependentsbioc
limma:Linear Models for Microarray and Omics Data
Data analysis, linear models and differential expression for omics data.
Maintained by Gordon Smyth. Last updated 5 days ago.
exonarraygeneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicinggenesetenrichmentdataimportbayesianclusteringregressiontimecoursemicroarraymicrornaarraymrnamicroarrayonechannelproprietaryplatformstwochannelsequencingrnaseqbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrolbiomedicalinformaticscellbiologycheminformaticsepigeneticsfunctionalgenomicsgeneticsimmunooncologymetabolomicsproteomicssystemsbiologytranscriptomics
6.9 match 13.81 score 16k scripts 585 dependentsbioc
GenomicRanges:Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Maintained by Hervé Pagès. Last updated 4 months ago.
geneticsinfrastructuredatarepresentationsequencingannotationgenomeannotationcoveragebioconductor-packagecore-package
5.3 match 44 stars 17.75 score 13k scripts 1.3k dependentsbioc
EBImage:Image processing and analysis toolbox for R
EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.
Maintained by Andrzej Oleś. Last updated 5 months ago.
visualizationbioinformaticsimage-analysisimage-processingcpp
7.3 match 71 stars 12.89 score 1.5k scripts 33 dependentsvanzanden
ggsolvencyii:A 'ggplot2'-Plot of Composition of Solvency II SCR: SF and IM
An implementation of 'ggplot2'-methods to present the composition of Solvency II Solvency Capital Requirement (SCR) as a series of concentric circle-parts. Solvency II (Solvency 2) is European insurance legislation, coming into force by the delegated acts of October 10, 2014. <https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ%3AL%3A2015%3A012%3ATOC>. Additional files, defining the structure of the Standard Formula (SF) method of the SCR-calculation are provided. The structure files can be adopted for localization or for insurance companies who use Internal Models (IM). Options are available for combining smaller components, horizontal and vertical scaling, rotation, and plotting only some circle-parts. With outlines and connectors several SCR-compositions can be compared, for example in ORSA-scenarios (Own Risk and Solvency Assessment).
Maintained by Marco van Zanden. Last updated 6 years ago.
16.7 match 3 stars 5.58 score 63 scripts
davidcsterratt
geometry:Mesh Generation and Surface Tessellation
Makes the 'Qhull' library <http://www.qhull.org> available in R, in a manner similar to Octave and MATLAB. Qhull computes convex hulls, Delaunay triangulations, halfspace intersections about a point, Voronoi diagrams, furthest-site Delaunay triangulations, and furthest-site Voronoi diagrams. It runs in 2D, 3D, 4D, and higher dimensions. It implements the Quickhull algorithm for computing the convex hull. Qhull does not support constrained Delaunay triangulations, or mesh generation of non-convex objects, but the package does include some R functions that allow for this.
Maintained by David C. Sterratt. Last updated 1 month ago.
7.2 match 16 stars 12.98 score 776 scripts 139 dependents
harrelfe
Hmisc:Harrell Miscellaneous
Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.
Maintained by Frank E Harrell Jr. Last updated 7 hours ago.
5.2 match 210 stars 17.61 score 17k scripts 750 dependents
edwinkipruto
mfp2:Multivariable Fractional Polynomial Models with Extensions
Multivariable fractional polynomial algorithm simultaneously selects variables and functional forms in both generalized linear models and Cox proportional hazard models. Key references are Royston and Altman (1994) <doi:10.2307/2986270> and Royston and Sauerbrei (2008, ISBN:978-0-470-02842-1). In addition, it can model a sigmoid relationship between variable x and an outcome variable y using the approximate cumulative distribution transformation proposed by Royston (2014) <doi:10.1177/1536867X1401400206>. This feature distinguishes it from a standard fractional polynomial function, which lacks the ability to achieve such modeling.
Maintained by Edwin Kipruto. Last updated 10 months ago.
17.4 match 3 stars 5.26 score 4 scripts 2 dependents
tidyverts
fable:Forecasting Models for Tidy Time Series
Provides a collection of commonly used univariate and multivariate time series forecasting models including automatically selected exponential smoothing (ETS) and autoregressive integrated moving average (ARIMA) models. These models work within the 'fable' framework provided by the 'fabletools' package, which provides the tools to evaluate, visualise, and combine models in a workflow consistent with the tidyverse.
Maintained by Mitchell OHara-Wild. Last updated 4 months ago.
6.8 match 565 stars 13.52 score 2.1k scripts 6 dependents
rebeccasalles
TSPred:Functions for Benchmarking Time Series Prediction
Functions for defining and conducting a time series prediction process including pre(post)processing, decomposition, modelling, prediction and accuracy assessment. The generated models and their prediction errors can be used for benchmarking other time series prediction methods and for creating a demand for the refinement of such methods. For this purpose, benchmark data from prediction competitions may be used.
Maintained by Rebecca Pontes Salles. Last updated 4 years ago.
benchmarkinglinear-modelsmachine-learningnonstationaritytime-series-forecasttime-series-prediction
16.5 match 24 stars 5.53 score 94 scripts 1 dependents
epiforecasts
scoringutils:Utilities for Scoring and Assessing Predictions
Facilitates the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.
Maintained by Nikos Bosse. Last updated 13 days ago.
forecast-evaluationforecasting
8.0 match 52 stars 11.37 score 326 scripts 7 dependents
swihart
rmutil:Utilities for Nonlinear Regression and Repeated Measurements Models
A toolkit of functions for nonlinear regression and repeated measurements not to be used by itself but called by other Lindsey packages such as 'gnlm', 'stable', 'growth', 'repeated', and 'event' (available at <https://www.commanster.eu/rcode.html>).
Maintained by Bruce Swihart. Last updated 2 years ago.
10.9 match 1 stars 8.35 score 358 scripts 70 dependents
bluefoxr
COINr:Composite Indicator Construction and Analysis
A comprehensive high-level package for composite indicator construction and analysis. It is a "development environment" for composite indicators and scoreboards, which includes utilities for construction (indicator selection, denomination, imputation, data treatment, normalisation, weighting and aggregation) and analysis (multivariate analysis, correlation plotting, short cuts for principal component analysis, global sensitivity analysis, and more). A composite indicator is completely encapsulated inside a single hierarchical list called a "coin". This allows a fast and efficient workflow, as well as making quick copies, testing methodological variations and making comparisons. It also includes many plotting options, both statistical (scatter plots, distribution plots) and for presenting results.
Maintained by William Becker. Last updated 2 months ago.
9.8 match 26 stars 9.07 score 73 scripts 1 dependents
r-forge
copula:Multivariate Dependence with Copulas
Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
Maintained by Martin Maechler. Last updated 11 days ago.
7.5 match 11.83 score 1.2k scripts 86 dependents
jonathanlees
RSEIS:Seismic Time Series Analysis Tools
Multiple interactive codes to view and analyze seismic data, via spectrum analysis, wavelet transforms, particle motion, hodograms. Includes general time-series tools, plotting, filtering, interactive display.
Maintained by Jonathan M. Lees. Last updated 6 months ago.
20.6 match 3 stars 4.27 score 262 scripts 4 dependents
cefet-rj-dal
daltoolbox:Leveraging Experiment Lines to Data Analytics
The natural increase in the complexity of current research experiments and data demands better tools to enhance productivity in Data Analytics. The package is a framework designed to address the modern challenges in data analytics workflows. The package is inspired by Experiment Line concepts. It aims to provide seamless support for users in developing their data mining workflows by offering a uniform data model and method API. It enables the integration of various data mining activities, including data preprocessing, classification, regression, clustering, and time series prediction. It also offers options for hyper-parameter tuning and supports integration with existing libraries and languages. Overall, the package provides researchers with a comprehensive set of functionalities for data science, promoting ease of use, extensibility, and integration with various tools and libraries. Information on Experiment Line is based on Ogasawara et al. (2009) <doi:10.1007/978-3-642-02279-1_20>.
Maintained by Eduardo Ogasawara. Last updated 1 month ago.
13.1 match 1 stars 6.65 score 536 scripts 4 dependents
hypertidy
PROJ:Generic Coordinate System Transformations Using 'PROJ'
A wrapper around the generic coordinate transformation software 'PROJ' that transforms coordinates from one coordinate reference system ('CRS') to another. This includes cartographic projections as well as geodetic transformations. The intention is for this package to be used by user-packages such as 'reproj', and that the older 'PROJ.4' and version 5 pathways be provided by the 'proj4' package.
Maintained by Michael D. Sumner. Last updated 9 months ago.
8.1 match 16 stars 10.53 score 82 scripts 27 dependents
prodriguezsosa
conText:'a la Carte' on Text (ConText) Embedding Regression
A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.
Maintained by Pedro L. Rodriguez. Last updated 11 months ago.
9.1 match 104 stars 9.40 score 1.7k scripts
beckerbenj
eatGADS:Data Management of Large Hierarchical Data
Import 'SPSS' data, handle and change 'SPSS' meta data, store and access large hierarchical data in 'SQLite' data bases.
Maintained by Benjamin Becker. Last updated 23 days ago.
11.1 match 1 stars 7.36 score 34 scripts 1 dependents
rivolli
utiml:Utilities for Multi-Label Learning
Multi-label learning strategies and other procedures to support multi-label classification in R. The package provides a set of multi-label procedures such as sampling methods, transformation strategies, threshold functions, pre-processing techniques and evaluation metrics. A complete overview of the matter can be seen in Zhang, M. and Zhou, Z. (2014) <doi:10.1109/TKDE.2013.39> and Gibaja, E. and Ventura, S. (2015) A Tutorial on Multi-label Learning.
Maintained by Adriano Rivolli. Last updated 4 years ago.
12.8 match 28 stars 6.39 score 87 scripts
kvasilopoulos
transx:Transform Univariate Time Series
Univariate time series operations that follow an opinionated design. The main principle of 'transx' is to keep the number of observations the same. Operations that reduce this number have to fill the observations gap.
Maintained by Kostas Vasilopoulos. Last updated 4 years ago.
detrendfiltersoutlierstime-seriestransx
19.0 match 3 stars 4.29 score 13 scripts
mjockers
syuzhet:Extracts Sentiment and Sentiment-Derived Plot Arcs from Text
Extracts sentiment and sentiment-derived plot arcs from text using a variety of sentiment dictionaries conveniently packaged for consumption by R users. Implemented dictionaries include "syuzhet" (default) developed in the Nebraska Literary Lab, "afinn" developed by Finn Årup Nielsen, "bing" developed by Minqing Hu and Bing Liu, and "nrc" developed by Mohammad, Saif M. and Turney, Peter D. Applicable references are available in README.md and in the documentation for the "get_sentiment" function. The package also provides a hack for implementing Stanford's coreNLP sentiment parser. The package provides several methods for plot arc normalization.
Maintained by Matthew Jockers. Last updated 2 years ago.
6.3 match 336 stars 12.92 score 1.4k scripts 31 dependents
svkucheryavski
mdatools:Multivariate Data Analysis for Chemometrics
Projection-based methods for preprocessing, exploration, and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.
Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.
11.0 match 35 stars 7.37 score 220 scripts 1 dependents
eagerai
tfaddons:Interface to 'TensorFlow SIG Addons'
'TensorFlow SIG Addons' <https://www.tensorflow.org/addons> is a repository of community contributions that conform to well-established API patterns, but implement new functionality not available in core 'TensorFlow'. 'TensorFlow' natively supports a large number of operators, layers, metrics, losses, optimizers, and more. However, in a fast moving field like Machine Learning, there are many interesting new developments that cannot be integrated into core 'TensorFlow' (because their broad applicability is not yet clear, or it is mostly used by a smaller subset of the community).
Maintained by Turgut Abdullayev. Last updated 3 years ago.
deep-learningkerasneural-networkstensorflowtensorflow-addonstfa
15.6 match 20 stars 5.20 score 16 scripts
lrberge
fixest:Fast Fixed-Effects Estimations
Fast and user-friendly estimation of econometric models with multiple fixed-effects. Includes ordinary least squares (OLS), generalized linear models (GLM) and the negative binomial. The core of the package is based on optimized parallel C++ code, scaling especially well for large data sets. The method to obtain the fixed-effects coefficients is based on Berge (2018) <https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf>. Further provides tools to export and view the results of several estimations with intuitive design to cluster the standard-errors.
Maintained by Laurent Berge. Last updated 7 months ago.
5.5 match 387 stars 14.69 score 3.8k scripts 25 dependents
insightsengineering
formatters:ASCII Formatting for Values and Tables
We provide a framework for rendering complex tables to ASCII, and a set of formatters for transforming values or sets of values into ASCII-ready display strings.
Maintained by Joe Zhu. Last updated 2 months ago.
7.9 match 17 stars 10.19 score 22 scripts 20 dependents
facebookexperimental
Robyn:Semi-Automated Marketing Mix Modeling (MMM) from Meta Marketing Science
Semi-Automated Marketing Mix Modeling (MMM) that aims to reduce human bias by means of ridge regression and evolutionary algorithms, enables actionable decision making by providing budget allocation and diminishing returns curves, and allows ground-truth calibration to account for causation.
Maintained by Gufeng Zhou. Last updated 19 days ago.
adstockingbudget-allocationcost-response-curveeconometricsevolutionary-algorithmgradient-based-optimisationhyperparameter-optimizationmarketing-mix-modelingmarketing-mix-modellingmarketing-sciencemmmridge-regression
7.8 match 1.2k stars 10.32 score 95 scripts
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 29 days ago.
digital-soil-mappingncss-technrcspedologypedometricssoilsoil-surveyusda
6.8 match 55 stars 11.77 score 1.2k scripts 2 dependents
pmartr
pmartR:Panomics Marketplace - Quality Control and Statistical Analysis for Panomics Data
Provides functionality for quality control processing and statistical analysis of mass spectrometry (MS) omics data, in particular proteomic (either at the peptide or the protein level), lipidomic, and metabolomic data, as well as RNA-seq based count data and nuclear magnetic resonance (NMR) data. This includes data transformation, specification of groups that are to be compared against each other, filtering of features and/or samples, data normalization, data summarization (correlation, PCA), and statistical comparisons between defined groups. Implements methods described in: Webb-Robertson et al. (2014) <doi:10.1074/mcp.M113.030932>. Webb-Robertson et al. (2011) <doi:10.1002/pmic.201100078>. Matzke et al. (2011) <doi:10.1093/bioinformatics/btr479>. Matzke et al. (2013) <doi:10.1002/pmic.201200269>. Polpitiya et al. (2008) <doi:10.1093/bioinformatics/btn217>. Webb-Robertson et al. (2010) <doi:10.1021/pr1005247>.
Maintained by Lisa Bramer. Last updated 3 days ago.
data-summarizationlipidsmass-spectrometrymetabolitesmetabolomics-datapeptidesproteinsrna-seq-analysisopenblascpp
10.4 match 40 stars 7.69 score 144 scripts
jeffreyevans
spatialEco:Spatial Analysis and Modelling Utilities
Utilities to support spatial data manipulation, query, sampling and modelling in ecological applications. Functions include models for species population density, spatial smoothing, multivariate separability, point process model for creating pseudo-absences and sub-sampling, Quadrant-based sampling and analysis, auto-logistic modeling, sampling models, cluster optimization, statistical exploratory tools and raster-based metrics.
Maintained by Jeffrey S. Evans. Last updated 13 days ago.
biodiversityconservationecologyr-spatialrasterspatialvector
8.3 match 110 stars 9.55 score 736 scripts 2 dependents
iandryden
shapes:Statistical Shape Analysis
Routines for the statistical analysis of landmark shapes, including Procrustes analysis, graphical displays, principal components analysis, permutation and bootstrap tests, thin-plate spline transformation grids and comparing covariance matrices. See Dryden, I.L. and Mardia, K.V. (2016). Statistical shape analysis, with Applications in R (2nd Edition), John Wiley and Sons.
Maintained by Ian Dryden. Last updated 4 months ago.
9.3 match 7 stars 8.50 score 225 scripts 24 dependents
r-forge
mlt:Most Likely Transformations
Likelihood-based estimation of conditional transformation models via the most likely transformation approach described in Hothorn et al. (2018) <DOI:10.1111/sjos.12291> and Hothorn (2020) <DOI:10.18637/jss.v092.i01>. Shift-scale (Siegfried et al, 2023, <DOI:10.1080/00031305.2023.2203177>) and multivariate (Klein et al, 2022, <DOI:10.1111/sjos.12501>) transformation models are part of this package. A package vignette is available from <DOI:10.32614/CRAN.package.mlt.docreg> and more convenient user interfaces to many models from <DOI:10.32614/CRAN.package.tram>.
Maintained by Torsten Hothorn. Last updated 4 days ago.
10.8 match 7.31 score 41 scripts 10 dependents
ampel-leipzig
zlog:Z(log) Transformation for Laboratory Measurements
Transformation of laboratory measurements into z or z(log)-value based on given or empirical reference limits as proposed in Hoffmann et al. 2017 <doi:10.1515/labmed-2016-0087>.
Maintained by Sebastian Gibb. Last updated 2 years ago.
laboratory-measurementstransformationzlog
19.7 match 2 stars 4.00 score
hwborchers
pracma:Practical Numerical Math Functions
Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.
Maintained by Hans W. Borchers. Last updated 1 year ago.
6.4 match 29 stars 12.34 score 6.6k scripts 931 dependents
guido-s
meta:General Package for Meta-Analysis
User-friendly general package providing standard methods for meta-analysis and supporting Schwarzer, Carpenter, and Rücker <DOI:10.1007/978-3-319-21416-0>, "Meta-Analysis with R" (2015): - common effect and random effects meta-analysis; - several plots (forest, funnel, Galbraith / radial, L'Abbe, Baujat, bubble); - three-level meta-analysis model; - generalised linear mixed model; - logistic regression with penalised likelihood for rare events; - Hartung-Knapp method for random effects model; - Kenward-Roger method for random effects model; - prediction interval; - statistical tests for funnel plot asymmetry; - trim-and-fill method to evaluate bias in meta-analysis; - meta-regression; - cumulative meta-analysis and leave-one-out meta-analysis; - import data from 'RevMan 5'; - produce forest plot summarising several (subgroup) meta-analyses.
Maintained by Guido Schwarzer. Last updated 25 days ago.
5.2 match 84 stars 14.84 score 2.3k scripts 29 dependents
cran
MASS:Support Functions and Datasets for Venables and Ripley's MASS
Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002).
Maintained by Brian Ripley. Last updated 16 days ago.
7.3 match 19 stars 10.53 score 11k dependents
niklashohmann
DAIME:Effects of Changing Deposition Rates
Reverse and model the effects of changing deposition rates on geological data and rates. Based on Hohmann (2018) <doi:10.13140/RG.2.2.23372.51841>.
Maintained by Niklas Hohmann. Last updated 5 years ago.
25.2 match 3.00 score
gadenbuie
epoxy:String Interpolation for Documents, Reports and Apps
Extra strength 'glue' for data-driven templates. String interpolation for 'Shiny' apps or 'R Markdown' and 'knitr'-powered 'Quarto' documents, built on the 'glue' and 'whisker' packages.
Maintained by Garrick Aden-Buie. Last updated 11 months ago.
glueknitrknitr-enginequartormarkdownrmdshinytemplate
9.0 match 218 stars 8.43 score 312 scripts
bioc
clusterExperiment:Compare Clusterings for Single-Cell Sequencing
Provides functionality for running and comparing many different clusterings of single-cell sequencing data or other large mRNA Expression data sets.
Maintained by Elizabeth Purdom. Last updated 5 months ago.
clusteringrnaseqsequencingsoftwaresinglecellcpp
7.8 match 39 stars 9.63 score 192 scripts 1 dependents
jdtuck
fdasrvf:Elastic Functional Data Analysis
Performs alignment, PCA, and modeling of multidimensional and unidimensional functions using the square-root velocity framework (Srivastava et al., 2011 <doi:10.48550/arXiv.1103.3817> and Tucker et al., 2014 <DOI:10.1016/j.csda.2012.12.001>). This framework allows for elastic analysis of functional data through phase and amplitude separation.
Maintained by J. Derek Tucker. Last updated 27 days ago.
9.6 match 11 stars 7.74 score 83 scripts 3 dependents
drjp
nimbleNoBounds:Transformed Distributions for Improved MCMC Efficiency
A collection of common univariate bounded probability distributions transformed to the unbounded real line, for the purpose of increased MCMC efficiency.
Maintained by David Pleydell. Last updated 9 months ago.
20.1 match 1 stars 3.70 score 2 scripts
bioc
MAST:Model-based Analysis of Single Cell Transcriptomics
Methods and models for handling zero-inflated single cell assay data.
Maintained by Andrew McDavid. Last updated 5 months ago.
geneexpressiondifferentialexpressiongenesetenrichmentrnaseqtranscriptomicssinglecell
5.8 match 230 stars 12.75 score 1.8k scripts 5 dependents
cran
ICS:Tools for Exploring Multivariate Data via ICS/ICA
Implementation of Tyler, Critchley, Duembgen and Oja's (JRSS B, 2009, <doi:10.1111/j.1467-9868.2009.00706.x>) and Oja, Sirkia and Eriksson's (AJS, 2006, <https://www.ajs.or.at/index.php/ajs/article/view/vol35,%20no2%263%20-%207>) method of two different scatter matrices to obtain an invariant coordinate system or independent components, depending on the underlying assumptions.
Maintained by Klaus Nordhausen. Last updated 1 year ago.
14.2 match 5.21 score 17 dependents
bxc147
Epi:Statistical Analysis in Epidemiology
Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data. In particular representation, manipulation, rate estimation and simulation for multistate data - the Lexis suite of functions, which includes interfaces to 'mstate', 'etm' and 'cmprsk' packages. Contains functions for Age-Period-Cohort and Lee-Carter modeling and a function for interval censored data and some useful functions for tabulation and plotting, as well as a number of epidemiological data sets.
Maintained by Bendix Carstensen. Last updated 2 months ago.
7.7 match 4 stars 9.65 score 708 scripts 11 dependents
drkowal
SeBR:Semiparametric Bayesian Regression Analysis
Monte Carlo sampling algorithms for semiparametric Bayesian regression analysis. These models feature a nonparametric (unknown) transformation of the data paired with widely-used regression models including linear regression, spline regression, quantile regression, and Gaussian processes. The transformation enables broader applicability of these key models, including for real-valued, positive, and compactly-supported data with challenging distributional features. The samplers prioritize computational scalability and, for most cases, Monte Carlo (not MCMC) sampling for greater efficiency. Details of the methods and algorithms are provided in Kowal and Wu (2024) <doi:10.1080/01621459.2024.2395586>.
Maintained by Dan Kowal. Last updated 6 days ago.
17.2 match 1 stars 4.30 score 3 scripts
ropensci
magick:Advanced Graphics and Image-Processing in R
Bindings to 'ImageMagick': the most comprehensive open-source image processing library available. Supports many common formats (png, jpeg, tiff, pdf, etc) and manipulations (rotate, scale, crop, trim, flip, blur, etc). All operations are vectorized via the Magick++ STL meaning they operate either on a single frame or a series of frames for working with layers, collages, or animation. In RStudio images are automatically previewed when printed to the console, resulting in an interactive editing environment. The latest version of the package includes a native graphics device for creating in-memory graphics or drawing onto images using pixel coordinates.
Maintained by Jeroen Ooms. Last updated 20 days ago.
image-manipulationimage-processingimagemagickcpp
4.3 match 468 stars 17.31 score 9.0k scripts 256 dependents
martin3141
spant:MR Spectroscopy Analysis Tools
Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.
Maintained by Martin Wilson. Last updated 30 days ago.
brainmrimrsmrshubspectroscopyfortran
8.6 match 25 stars 8.52 score 81 scripts
zarquon42b
Morpho:Calculations and Visualisations Related to Geometric Morphometrics
A toolset for Geometric Morphometrics and mesh processing. This includes (among other stuff) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.
Maintained by Stefan Schlager. Last updated 5 months ago.
7.3 match 51 stars 10.00 score 218 scripts 13 dependents
tidyverse
purrr:Functional Programming Tools
A complete and consistent functional programming toolkit for R.
Maintained by Hadley Wickham. Last updated 1 month ago.
3.3 match 1.3k stars 22.12 score 59k scripts 6.9k dependents
r-forge
tramvs:Optimal Subset Selection for Transformation Models
Greedy optimal subset selection for transformation models (Hothorn et al., 2018, <doi:10.1111/sjos.12291>) based on the abess algorithm (Zhu et al., 2020, <doi:10.1073/pnas.2014241117>). Applicable to models from packages 'tram' and 'cotram'. Application to shift-scale transformation models is described in Siegfried et al. (2024, <doi:10.1080/00031305.2023.2203177>).
Maintained by Lucas Kook. Last updated 4 days ago.
17.6 match 4.12 score 5 scripts
talgalili
heatmaply:Interactive Cluster Heat Maps Using 'plotly' and 'ggplot2'
Create interactive cluster 'heatmaps' that can be saved as a stand-alone HTML file, embedded in 'R Markdown' documents or in a 'Shiny' app, and available in the 'RStudio' viewer pane. Hover the mouse pointer over a cell to show details or drag a rectangle to zoom. A 'heatmap' is a popular graphical method for visualizing high-dimensional data, in which a table of numbers is encoded as a grid of colored cells. The rows and columns of the matrix are ordered to highlight patterns and are often accompanied by 'dendrograms'. 'Heatmaps' are used in many fields for visualizing observations, correlations, missing values patterns, and more. Interactive 'heatmaps' allow the inspection of a specific value by hovering the mouse over a cell, as well as zooming into a region of the 'heatmap' by dragging a rectangle around the relevant area. This work is based on the 'ggplot2' and 'plotly.js' engine. It produces similar 'heatmaps' to 'heatmap.2' with the advantage of speed ('plotly.js' is able to handle larger matrices), the ability to zoom from the 'dendrogram' panes, and the placing of factor variables in the sides of the 'heatmap'.
Maintained by Tal Galili. Last updated 8 months ago.
d3-heatmapdendextenddendrogramggplot2heatmapplotly
5.1 match 386 stars 14.21 score 2.0k scripts 45 dependents
bioc
ggcyto:Visualize Cytometry data with ggplot
With the dedicated fortify method implemented for flowSet, ncdfFlowSet and GatingSet classes, both raw and gated flow cytometry data can be plotted directly with ggplot. The ggcyto wrapper and some custom layers also make it easy to add gates and population statistics to the plot.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncologyflowcytometrycellbasedassaysinfrastructurevisualization
6.4 match 58 stars 11.25 score 362 scripts 5 dependents
maarten14c
rice:Radiocarbon Equations
Provides functions for the calibration of radiocarbon dates, as well as options to calculate different radiocarbon realms (C14 age, F14C, pMC, D14C) and estimating the effects of contamination or local reservoir offsets (Reimer and Reimer 2001 <doi:10.1017/S0033822200038339>). The methods follow long-established recommendations such as Stuiver and Polach (1977) <doi:10.1017/S0033822200003672> and Reimer et al. (2004) <doi:10.1017/S0033822200033154>. This package complements the data package 'rintcal'.
Maintained by Maarten Blaauw. Last updated 2 months ago.
11.7 match 1 stars 6.13 score 13 scripts 4 dependents
r-lum
Luminescence:Comprehensive Luminescence Dating Data Analysis
A collection of various R functions for the purpose of Luminescence dating data analysis. This includes, amongst others, data import, export, application of age models, curve deconvolution, sequence analysis and plotting of equivalent dose distributions.
Maintained by Sebastian Kreutzer. Last updated 1 day ago.
bayesian-statisticsdata-sciencegeochronologyluminescenceluminescence-datingopen-scienceoslplottingradiofluorescencetlxsygcpp
6.6 match 15 stars 10.77 score 178 scripts 8 dependents
inlabru-org
inlabru:Bayesian Latent Gaussian Modelling using INLA and Extensions
Facilitates spatial and general latent Gaussian modeling using integrated nested Laplace approximation via the INLA package (<https://www.r-inla.org>). Additionally, extends the GAM-like model class to more general nonlinear predictor expressions, and implements a log Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data. Model components are specified with general inputs and mapping methods to the latent variables, and the predictors are specified via general R expressions, with separate expressions for each observation likelihood model in multi-likelihood models. A prediction method based on fast Monte Carlo sampling allows posterior prediction of general expressions of the latent variables. Ecology-focused introduction in Bachl, Lindgren, Borchers, and Illian (2019) <doi:10.1111/2041-210X.13168>.
Maintained by Finn Lindgren. Last updated 3 days ago.
5.6 match 96 stars 12.62 score 832 scripts 6 dependents
osmandag
AID:Box-Cox Power Transformation
Performs Box-Cox power transformations for different purposes, offers graphical approaches, assesses the success of the transformation via tests and plots, and computes the mean and confidence interval for back-transformed data.
Maintained by Osman Dag. Last updated 26 days ago.
18.1 match 2 stars 3.90 score 66 scripts 1 dependent
eliocamp
metR:Tools for Easier Analysis of Meteorological Fields
Many useful functions and extensions for dealing with meteorological data in the tidy data framework. Extends 'ggplot2' for better plotting of scalar and vector fields and provides commonly used analysis methods in the atmospheric sciences.
Maintained by Elio Campitelli. Last updated 21 days ago.
atmospheric-scienceggplot2visualization
5.8 match 144 stars 12.19 score 1000 scripts 22 dependents
rohelab
invertiforms:Invertible Transforms for Matrices
Provides composable invertible transforms for (sparse) matrices.
Maintained by Alex Hayes. Last updated 2 years ago.
22.2 match 1 star 3.18 score 4 scripts 1 dependent
natverse
nat:NeuroAnatomy Toolbox for Analysis of 3D Image Data
NeuroAnatomy Toolbox (nat) enables analysis and visualisation of 3D biological image data, especially traced neurons. Reads and writes 3D images in NRRD and 'Amira' AmiraMesh formats and reads surfaces in 'Amira' hxsurf format. Traced neurons can be imported from and written to SWC and 'Amira' LineSet and SkeletonGraph formats. These data can then be visualised in 3D via 'rgl', manipulated including applying calculated registrations, e.g. using the 'CMTK' registration suite, and analysed. There is also a simple representation for neurons that have been subjected to 3D skeletonisation but not formally traced; this allows morphological comparison between neurons including searches and clustering (via the 'nat.nblast' extension package).
Maintained by Gregory Jefferis. Last updated 5 months ago.
3dconnectomicsimage-analysisneuroanatomyneuroanatomy-toolboxneuronneuron-morphologyneurosciencevisualisation
7.0 match 67 stars 9.94 score 436 scripts 2 dependents
fabnavarro
gasper:Graph Signal Processing
Provides the standard operations for signal processing on graphs: graph Fourier transform, spectral graph wavelet transform, visualization tools. It also implements a data driven method for graph signal denoising/regression, for details see De Loynes, Navarro, Olivier (2019) <arxiv:1906.01882>. The package also provides an interface to the SuiteSparse Matrix Collection, <https://sparse.tamu.edu/>, a large and widely used set of sparse matrix benchmarks collected from a wide range of applications.
Maintained by Fabien Navarro. Last updated 7 months ago.
data-sciencegraphgraph-signal-processinggraph-waveletmachine-learningspectral-graph-theorystatisticssuitesparsewavelet-transformopenblascpp
17.2 match 8 stars 4.03 score 27 scripts
erhard-lab
grandR:Comprehensive Analysis of Nucleotide Conversion Sequencing Data
Nucleotide conversion sequencing experiments have been developed to add a temporal dimension to RNA-seq and single-cell RNA-seq. Such experiments require specialized tools for primary processing such as GRAND-SLAM (see Jürges et al. <doi:10.1093/bioinformatics/bty256>) and specialized tools for downstream analyses. 'grandR' provides a comprehensive toolbox for quality control, kinetic modeling, differential gene expression analysis and visualization of such data.
Maintained by Florian Erhard. Last updated 1 month ago.
9.8 match 11 stars 7.03 score 18 scripts 1 dependent
bioc
SpatialFeatureExperiment:Integrating SpatialExperiment with Simple Features in sf
A new S4 class integrating Simple Features with the R package sf to bring geospatial data analysis methods based on vector data to spatial transcriptomics. Also implements management of spatial neighborhood graphs and geometric operations. This package builds upon SpatialExperiment and SingleCellExperiment, hence methods for these parent classes can still be used.
Maintained by Lambda Moses. Last updated 1 month ago.
datarepresentationtranscriptomicsspatial
7.2 match 49 stars 9.40 score 322 scripts 1 dependent
juba
questionr:Functions to Make Surveys Processing Easier
Set of functions to make the processing and analysis of surveys easier: interactive shiny apps and addins for data recoding, contingency tables, dataset metadata handling, and several convenience functions.
Maintained by Julien Barnier. Last updated 1 day ago.
5.3 match 83 stars 12.62 score 1.1k scripts 19 dependents
bioc
transformGamPoi:Variance Stabilizing Transformation for Gamma-Poisson Models
Variance-stabilizing transformations help with the analysis of heteroskedastic data (i.e., data where the variance is not constant, like count data). This package provides two types of variance-stabilizing transformations: (1) methods based on the delta method (e.g., 'acosh', 'log(x+1)'), and (2) methods based on model residuals (Pearson and randomized quantile residuals).
Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.
singlecellnormalizationpreprocessingregressioncpp
11.3 match 21 stars 5.95 score 21 scripts
bioc
zFPKM:A suite of functions to facilitate zFPKM transformations
Perform the zFPKM transform on RNA-seq FPKM data. This algorithm is based on the publication by Hart et al., 2013 (PubMed ID 24215113). The reference recommends using zFPKM > -3 to select expressed genes. Validated with ENCODE open/closed chromosome data. Works well for gene-level data using FPKM or TPM. Does not appear to calibrate well for transcript-level data.
Maintained by Ron Ammar. Last updated 5 months ago.
immunooncologyrnaseqfeatureextractionsoftwaregeneexpression
11.3 match 9 stars 5.94 score 16 scripts
fastverse
fastverse:A Suite of High-Performance Packages for Statistics and Data Manipulation
Easy installation, loading and management of high-performance packages for statistical computing and data manipulation in R. The core 'fastverse' consists of 4 packages: 'data.table', 'collapse', 'kit' and 'magrittr', that jointly only depend on 'Rcpp'. The 'fastverse' can be freely and permanently extended with additional packages, either globally or for individual projects. Separate package verses can also be created. Fast packages for many common tasks such as time series, dates and times, strings, spatial data, statistics, data serialization, larger-than-memory processing, and compilation of R code are listed in the README file: <https://github.com/fastverse/fastverse#suggested-extensions>.
Maintained by Sebastian Krantz. Last updated 25 days ago.
ccppdata-aggregationdata-manipulationdata-sciencedata-transformationhigh-performancelow-dependencypanel-datastatistical-computingtime-seriesweights
7.5 match 264 stars 8.90 score 222 scripts
rstudio
gt:Easily Create Presentation-Ready Display Tables
Build display tables from tabular data with an easy-to-use set of functions. With its progressive approach, we can construct display tables with a cohesive set of table parts. Table values can be formatted using any of the included formatting functions. Footnotes and cell styles can be precisely added through a location targeting system. The way in which 'gt' handles things for you means that you don't often have to worry about the fine details.
Maintained by Richard Iannone. Last updated 11 days ago.
docxeasy-to-usehtmllatexrtfsummary-tables
3.6 match 2.1k stars 18.36 score 20k scripts 112 dependents
bioc
TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aim of TCGAbiolinks is to: i) facilitate GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 27 days ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
4.6 match 305 stars 14.45 score 1.6k scripts 6 dependents
bioc
MassSpecWavelet:Peak Detection for Mass Spectrometry data using wavelet-based algorithms
Peak detection in mass spectrometry data is one of the important preprocessing steps. The performance of peak detection affects subsequent processes, including protein identification, profile alignment and biomarker identification. Using the Continuous Wavelet Transform (CWT), this package provides a reliable algorithm for peak detection that does not require any type of smoothing or previous baseline correction method, providing more consistent results for different spectra. See <doi:10.1093/bioinformatics/btl355> for further details.
Maintained by Sergio Oller Moreno. Last updated 3 months ago.
immunooncologymassspectrometryproteomicspeakdetection
7.0 match 9 stars 9.38 score 37 scripts 17 dependents
nago2020
depCensoring:Statistical Methods for Survival Data with Dependent Censoring
Several statistical methods for analyzing survival data under various forms of dependent censoring are implemented in the package. In addition to accounting for dependent censoring, it offers tools to adjust for unmeasured confounding factors. The implemented approaches allow users to estimate the dependency between survival time and dependent censoring time, based solely on observed survival data. For more details on the methods, refer to Deresa and Van Keilegom (2021) <doi:10.1093/biomet/asaa095>, Czado and Van Keilegom (2023) <doi:10.1093/biomet/asac067>, Crommen et al. (2024) <doi:10.1007/s11749-023-00903-9>, Deresa and Van Keilegom (2024) <doi:10.1080/01621459.2022.2161387>, Rutten et al. (2024+) <doi:10.48550/arXiv.2403.11860> and Ding and Van Keilegom (2024).
Maintained by Negera Wakgari Deresa. Last updated 11 days ago.
23.4 match 2.78 score 5 scripts
bioc
microbiome:Microbiome Analytics
Utilities for microbiome analysis.
Maintained by Leo Lahti. Last updated 5 months ago.
metagenomicsmicrobiomesequencingsystemsbiologyhitchiphitchip-atlashuman-microbiomemicrobiologymicrobiome-analysisphyloseqpopulation-study
5.2 match 290 stars 12.50 score 2.0k scripts 5 dependents
stan-dev
cmdstanr:R Interface to 'CmdStan'
A lightweight interface to 'Stan' <https://mc-stan.org>. The 'CmdStanR' interface is an alternative to 'RStan' that calls the command line interface for compilation and running algorithms instead of interfacing with C++ via 'Rcpp'. This has many benefits including always being compatible with the latest version of Stan, fewer installation errors, fewer unexpected crashes in RStudio, and a more permissive license.
Maintained by Andrew Johnson. Last updated 9 months ago.
bayesbayesianmarkov-chain-monte-carlomaximum-likelihoodmcmcstanvariational-inference
5.3 match 145 stars 12.27 score 5.2k scripts 9 dependents
cytomining
cytominer:Methods for Image-Based Cell Profiling
`cytominer` is a suite of common functions used to process high-dimensional readouts from image-based cell profiling experiments.
Maintained by Shantanu Singh. Last updated 2 years ago.
9.4 match 50 stars 6.89 score 44 scripts
paul-buerkner
brms:Bayesian Regression Models using 'Stan'
Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.
Maintained by Paul-Christian Bürkner. Last updated 3 days ago.
bayesian-inferencebrmsmultilevel-modelsstanstatistical-models
3.9 match 1.3k stars 16.61 score 13k scripts 34 dependents
opendendro
dplR:Dendrochronology Program Library in R
Perform tree-ring analyses such as detrending, chronology building, and cross dating. Read and write standard file formats used in dendrochronology.
Maintained by Andy Bunn. Last updated 19 days ago.
5.5 match 39 stars 11.71 score 546 scripts 26 dependents
wviechtb
metafor:Meta-Analysis Package for R
A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.
Maintained by Wolfgang Viechtbauer. Last updated 1 day ago.
meta-analysismixed-effectsmultilevel-modelsmultivariate
3.9 match 246 stars 16.30 score 4.9k scripts 92 dependents
friendly
matlib:Matrix Functions for Teaching and Learning Linear Algebra and Multivariate Statistics
A collection of matrix functions for teaching and learning matrix linear algebra as used in multivariate statistical methods. Many of these functions are designed for tutorial purposes in learning matrix algebra ideas using R. In some cases, functions are provided for concepts available elsewhere in R, but where the function call or name is not obvious. In other cases, functions are provided to show or demonstrate an algorithm. In addition, a collection of functions is provided for drawing vector diagrams in 2D and 3D and for rendering matrix expressions and equations in LaTeX.
Maintained by Michael Friendly. Last updated 2 days ago.
diagramslinear-equationsmatrixmatrix-functionsmatrix-visualizervectorvignette
4.9 match 65 stars 12.89 score 900 scripts 11 dependents
hrbrmstr
ggalt:Extra Coordinate Systems, 'Geoms', Statistical Transformations, Scales and Fonts for 'ggplot2'
A compendium of new geometries, coordinate systems, statistical transformations, scales and fonts for 'ggplot2', including splines, 1d and 2d densities, univariate average shifted histograms, a new map coordinate system based on the 'PROJ.4'-library along with geom_cartogram() that mimics the original functionality of geom_map(), formatters for "bytes", a stat_stepribbon() function, increased 'plotly' compatibility and the 'StateFace' open source font 'ProPublica'. Further new functionality includes lollipop charts, dumbbell charts, the ability to encircle points and coordinate-system-based text annotations.
Maintained by Bob Rudis. Last updated 2 years ago.
geomggplot-extensionggplot2ggplot2-geomggplot2-scales
5.0 match 674 stars 12.59 score 2.3k scripts 7 dependents
pachadotdev
cpp11armadillo:An 'Armadillo' Interface
Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.
Maintained by Mauricio Vargas Sepulveda. Last updated 26 days ago.
armadillocppcpp11hacktoberfestlinear-algebra
6.8 match 9 stars 9.14 score 1 script 16 dependents
winvector
vtreat:A Statistically Sound 'data.frame' Processor/Conditioner
A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). Reference: "'vtreat': a data.frame Processor for Predictive Modeling", Zumel, Mount, 2016, <DOI:10.5281/zenodo.1173313>.
Maintained by John Mount. Last updated 2 months ago.
categorical-variablesmachine-learning-algorithmsnested-modelsprepare-data
5.5 match 285 stars 11.19 score 328 scripts 1 dependent
easystats
insight:Easy Access to Model Information for Various Model Objects
A tool to provide easy, intuitive and consistent access to information contained in various R models, like model formulas, model terms, information about random effects, data that was used to fit the model or data from response variables. 'insight' mainly revolves around two types of functions: functions that find (the names of) information, starting with 'find_', and functions that get the underlying data, starting with 'get_'. The package has a consistent syntax and works with many different model objects, where otherwise functions to access this information are missing.
Maintained by Daniel Lüdecke. Last updated 5 days ago.
easystatshacktoberfestinsightmodelsnamespredictorsrandom
3.6 match 412 stars 17.24 score 568 scripts 210 dependents
cran
binhf:Haar-Fisz Functions for Binomial Data
Binomial Haar-Fisz transforms for Gaussianization as in Nunes and Nason (2009).
Maintained by Matt Nunes. Last updated 7 years ago.
16.1 match 3.85 score 3 dependents
kingaa
pomp:Statistical Inference for Partially Observed Markov Processes
Tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.
Maintained by Aaron A. King. Last updated 1 month ago.
abcb-splinedifferential-equationsdynamical-systemsiterated-filteringlikelihoodlikelihood-freemarkov-chain-monte-carlomarkov-modelmathematical-modellingmeasurement-errorparticle-filtersequential-monte-carlosimulation-based-inferencesobol-sequencestate-spacestatistical-inferencestochastic-processestime-seriesopenblas
5.3 match 115 stars 11.81 score 1.3k scripts 4 dependents
pecanproject
PEcAn.DB:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 2 days ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
5.2 match 216 stars 11.88 score 127 scripts 27 dependents
bioc
SummarizedExperiment:A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Maintained by Hervé Pagès. Last updated 5 months ago.
geneticsinfrastructuresequencingannotationcoveragegenomeannotationbioconductor-packagecore-package
3.6 match 34 stars 16.85 score 8.6k scripts 1.2k dependents
spatstat
spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family
Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
Maintained by Adrian Baddeley. Last updated 1 month ago.
cluster-detectionconfidence-intervalshypothesis-testingk-functionroc-curvesscan-statisticssignificance-testingsimulation-envelopesspatial-analysisspatial-data-analysisspatial-sharpeningspatial-smoothingspatial-statistics
6.0 match 1 star 10.17 score 67 scripts 148 dependents
dwarton
ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)
Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.
Maintained by David Warton. Last updated 1 year ago.
9.2 match 8 stars 6.58 score 53 scripts
r-spatial
stars:Spatiotemporal Arrays, Raster and Vector Data Cubes
Reading, manipulating, writing and plotting spatiotemporal arrays (raster and vector data cubes) in 'R', using 'GDAL' bindings provided by 'sf', and 'NetCDF' bindings by 'ncmeta' and 'RNetCDF'.
Maintained by Edzer Pebesma. Last updated 30 days ago.
3.3 match 571 stars 18.27 score 7.2k scripts 137 dependents
matthewblackwell
Amelia:A Program for Missing Data
A tool that "multiply imputes" missing data in a single cross-section (such as a survey), from a time series (like variables collected for each year in a country), or from a time-series-cross-sectional data set (such as collected by years for each of several countries). Amelia II implements our bootstrapping-based algorithm that gives essentially the same answers as the standard IP or EMis approaches, is usually considerably faster than existing approaches and can handle many more variables. Unlike Amelia I and other statistically rigorous imputation software, it virtually never crashes (but please let us know if you find to the contrary!). The program also generalizes existing approaches by allowing for trends in time series across observations within a cross-sectional unit, as well as priors that allow experts to incorporate beliefs they have about the values of missing cells in their data. Amelia II also includes useful diagnostics of the fit of multiple imputation models. The program works from the R command line or via a graphical user interface that does not require users to know R.
Maintained by Matthew Blackwell. Last updated 4 months ago.
6.6 match 1 star 9.06 score 1.4k scripts 7 dependents
serafinialessio
dformula:Data Manipulation using Formula
A tool for manipulating data using the generic formula. A single formula allows users to easily add, replace and remove variables before running the analysis.
Maintained by Alessio Serafini. Last updated 8 months ago.
16.2 match 3.70 score 1 script
bioc
biovizBase:Basic graphic utilities for visualization of genomic data.
The biovizBase package is designed to provide a set of utilities, color schemes and conventions for genomic data. It serves as the base for various high-level packages for biological data visualization. This saves development effort and encourages consistency.
Maintained by Michael Lawrence. Last updated 5 months ago.
infrastructurevisualizationpreprocessing
7.4 match 8.04 score 273 scripts 75 dependents