franc:Detect the Language of Text
Detects the language of text with no external dependencies, supporting 335 languages: all languages spoken by more than one million speakers. 'Franc' is a port of the 'JavaScript' project of the same name, see <>.
Maintained by Gábor Csárdi. Last updated 3 years ago.
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 14 days ago.
Guerry:Maps, Data and Methods Related to Guerry (1833) "Moral Statistics of France"
Contains maps of France in 1830 and multivariate datasets from A.-M. Guerry and others. Statistical and graphic methods related to Guerry's "Moral Statistics of France" are used to understand Guerry's data and illustrate methods. The goal is to facilitate the exploration and development of statistical and graphic methods for multivariate data in a geospatial context of historical interest.
Maintained by Michael Friendly. Last updated 2 months ago.
charlatan:Make Fake Data
Make fake data that looks realistic, supporting addresses, person names, dates, times, colors, coordinates, currencies, digital object identifiers ('DOIs'), jobs, phone numbers, 'DNA' sequences, doubles and integers from distributions and within a range.
Maintained by Roel M. Hogervorst. Last updated 1 month ago.
VSURF:Variable Selection Using Random Forests
Three-step variable selection procedure based on random forests. Initially developed to handle high-dimensional data (for which the number of variables greatly exceeds the number of observations), the package is very versatile and can treat most dimensions of data, for regression and supervised classification problems. The first step eliminates irrelevant variables from the dataset. The second step aims to select all variables related to the response, for interpretation purposes. The third step refines the selection by eliminating redundancy in the set of variables selected by the second step, for prediction purposes. Genuer, R., Poggi, J.-M. and Tuleau-Malot, C. (2015) <>.
Maintained by Robin Genuer. Last updated 8 months ago.
adegenet:Exploratory Analysis of Genetic and Genomic Data
Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure ('genind' class), allele counts by population ('genpop'), and genome-wide SNP data ('genlight'). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
Maintained by Zhian N. Kamvar. Last updated 1 month ago.
stacomiR:Fish Migration Monitoring
Graphical outputs and treatment for a database of fish pass monitoring. It is a part of the 'STACOMI' open source project developed in France by the French Office for Biodiversity institute to centralize data obtained by fish pass monitoring. This version is available in French and English. See <> for more information on 'STACOMI'.
Maintained by Cedric Briand. Last updated 1 year ago.
gnm:Generalized Nonlinear Models
Functions to specify and fit generalized nonlinear models, including models with multiplicative interaction terms such as the UNIDIFF model from sociology and the AMMI model from crop science, and many others. Over-parameterized representations of models are used throughout; functions are provided for inference on estimable parameter combinations, as well as standard methods for diagnostics etc.
Maintained by Heather Turner. Last updated 2 years ago.
maps:Draw Geographical Maps
Display of maps. Projection code and larger maps are in separate packages ('mapproj' and 'mapdata').
Maintained by Alex Deckmyn. Last updated 2 months ago.
adehabitatLT:Analysis of Animal Movements
A collection of tools for the analysis of animal movements.
Maintained by Clement Calenge. Last updated 6 months ago.
FactoMineR:Multivariate Exploratory Data Analysis and Data Mining
Exploratory data analysis methods to summarize, visualize and describe datasets. The main principal component methods are available, those with the largest potential in terms of applications: principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, Multiple Factor Analysis when variables are structured in groups, etc. and hierarchical cluster analysis. F. Husson, S. Le and J. Pages (2017).
Maintained by Francois Husson. Last updated 3 months ago.
historydata:Datasets for Historians
These sample data sets are intended for historians learning R. They include population, institutional, religious, military, and prosopographical data suitable for mapping, quantitative analysis, and network analysis.
Maintained by Lincoln Mullen. Last updated 7 months ago.
spData:Datasets for Spatial Analysis
Diverse spatial datasets for demonstrating, benchmarking and teaching spatial data analysis. It includes R data of class sf (defined by the package 'sf'), Spatial ('sp'), and nb ('spdep'). Unlike other spatial data packages such as 'rnaturalearth' and 'maps', it also contains data stored in a range of file formats including GeoJSON and GeoPackage, but from version 2.3.4, no longer ESRI Shapefile - use GeoPackage instead. Some of the datasets are designed to illustrate specific analysis techniques. cycle_hire() and cycle_hire_osm(), for example, are designed to illustrate point pattern analysis techniques.
Maintained by Jakub Nowosad. Last updated 2 months ago.
vchartr:Interactive Charts with the 'JavaScript' 'VChart' Library
Provides an 'htmlwidgets' interface to 'VChart.js'. 'VChart' is not just a cross-platform charting library but also an expressive data storyteller. 'VChart' examples and documentation are available here: <>.
Maintained by Victor Perrier. Last updated 2 months ago.
covidregionaldata:Subnational Data for COVID-19 Epidemiology
An interface to subnational and national level COVID-19 data sourced from both official sources, such as Public Health England in the UK, and from other COVID-19 data collections, including the World Health Organisation (WHO), European Centre for Disease Prevention and Control (ECDC), Johns Hopkins University (JHU), Google Open Data and others. Designed to streamline COVID-19 data extraction, cleaning, and processing from a range of data sources in an open and transparent way. This allows users to inspect and scrutinise the data, and tools used to process it, at every step. For all countries supported, data includes a daily time-series of cases. Wherever available, data is also provided for deaths, hospitalisations, and tests. National level data are also supported using a range of sources.
Maintained by Sam Abbott. Last updated 3 years ago.
agridat:Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 29 days ago.
DemoKin:Estimate Population Kin Distribution
Estimate population kin counts and their distribution by type, age and sex. The package implements one-sex and two-sex frameworks for studying living-death availability, with or without time-varying rates, and multi-stage models.
Maintained by Iván Williams. Last updated 19 days ago.
leaflet.minicharts:Mini Charts for Interactive Maps
Add and modify small charts on an interactive map created with package 'leaflet'. These charts can be used to represent multiple variables at the same time on a single map.
Maintained by Veronique Bachelier. Last updated 4 years ago.
vcdExtra:'vcd' Extensions and Additions
Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.
Maintained by Michael Friendly. Last updated 5 months ago.
sfdep:Spatial Dependence for Simple Features
An interface to 'spdep' to integrate with 'sf' objects and the 'tidyverse'.
Maintained by Dexter Locke. Last updated 6 months ago.
weathermetrics:Functions to Convert Between Weather Metrics
Functions to convert between weather metrics, including conversions for metrics of temperature, air moisture, wind speed, and precipitation. This package also includes functions to calculate the heat index from air temperature and air moisture.
Maintained by Brooke Anderson. Last updated 8 years ago.
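As an illustration of the kind of conversion the 'weathermetrics' entry above describes, the following Python sketch converts between a few temperature and wind-speed units using standard definitions. It is not the package's R interface; all function names here are hypothetical.

```python
def celsius_to_fahrenheit(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def celsius_to_kelvin(temp_c):
    """Convert degrees Celsius to kelvin."""
    return temp_c + 273.15


def knots_to_mps(speed_knots):
    """Convert knots to metres per second (1 knot = 1852 m per hour)."""
    return speed_knots * 1852.0 / 3600.0
```

The heat index mentioned in the description depends on both air temperature and air moisture and is not covered by this sketch.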
COVID19:COVID-19 Data Hub
Unified datasets for a better understanding of COVID-19.
Maintained by Emanuele Guidotti. Last updated 29 days ago.
RivRetrieve:Retrieve Global River Gauge Data
Provides access to global river gauge data from a variety of national-level river agencies. The package interfaces with the national-level agency websites to provide access to river gauge locations, river discharge, and river stage. Currently, the package is available for the following countries: Australia, Brazil, Canada, Chile, France, Japan, South Africa, the United Kingdom, and the United States.
Maintained by Ryan Riggs. Last updated 2 months ago.
EpiInvert:Variational Techniques in Epidemiology
Using variational techniques we address some epidemiological problems, such as the incidence curve decomposition by inverting the renewal equation as described in Alvarez et al. (2021) <doi:10.1073/pnas.2105112118> and Alvarez et al. (2022) <doi:10.3390/biology11040540>, or the estimation of the functional relationship between epidemiological indicators. We also propose a learning method for the short-term forecast of the trend incidence curve as described in Morel et al. (2022) <doi:10.1101/2022.11.05.22281904>.
Maintained by Luis Alvarez. Last updated 1 year ago.
HistData:Data Sets from the History of Statistics and Data Visualization
The 'HistData' package provides a collection of small data sets that are interesting and important in the history of statistics and data visualization. The goal of the package is to make these available, both for instructional use and for historical research. Some of these present interesting challenges for graphics or analysis in R.
Maintained by Michael Friendly. Last updated 10 months ago.
mapping:Automatic Download, Linking, Manipulating Coordinates for Maps
Maps are an important tool for visualising the distribution of variables across different spatial objects. The mapping process requires linking the data with coordinates and then generating the corresponding map. This package provides coordinates, linking and mapping functions for an automatic, flexible and easy approach. Geographical coordinates are provided in the package and automatically linked with the input data to generate maps with internally provided functions or external functions. The package also provides an easy, flexible and automatic approach to potentially download updated coordinates, to link statistical units with coordinates and to aggregate variables based on the spatial hierarchy of units. The object returned from the package can be used for thematic maps with the built-in functions provided in 'mapping' or with other packages already available.
Maintained by Alessio Serafini. Last updated 1 year ago.
smacof:Multidimensional Scaling
Implements the following approaches for multidimensional scaling (MDS) based on stress minimization using majorization (smacof): ratio/interval/ordinal/spline MDS on symmetric dissimilarity matrices, MDS with external constraints on the configuration, individual differences scaling (idioscal, indscal), MDS with spherical restrictions, and ratio/interval/ordinal/spline unfolding (circular restrictions, row-conditional). Various tools and extensions like jackknife MDS, bootstrap MDS, permutation tests, MDS biplots, gravity models, unidimensional scaling, drift vectors (asymmetric MDS), classical scaling, and Procrustes are implemented as well.
Maintained by Patrick Mair. Last updated 5 months ago.
fastText:Efficient Learning of Word Representations and Sentence Classification
An interface to the 'fastText' <> library for efficient learning of word representations and sentence classification. The 'fastText' algorithm is explained in detail in (i) "Enriching Word Vectors with subword Information", Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov, 2017, <doi:10.1162/tacl_a_00051>; (ii) "Bag of Tricks for Efficient Text Classification", Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov, 2017, <doi:10.18653/v1/e17-2068>; (iii) "Compressing text classification models", Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Herve Jegou, Tomas Mikolov, 2016, <arXiv:1612.03651>.
Maintained by Lampros Mouselimis. Last updated 1 year ago.
fakir:Generate Fake Datasets for Prototyping and Teaching
Create fake datasets that can be used for prototyping and teaching. This package provides a set of functions to generate fake data for a variety of data types, such as dates, addresses, and names. It can be used for prototyping (notably in 'shiny') or as a tool to teach data manipulation and data visualization.
Maintained by Colin Fay. Last updated 7 months ago.
Ecdat:Data Sets for Econometrics
Data sets for econometrics, including political science.
Maintained by Spencer Graves. Last updated 4 months ago.
metsyn:Interface with the Meteo France Synop Data API
Provides an interface with the Meteo France Synop data API (see <> for more information). The Meteo France Synop data consist of meteorological data recorded every three hours at 62 French meteorological stations.
Maintained by Paul Poncet. Last updated 6 years ago.
QRM:Provides R-Language Code to Examine Quantitative Risk Management Concepts
Provides functions/methods to accompany the book Quantitative Risk Management: Concepts, Techniques and Tools by Alexander J. McNeil, Ruediger Frey, and Paul Embrechts.
Maintained by Bernhard Pfaff. Last updated 5 years ago.
alboFr:Get French Data on Tiger Mosquito Colonisation
Get French data on tiger mosquito (Aedes albopictus) colonisation in France from the online map at <>.
Maintained by Egor Kotov. Last updated 22 days ago.
jSDM:Joint Species Distribution Models
Fits joint species distribution models ('jSDM') in a hierarchical Bayesian framework (Warton et al. 2015 <doi:10.1016/j.tree.2015.09.007>). The Gibbs sampler is written in 'C++'. It uses 'Rcpp', 'Armadillo' and 'GSL' to maximize computation efficiency.
Maintained by Ghislain Vieilledent. Last updated 2 years ago.
geocmeans:Implementing Methods for Spatial Fuzzy Unsupervised Classification
Provides functions to apply spatial fuzzy unsupervised classification, visualize and interpret results. This method is well suited when the user wants to analyze data with a fuzzy clustering algorithm and to account for the spatial dimension of the dataset. In addition, indexes for estimating the spatial consistency and classification quality are proposed. The methods were originally proposed in the field of brain imagery (see Cai et al. 2007 <doi:10.1016/j.patcog.2006.07.011> and Zhao et al. 2013 <doi:10.1016/j.dsp.2012.09.016>) and recently applied in geography (see Gelb and Apparicio <doi:10.4000/cybergeo.36414>).
Maintained by Jeremy Gelb. Last updated 4 months ago.
FSAdata:Data to Support Fish Stock Assessment ('FSA') Package
The datasets to support the Fish Stock Assessment ('FSA') package.
Maintained by Derek Ogle. Last updated 2 years ago.
lidaRtRee:Forest Analysis with Airborne Laser Scanning (LiDAR) Data
Provides functions for forest object detection, structure metrics computation, model calibration and mapping with airborne laser scanning: co-registration of field plots (Monnet and Mermin (2014) <doi:10.3390/f5092307>); tree detection (method 1 in Eysn et al. (2015) <doi:10.3390/f6051721>) and segmentation; forest parameter estimation with the area-based approach: model calibration with ground reference, and map export (Aussenac et al. (2023) <doi:10.12688/openreseurope.15373.2>); extraction of both physical (gaps, edges, trees) and statistical features useful for e.g. habitat suitability modeling (Glad et al. (2020) <doi:10.1002/rse2.117>) and forest maturity mapping (Fuhr et al. (2022) <doi:10.1002/rse2.274>).
Maintained by Jean-Matthieu Monnet. Last updated 2 months ago.
DepthProc:Statistical Depth Functions for Multivariate Analysis
The data depth concept offers a variety of powerful and user-friendly tools for robust exploration and inference for multivariate data. The offered techniques may be successfully used when knowledge of the parametric models generating the data is lacking. The package consists of, among others, implementations of several data depth techniques involving multivariate quantile-quantile plots, multivariate scatter estimators, multivariate Wilcoxon tests and robust regressions.
Maintained by Zygmunt Zawadzki. Last updated 3 years ago.
starma:Modelling Space Time AutoRegressive Moving Average (STARMA) Processes
Statistical functions to identify, estimate and diagnose a Space-Time AutoRegressive Moving Average (STARMA) model.
Maintained by Felix Cheysson. Last updated 4 years ago.
ads:Spatial Point Patterns Analysis
Perform first- and second-order multi-scale analyses derived from Ripley's K-function (Ripley B. D. (1977) <doi:10.1111/j.2517-6161.1977.tb01615.x>), for univariate, multivariate and marked mapped data in rectangular, circular or irregularly shaped sampling windows, with tests of statistical significance based on Monte Carlo simulations.
Maintained by Dominique Lamonica. Last updated 1 year ago.
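For readers unfamiliar with the K-function named in the 'ads' entry above, here is a minimal Python sketch of the simplest estimator, K(r) = A / (n(n-1)) times the number of ordered point pairs within distance r. It applies no edge correction and is not the 'ads' implementation; the function name `ripley_k` and its arguments are hypothetical, chosen for illustration only.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate for 2D points, without edge correction.

    points: list of (x, y) tuples; radius: distance r; area: window area A.
    Hypothetical helper for illustration, not an API of any package.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    pairs = 0
    # Count ordered pairs (i, j), i != j, lying within `radius` of each other.
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                pairs += 1
    return area * pairs / (n * (n - 1))
```

Under complete spatial randomness the theoretical value is pi * r^2, so estimates above that suggest clustering at scale r and estimates below it suggest regularity. Real analyses need an edge correction, which this sketch omits.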
htsr:Hydro-Meteorology Time-Series
Functions for the management and treatment of hydrology and meteorology time-series stored in a 'SQLite' database.
Maintained by Pierre Chevallier. Last updated 7 months ago.
vigicaen:'VigiBase' Pharmacovigilance Database Toolbox
Perform the analysis of the World Health Organization (WHO) Pharmacovigilance database 'VigiBase' (Extract Case Level version), <> e.g., load data, perform data management, disproportionality analysis, and descriptive statistics. Intended for pharmacovigilance routine use or studies. This package is NOT supported by, nor does it reflect the opinion of, the WHO or the Uppsala Monitoring Centre. Disproportionality methods are described by Norén et al (2013) <doi:10.1177/0962280211403604>.
Maintained by Charles Dolladille. Last updated 4 days ago.
LIM:Linear Inverse Model Examples and Solution Methods
Functions that read and solve linear inverse problems (food web problems, linear programming problems). These problems find solutions to linear or quadratic functions: min or max (f(x)), where f(x) = ||Ax-b||^2 or f(x) = sum(ai*xi) subject to equality constraints Ex=f and inequality constraints Gx>=h.
Maintained by Karline Soetaert. Last updated 1 year ago.
cartogramR:Continuous Cartogram
Procedures for making continuous cartograms. Procedures available are: flow based cartogram (Gastner & Newman (2004) <doi:10.1073/pnas.0400280101>), fast flow based cartogram (Gastner, Seguy & More (2018) <doi:10.1073/pnas.1712674115>), rubber band based cartogram (Dougenik et al. (1985) <doi:10.1111/j.0033-0124.1985.00075.x>).
Maintained by Pierre-Andre Cornillon. Last updated 7 months ago.
mgwrsar:GWR, Mixed GWR and Multiscale GWR with Spatial Autocorrelation
Functions for computing (Mixed and Multiscale) Geographically Weighted Regression with spatial autocorrelation, Geniaux and Martinetti (2017) <doi:10.1016/j.regsciurbeco.2017.04.001>.
Maintained by Ghislain Geniaux. Last updated 26 days ago.
DAIME:Effects of Changing Deposition Rates
Reverse and model the effects of changing deposition rates on geological data and rates. Based on Hohmann (2018) <doi:10.13140/RG.2.2.23372.51841>.
Maintained by Niklas Hohmann. Last updated 5 years ago.
mredgebuildings:Prepare data to be used by the EDGE-Buildings model
Prepare data to be used by the EDGE-Buildings model.
Maintained by Robin Hasse. Last updated 3 days ago.
GmooG:Datasets for the Book 'Getting (more out of) Graphics'
Datasets analysed in the book Antony Unwin (2024, ISBN:978-0367674007) "Getting (more out of) Graphics".
Maintained by Antony Unwin. Last updated 7 months ago.
Rarity:Calculation of Rarity Indices for Species and Assemblages of Species
Allows calculation of rarity weights for species and indices of rarity for assemblages of species according to different methods (Leroy et al. 2012, Insect. Conserv. Divers. 5:159-168 <doi:10.1111/j.1752-4598.2011.00148.x>; Leroy et al. 2013, Divers. Distrib. 19:794-803 <doi:10.1111/ddi.12040>).
Maintained by Boris Leroy. Last updated 2 years ago.
ZeBook:Working with Dynamic Models for Agriculture and Environment
R package accompanying the book Working with dynamic models for agriculture and environment, by Daniel Wallach (INRA), David Makowski (INRA), James W. Jones (U. of Florida), Francois Brun (ACTA). 3rd edition 2018-09-27.
Maintained by Francois Brun. Last updated 6 years ago.
robustarima:Robust ARIMA Modeling
Functions for fitting a linear regression model with ARIMA errors using a filtered tau-estimate. The methodology is described in Maronna et al (2017, ISBN:9781119214687).
Maintained by Stephen Kaluzny. Last updated 6 months ago.
tilegramsR:R Spatial Data for Tilegrams
R spatial objects for Tilegrams. Tilegrams are tiled maps where the region size is proportional to a certain characteristic of the dataset.
Maintained by Bhaskar Karambelkar. Last updated 3 years ago.
ExtremalDep:Extremal Dependence Models
A set of procedures for parametric and non-parametric modelling of the dependence structure of multivariate extreme-values is provided. The statistical inference is performed with non-parametric estimators, likelihood-based estimators and Bayesian techniques. It adapts the methodologies of Beranger and Padoan (2015) <doi:10.48550/arXiv.1508.05561>, Marcon et al. (2016) <doi:10.1214/16-EJS1162>, Marcon et al. (2017) <doi:10.1002/sta4.145>, Marcon et al. (2017) <doi:10.1016/j.jspi.2016.10.004> and Beranger et al. (2021) <doi:10.1007/s10687-019-00364-0>. This package also allows for the modelling of spatial extremes using flexible max-stable processes. It provides simulation algorithms and fitting procedures relying on the Stephenson-Tawn likelihood as per Beranger et al. (2021) <doi:10.1007/s10687-020-00376-1>.
Maintained by Simone Padoan. Last updated 3 months ago.
archdata:Example Datasets from Archaeological Research
The archdata package provides several types of data that are typically used in archaeological research. It provides all of the data sets used in "Quantitative Methods in Archaeology Using R" by David L Carlson, one of the Cambridge Manuals in Archaeology.
Maintained by David L. Carlson. Last updated 4 years ago.
gdpc:Generalized Dynamic Principal Components
Functions to compute the Generalized Dynamic Principal Components introduced in Peña and Yohai (2016) <DOI:10.1080/01621459.2015.1072542>. The implementation includes an automatic procedure proposed in Peña, Smucler and Yohai (2020) <DOI:10.18637/jss.v092.c02> for the identification of both the number of lags to be used in the generalized dynamic principal components as well as the number of components required for a given reconstruction accuracy.
Maintained by Ezequiel Smucler. Last updated 1 year ago.
extremefit:Estimation of Extreme Conditional Quantiles and Probabilities
Extreme value theory, nonparametric kernel estimation, tail conditional probabilities, extreme conditional quantile, adaptive estimation, quantile regression, survival probabilities.
Maintained by Kevin Jaunatre. Last updated 6 years ago.
oceanis:Cartography for Statistical Analysis
Creating maps for statistical analysis such as proportional circles, choropleth, typology and flows. Some functions use 'shiny' or 'leaflet' technologies for dynamism and interactivity. The main features are: create maps in a web environment where the parameters are modifiable on the fly ('shiny' and 'leaflet' technologies); create interactive maps through zoom and pop-up ('leaflet' technology); create frozen maps with the possibility to add labels.
Maintained by Sébastien Novella. Last updated 2 months ago.
espadon:Easy Study of Patient DICOM Data in Oncology
Exploitation, processing and 2D-3D visualization of DICOM-RT files (structures, dosimetry, imagery) for medical physics and clinical research, in a patient-oriented perspective.
Maintained by Cathy Fontbonne. Last updated 1 month ago.
tgver:Turing Geovisualization Engine R package
Turing Geovisualization Engine R package for geospatial visualization and analysis.
Maintained by Layik Hama. Last updated 2 years ago.
GenAlgo:Classes and Methods to Use Genetic Algorithms for Feature Selection
Defines classes and methods that can be used to implement genetic algorithms for feature selection. The idea is that we want to select a fixed number of features to combine into a linear classifier that can predict a binary outcome, and can use a genetic algorithm heuristically to select an optimal set of features.
Maintained by Kevin R. Coombes. Last updated 4 years ago.
Elja:Linear, Logistic and Generalized Linear Models Regressions for the EnvWAS/EWAS Approach
Tool for Environment-Wide Association Studies (EnvWAS / EWAS), which are repeated analyses. It includes three functions: one for linear regression, a second for logistic regression and a third for generalized linear models.
Maintained by Marwan El Homsi. Last updated 2 years ago.
airGRiwrm:'airGR' Integrated Water Resource Management
Semi-distributed Precipitation-Runoff Modeling based on 'airGR' package models integrating human infrastructures and their management.
Maintained by David Dorchies. Last updated 6 months ago.
gmwmx2:Estimate Functional and Stochastic Parameters of Linear Models with Correlated Residuals and Missing Data
Implements the Generalized Method of Wavelet Moments with Exogenous Inputs estimator (GMWMX) presented in Voirol, L., Xu, H., Zhang, Y., Insolia, L., Molinari, R. and Guerrier, S. (2024) <doi:10.48550/arXiv.2409.05160>. The GMWMX estimator allows estimation of functional and stochastic parameters of linear models with correlated residuals in the presence of missing data. The 'gmwmx2' package provides functions to load and plot Global Navigation Satellite System (GNSS) data from the Nevada Geodetic Laboratory and functions to estimate linear models with correlated residuals in the presence of missing data.
Maintained by Lionel Voirol. Last updated 3 days ago.
GeomArchetypal:Finds the Geometrical Archetypal Analysis of a Data Frame
Performs Geometrical Archetypal Analysis after creating Grid Archetypes, which are the Cartesian product of all minimum and maximum variable values. Since the archetypes are now fixed, the convex composition coefficients for all available data points can be computed much faster by using half of the Principal Convex Hull Archetypal method. Additionally, we can decide to keep as archetypes the data points closest to the Grid Archetypes. Finally, the number of archetypes is always 2 to the power of the dimension of the data points, if we consider them as a vector space. Cutler, A., Breiman, L. (1994) <doi:10.1080/00401706.1994.10485840>. Morup, M., Hansen, LK. (2012) <doi:10.1016/j.neucom.2011.06.033>. Christopoulos, DT. (2024) <doi:10.13140/RG.2.2.14030.88642>.
Maintained by Demetris Christopoulos. Last updated 10 months ago.
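The Grid Archetypes described in the 'GeomArchetypal' entry above (the Cartesian product of each variable's minimum and maximum) can be sketched in a few lines of Python. This only illustrates that construction and the resulting 2^d count; it is not the package's code, and the function name `grid_archetypes` is hypothetical.

```python
from itertools import product


def grid_archetypes(rows):
    """Return the corner points spanned by each variable's (min, max).

    rows: list of equal-length numeric tuples, one tuple per data point.
    Hypothetical helper for illustration, not an API of any package.
    """
    columns = list(zip(*rows))                       # one tuple per variable
    bounds = [(min(col), max(col)) for col in columns]
    # Cartesian product of the per-variable (min, max) pairs: 2**d corners.
    return list(product(*bounds))
```

For d variables this yields 2^d points, matching the count stated in the description. If a variable is constant (minimum equals maximum), the product contains repeated corners.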
covid19france:Cases of COVID-19 in France
Imports and cleans 'opencovid19-fr' <> data on COVID-19 in France.
Maintained by Amanda Dobbyn. Last updated 5 years ago.
npde:Normalised Prediction Distribution Errors for Nonlinear Mixed-Effect Models
Provides routines to compute normalised prediction distribution errors, a metric designed to evaluate non-linear mixed effect models such as those used in pharmacokinetics and pharmacodynamics.
Maintained by Emmanuelle Comets. Last updated 1 year ago.
FADA:Variable Selection for Supervised Classification in High Dimension
The functions provided in the FADA (Factor Adjusted Discriminant Analysis) package aim at performing supervised classification of high-dimensional and correlated profiles. The procedure combines a decorrelation step based on a factor modeling of the dependence among covariates and a classification method. The available methods are Lasso regularized logistic model (see Friedman et al. (2010)), sparse linear discriminant analysis (see Clemmensen et al. (2011)), shrinkage linear and diagonal discriminant analysis (see M. Ahdesmaki et al. (2010)). More methods of classification can be used on the decorrelated data provided by the package FADA.
Maintained by David Causeur. Last updated 5 years ago.
GeoFIS:Spatial Data Processing for Decision Making
Methods for processing spatial data for decision-making. This package is an R implementation of methods provided by the open source software GeoFIS <> (Leroux et al. 2018) <doi:10.3390/agriculture8060073>. The main functionalities are the management zone delineation (Pedroso et al. 2010) <doi:10.1016/j.compag.2009.10.007> and data aggregation (Mora-Herrera et al. 2020) <doi:10.1016/j.compag.2020.105624>.
Maintained by Jean-Luc Lablée. Last updated 3 months ago.
archetypal:Finds the Archetypal Analysis of a Data Frame
Performs archetypal analysis by using Principal Convex Hull Analysis under full control of all algorithmic parameters. It contains a set of functions for determining the initial solution, the optimal algorithmic parameters and the optimal number of archetypes. Post-run tools are also available for the assessment of the derived solution. Morup, M., Hansen, LK (2012) <doi:10.1016/j.neucom.2011.06.033>. Hochbaum, DS, Shmoys, DB (1985) <doi:10.1287/moor.10.2.180>. Eddy, WF (1977) <doi:10.1145/355759.355768>. Barber, CB, Dobkin, DP, Huhdanpaa, HT (1996) <doi:10.1145/235815.235821>. Christopoulos, DT (2016) <doi:10.2139/ssrn.3043076>. Falk, A., Becker, A., Dohmen, T., Enke, B., Huffman, D., Sunde, U. (2018), <doi:10.1093/qje/qjy013>. Christopoulos, DT (2015) <doi:10.1016/j.jastp.2015.03.009>. Murari, A., Peluso, E., Cianfrani, Gaudio, F., Lungaroni, M., (2019), <doi:10.3390/e21040394>.
Maintained by Demetris Christopoulos. Last updated 10 months ago.
PFIM:Population Fisher Information Matrix
Evaluate or optimize designs for nonlinear mixed effects models using the Fisher Information matrix. Methods used in the package refer to Mentré F, Mallet A, Baccar D (1997) <doi:10.1093/biomet/84.2.429>, Retout S, Comets E, Samson A, Mentré F (2007) <doi:10.1002/sim.2910>, Bazzoli C, Retout S, Mentré F (2009) <doi:10.1002/sim.3573>, Le Nagard H, Chao L, Tenaillon O (2011) <doi:10.1186/1471-2148-11-326>, Combes FP, Retout S, Frey N, Mentré F (2013) <doi:10.1007/s11095-013-1079-3> and Seurat J, Tang Y, Mentré F, Nguyen TT (2021) <doi:10.1016/j.cmpb.2021.106126>.
Maintained by Romain Leroux. Last updated 5 months ago.
SobolSequence:Sobol Sequences with Better Two-Dimensional Projections
R implementation of S. Joe and F. Y. Kuo (2008) <doi:10.1137/070709359>. The implementation is based on the data file new-joe-kuo-6.21201.
Maintained by Mutsuo Saito. Last updated 8 years ago.
FSelectorRcpp:'Rcpp' Implementation of 'FSelector' Entropy-Based Feature Selection Algorithms with a Sparse Matrix Support
'Rcpp' (free of 'Java'/'Weka') implementation of 'FSelector' entropy-based feature selection algorithms based on an MDL discretization (Fayyad U. M., Irani K. B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference on Uncertainty in Artificial Intelligence (IJCAI93), pages 1022-1029, Chambery, France, 1993), with sparse matrix support.
Maintained by Zygmunt Zawadzki. Last updated 6 months ago.
gssrdoc:Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page.
Maintained by Kieran Healy. Last updated 11 months ago.
FisPro:Fuzzy Inference System Design and Optimization
Fuzzy inference systems are based on fuzzy rules, which are well suited to managing progressive phenomena. This package is a basic implementation of the main functions needed to use a Fuzzy Inference System (FIS), as provided by the open source software 'FisPro'. 'FisPro' allows users to create fuzzy inference systems and to use them for reasoning purposes, especially for simulating a physical or biological system.
Maintained by Jean-Luc Lablée. Last updated 2 years ago.
deltaPlotR:Identification of Dichotomous Differential Item Functioning (DIF) using Angoff's Delta Plot Method
The deltaPlotR package implements Angoff's Delta Plot method to detect dichotomous DIF. Several detection thresholds are included, derived either from a multivariate normality assumption or by prior determination. Item purification is supported (Magis and Facon (2014) <doi:10.18637/jss.v059.c01>).
Maintained by David Magis. Last updated 7 years ago.
pandemics:Monitoring a Developing Pandemic with Available Data
Full dynamic system to describe and forecast the spread and the severity of a developing pandemic, based on available data. These data are number of infections, hospitalizations, deaths and recoveries notified each day. The system consists of three transitions, infection-infection, infection-hospital and hospital-death/recovery. The intensities of these transitions are dynamic and estimated using non-parametric local linear estimators. The package can be used to provide forecasts and survival indicators such as the median time spent in hospital and the probability that a patient who has been in hospital for a number of days can leave it alive. Methods are described in Gámiz, Mammen, Martínez-Miranda, and Nielsen (2024) <doi:10.48550/arXiv.2308.09918> and <doi:10.48550/arXiv.2308.09919>.
Maintained by María Dolores Martínez-Miranda. Last updated 6 months ago.
Rcriticor:Pierre-Goldwin Correlogram
Goldwin-Pierre correlogram. Searches for critical periods in the past. Integrates a time series in a given window.
Maintained by J.S. Pierre. Last updated 7 years ago.
cmfrec:Collective Matrix Factorization for Recommender Systems
Collective matrix factorization (a.k.a. multi-view or multi-way factorization, Singh, Gordon, (2008) <doi:10.1145/1401890.1401969>) tries to approximate a matrix 'X' (potentially very sparse or with many missing values) as the product of two low-dimensional matrices, optionally aided by secondary information matrices about rows and/or columns of 'X', which are also factorized using the same latent components. The intended usage is for recommender systems, dimensionality reduction, and missing value imputation. Implements extensions of the original model (Cortes, (2018) <arXiv:1809.00366>) and can produce different factorizations such as the weighted 'implicit-feedback' model (Hu, Koren, Volinsky, (2008) <doi:10.1109/ICDM.2008.22>), the 'weighted-lambda-regularization' model (Zhou, Wilkinson, Schreiber, Pan, (2008) <doi:10.1007/978-3-540-68880-8_32>), or the enhanced model with 'implicit features' (Rendle, Zhang, Koren, (2019) <arXiv:1905.01395>), with or without side information. Can use gradient-based procedures or alternating least-squares procedures (Koren, Bell, Volinsky, (2009) <doi:10.1109/MC.2009.263>), with either a Cholesky solver, a faster conjugate gradient solver (Takacs, Pilaszy, Tikk, (2011) <doi:10.1145/2043932.2043987>), or a non-negative coordinate descent solver (Franc, Hlavac, Navara, (2005) <doi:10.1007/11556121_50>), providing efficient methods for sparse and dense data, and mixtures thereof. Supports L1 and L2 regularization in the main models, offers alternative most-popular and content-based models, and implements functionality for cold-start recommendations and imputation of 2D data.
Maintained by David Cortes. Last updated 2 months ago.
airGR:Suite of GR Hydrological Models for Precipitation-Runoff Modelling
Hydrological modelling tools developed at INRAE-Antony (HYCAR Research Unit, France). The package includes several conceptual rainfall-runoff models (GR4H, GR5H, GR4J, GR5J, GR6J, GR2M, GR1A) that can be applied in either a lumped or a semi-distributed way. A snow accumulation and melt model (CemaNeige) and the associated functions for the calibration and evaluation of models are also included. Use help(airGR) for package description and references.
Maintained by Olivier Delaigue. Last updated 1 year ago.
extremeIndex:Forecast Verification for Extreme Events
An index measuring the amount of information brought by forecasts for extreme events, subject to calibration, is computed. This index is originally designed for weather or climate forecasts, but it may be used in other forecasting contexts. This is the implementation of the index in Taillardat et al. (2019) <arXiv:1905.04022>.
Maintained by Maxime Taillardat. Last updated 3 years ago.
MIXFIM:Evaluation of the FIM in NLMEMs using MCMC
Evaluation and optimization of the Fisher Information Matrix in Nonlinear Mixed Effect Models using Markov chain Monte Carlo for continuous and discrete data.
Maintained by Marie-Karelle Riviere-Jourdan. Last updated 6 years ago.
bnfimage:'BnF Image API' Client
Provides an R client for the image API of 'Bibliothèque Nationale de France' (BnF, National Library of France).
Maintained by Matthias Grenié. Last updated 4 years ago.
rwebstat:Download Data from the Webstat API
Access the Webstat API, download data and metadata from more than 35000 time series from the Banque de France statistics web portal. Access requires a free client ID easily available from the API portal.
Maintained by Vincent Guegan. Last updated 2 years ago.
DataSetsUni:A Collection of Univariate Data Sets
A collection of widely used univariate data sets from various applied domains, intended for applications of distribution theory. The functions allow researchers and practitioners to quickly, easily, and efficiently access and use these data sets. The data come from the following applied domains: bio-medical, survival analysis, medicine, reliability analysis, hydrology, actuarial science, operational research, meteorology, extreme values, quality control, engineering, finance, sports and economics. All 100 data sets are documented along with associated references for further details and uses.
Maintained by Muhammad Imran. Last updated 2 years ago.
EIEntropy:Ecological Inference Applying Entropy
Implements two estimations related to the foundations of info metrics applied to ecological inference. These methodologies assess the lack of disaggregated data and provide an approach to obtaining disaggregated territorial-level data. For more details, see the following references: Fernández-Vázquez, E., Díaz-Dapena, A., Rubiera-Morollón, F. et al. (2020) "Spatial Disaggregation of Social Indicators: An Info-Metrics Approach." <doi:10.1007/s11205-020-02455-z>. Díaz-Dapena, A., Fernández-Vázquez, E., Rubiera-Morollón, F., & Vinuela, A. (2021) "Mapping poverty at the local level in Europe: A consistent spatial disaggregation of the AROPE indicator for France, Spain, Portugal and the United Kingdom." <doi:10.1111/rsp3.12379>.
Maintained by Silvia María Franco Anaya. Last updated 4 months ago.
NutrienTrackeR:Food Composition Information and Dietary Assessment
Provides a tool set for food information and dietary assessment. It uses food composition data from several reference databases, including: 'USDA' (United States), 'CIQUAL' (France), 'BEDCA' (Spain), 'CNF' (Canada) and 'STFCJ' (Japan). 'NutrienTrackeR' calculates the intake levels for both macronutrients and micronutrients, and compares them with the recommended dietary allowances (RDA). It includes a number of visualization tools, such as time series plots of nutrient intake, and pie charts showing the main foods contributing to the intake level of a given nutrient. A shiny app exposing the main functionalities of the package is also provided.
Maintained by Rafael Ayala. Last updated 2 years ago.
EUfootball:Football Match Data of European Leagues
Contains match results from seven European men's football leagues, namely Premier League (England), Ligue 1 (France), Bundesliga (Germany), Serie A (Italy), Primera Division (Spain), Eredivisie (The Netherlands), Super Lig (Turkey). Includes seasons 2010/2011 until 2019/2020 and a set of interesting covariates. Can be used for all purposes.
Maintained by Hendrik van der Wurp. Last updated 3 years ago.
TaxicabCA:Taxicab Correspondence Analysis
Computation and visualization of Taxicab Correspondence Analysis, Choulakian (2006) <doi:10.1007/s11336-004-1231-4>. Classical correspondence analysis (CA) is a statistical method to analyse 2-dimensional tables of positive numbers and is typically applied to contingency tables (Benzecri, J.-P. (1973). L'Analyse des Donnees. Volume II. L'Analyse des Correspondances. Paris, France: Dunod). Classical CA is based on the Euclidean distance. Taxicab CA is like classical CA but is based on the Taxicab or Manhattan distance. For some tables, Taxicab CA gives more informative results than classical CA.
Maintained by Jacques Allard. Last updated 5 years ago.
