Showing 83 of 83 results
lightbluetitan
crimedatasets:A Comprehensive Collection of Crime-Related Datasets
A comprehensive collection of datasets exclusively focused on crimes, criminal activities, and related topics. This package serves as a valuable resource for researchers, analysts, and students interested in crime analysis, criminology, and social and economic studies related to criminal behavior. Datasets span global and local contexts, with a mix of tabular and spatial data.
Maintained by Renzo Caceres Rossi. Last updated 4 months ago.
45.4 match 8 stars 4.90 score 3 scripts
mpjashby
crimedata:Access Crime Data from the Open Crime Database
Gives convenient access to publicly available police-recorded open crime data from large cities in the United States that are included in the Crime Open Database <https://osf.io/zyaqn/>.
Maintained by Matthew Ashby. Last updated 1 year ago.
40.1 match 26 stars 5.51 score 25 scripts
evanodell
ukpolice:Download Data on UK Police and Crime
Downloads data from the 'UK Police' public data API, the full docs of which are available at <https://data.police.uk/docs/>. Includes data on police forces and police force areas, crime reports, and the use of stop-and-search powers.
Maintained by Evan Odell. Last updated 4 years ago.
api-client crime police police-api uk
30.4 match 7 stars 5.16 score 41 scripts
alanarnholt
BSDA:Basic Statistics and Data Analysis
Data sets for book "Basic Statistics and Data Analysis" by Larry J. Kitchens.
Maintained by Alan T. Arnholt. Last updated 2 years ago.
14.1 match 7 stars 9.11 score 1.3k scripts 6 dependents
r-forge
carData:Companion to Applied Regression Data Sets
Datasets to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage (2019).
Maintained by John Fox. Last updated 5 months ago.
7.1 match 12.41 score 944 scripts 919 dependents
jsspaulding
rcrimeanalysis:An Implementation of Crime Analysis Methods
An implementation of functions for the analysis of crime incident or records management system data. The package implements analysis algorithms scaled for city or regional crime analysis units. The package provides functions for kernel density estimation for crime heat maps, geocoding using the 'Google Maps' API, identification of repeat crime incidents, spatio-temporal map comparison across time intervals, time series analysis (forecasting and decomposition), detection of optimal parameters for the identification of near repeat incidents, and near repeat analysis with crime network linkage.
Maintained by Jamie Spaulding. Last updated 2 years ago.
17.3 match 5 stars 4.40 score 5 scripts
kjhealy
gssrdoc:Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.
Maintained by Kieran Healy. Last updated 11 months ago.
30.4 match 2.28 score 38 scripts
cran
mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation
Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.
Maintained by Simon Wood. Last updated 1 year ago.
5.3 match 32 stars 12.71 score 17k scripts 7.8k dependents
dkahle
ggmap:Spatial Visualization with ggplot2
A collection of functions to visualize spatial data and models on top of static maps from various online sources (e.g. Google Maps and Stamen Maps). It includes tools common to those tasks, including functions for geolocation and routing.
Maintained by David Kahle. Last updated 1 year ago.
4.5 match 770 stars 14.17 score 12k scripts 31 dependents
lbb220
GISTools:Further Capabilities in Geographic Information Science
Mapping and spatial data manipulation tools - in particular drawing thematic maps with nice looking legends, and aggregation of point data to polygons.
Maintained by Binbin Lu. Last updated 6 months ago.
15.4 match 4.07 score 584 scripts
lightbluetitan
usdatasets:A Comprehensive Collection of U.S. Datasets
Provides a diverse collection of U.S. datasets encompassing various fields such as crime, economics, education, finance, energy, healthcare, and more. It serves as a valuable resource for researchers and analysts seeking to perform in-depth analyses and derive insights from U.S.-specific data.
Maintained by Renzo Caceres Rossi. Last updated 5 months ago.
8.1 match 7 stars 5.99 score 141 scripts
spatstat
spatstat.data:Datasets for 'spatstat' Family
Contains all the datasets for the 'spatstat' family of packages.
Maintained by Adrian Baddeley. Last updated 18 hours ago.
kernel-density point-process spatial-analysis spatial-data spatial-data-analysis spatstat statistical-analysis statistical-methods statistical-tests statistics
4.0 match 6 stars 11.07 score 186 scripts 228 dependents
schochastics
networkdata:Repository of Network Datasets
The package contains a large collection of network datasets from different contexts. This includes social networks, animal networks and movie networks. All datasets are in 'igraph' format.
Maintained by David Schoch. Last updated 12 months ago.
8.5 match 143 stars 5.01 score 143 scripts
cran
MASS:Support Functions and Datasets for Venables and Ripley's MASS
Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002).
Maintained by Brian Ripley. Last updated 17 days ago.
3.6 match 19 stars 10.53 score 11k dependents
rudeboybert
fivethirtyeight:Data and Code Behind the Stories and Interactives at 'FiveThirtyEight'
Datasets and code published by the data journalism website 'FiveThirtyEight' available at <https://github.com/fivethirtyeight/data>. Note that while we received guidance from editors at 'FiveThirtyEight', this package is not officially published by 'FiveThirtyEight'.
Maintained by Albert Y. Kim. Last updated 2 years ago.
data-science datajournalism fivethirtyeight statistics
3.4 match 453 stars 10.98 score 1.7k scripts
apwheele
ptools:Tools for Poisson Data
Functions used for analyzing count data, mostly crime counts. Includes checking difference in two Poisson counts (e-test), checking the fit for a Poisson distribution, small sample tests for counts in bins, Weighted Displacement Difference test (Wheeler and Ratcliffe, 2018) <doi:10.1186/s40163-018-0085-5>, to evaluate crime changes over time in treated/control areas. Additionally includes functions for aggregating spatial data and spatial feature engineering.
Maintained by Andrew Wheeler. Last updated 1 year ago.
crime-analysis criminal-justice criminology
8.3 match 5 stars 4.44 score 11 scripts
humaniverse
healthyr:R package for mapping UK health data
A package to distribute and summarise UK health data.
Maintained by Mike Page. Last updated 1 month ago.
7.5 match 4 stars 4.86 score 90 scripts
pmair78
smacof:Multidimensional Scaling
Implements the following approaches for multidimensional scaling (MDS) based on stress minimization using majorization (smacof): ratio/interval/ordinal/spline MDS on symmetric dissimilarity matrices, MDS with external constraints on the configuration, individual differences scaling (idioscal, indscal), MDS with spherical restrictions, and ratio/interval/ordinal/spline unfolding (circular restrictions, row-conditional). Various tools and extensions like jackknife MDS, bootstrap MDS, permutation tests, MDS biplots, gravity models, unidimensional scaling, drift vectors (asymmetric MDS), classical scaling, and Procrustes are implemented as well.
Maintained by Patrick Mair. Last updated 5 months ago.
4.5 match 5 stars 7.86 score 152 scripts 24 dependents
cran
RSDA:R to Symbolic Data Analysis
Symbolic Data Analysis (SDA) was proposed by Professor Edwin Diday in 1987. The main purpose of SDA is to substitute the set of rows (cases) in the data table for a concept (second order statistical unit). This package implements, to the symbolic case, certain techniques of automatic classification, as well as some linear models.
Maintained by Oldemar Rodriguez. Last updated 1 year ago.
10.8 match 1 star 3.26 score 3 dependents
rh8liuqy
GUD:Bayesian Modal Regression Based on the GUD Family
Provides probability density functions and sampling algorithms for three key distributions from the General Unimodal Distribution (GUD) family: the Flexible Gumbel (FG) distribution, the Double Two-Piece (DTP) Student-t distribution, and the Two-Piece Scale (TPSC) Student-t distribution. Additionally, this package includes a function for Bayesian linear modal regression, leveraging these three distributions for model fitting. The details of the Bayesian modal regression model based on the GUD family can be found at Liu, Huang, and Bai (2024) <doi:10.1016/j.csda.2024.108012>.
Maintained by Qingyang Liu. Last updated 9 months ago.
7.3 match 5 stars 4.70 score 2 scripts
friendly
ggbiplot:A Grammar of Graphics Implementation of Biplots
A 'ggplot2' based implementation of biplots, giving a representation of a dataset in a two dimensional space accounting for the greatest variance, together with variable vectors showing how the data variables relate to this space. It provides a replacement for stats::biplot(), but with many enhancements to control the analysis and graphical display. It implements biplot and scree plot methods which can be used with the results of prcomp(), princomp(), FactoMineR::PCA(), ade4::dudi.pca() or MASS::lda() and can be customized using 'ggplot2' techniques.
Maintained by Michael Friendly. Last updated 5 months ago.
biplot data-visualization dimension-reduction principal-component-analysis
4.0 match 12 stars 8.15 score 2.4k scripts 1 dependent
sbgraves237
Ecdat:Data Sets for Econometrics
Data sets for econometrics, including political science.
Maintained by Spencer Graves. Last updated 4 months ago.
4.0 match 2 stars 7.25 score 740 scripts 3 dependents
friendly
Guerry:Maps, Data and Methods Related to Guerry (1833) "Moral Statistics of France"
Contains maps of France in 1830 and multivariate datasets from A.-M. Guerry and others. Statistical and graphic methods related to Guerry's "Moral Statistics of France" are used to understand Guerry's data and illustrate methods. The goal is to facilitate the exploration and development of statistical and graphic methods for multivariate data in a geospatial context of historical interest.
Maintained by Michael Friendly. Last updated 2 months ago.
france moral-statistics multivariate-spatial-analysis
6.1 match 1 star 4.72 score 53 scripts
glsnow
TeachingDemos:Demonstrations for Teaching and Learning
Demonstration functions that can be used in a classroom to demonstrate statistical concepts, or on your own to better understand the concepts or the programming.
Maintained by Greg Snow. Last updated 1 year ago.
4.0 match 7.18 score 760 scripts 13 dependents
kzst
nda:Generalized Network-Based Dimensionality Reduction and Analysis
Non-parametric dimensionality reduction function. Reduction with and without feature selection. Plot functions. Automated feature selections. Kosztyan et al. (2024) <doi:10.1016/j.eswa.2023.121779>.
Maintained by Zsolt T. Kosztyan. Last updated 29 days ago.
6.9 match 2 stars 4.11 score 1 script
openintrostat
usdata:Data on the States and Counties of the United States
Demographic data on the United States at the county and state levels spanning multiple years.
Maintained by Mine Çetinkaya-Rundel. Last updated 10 months ago.
4.0 match 9 stars 6.89 score 294 scripts 1 dependent
pauljohn32
rockchalk:Regression Estimation and Presentation
A collection of functions for interpretation and presentation of regression analysis. These functions are used to produce the statistics lectures in <https://pj.freefaculty.org/guides/>. Includes regression diagnostics, regression tables, and plots of interactions and "moderator" variables. The emphasis is on "mean-centered" and "residual-centered" predictors. The vignette 'rockchalk' offers a fairly comprehensive overview. The vignette 'Rstyle' has advice about coding in R. The package title 'rockchalk' refers to our school motto, 'Rock Chalk Jayhawk, Go K.U.'.
Maintained by Paul E. Johnson. Last updated 3 years ago.
3.8 match 7.13 score 584 scripts 18 dependents
mstasinopoulos
gamlss.data:Data for Generalised Additive Models for Location Scale and Shape
Data used as examples in the current two books on Generalised Additive Models for Location Scale and Shape introduced by Rigby and Stasinopoulos (2005), <doi:10.1111/j.1467-9876.2005.00510.x>.
Maintained by Mikis Stasinopoulos. Last updated 1 year ago.
3.8 match 7.04 score 108 scripts 49 dependents
nickch-k
causaldata:Example Data Sets for Causal Inference Textbooks
Example data sets to run the example problems from causal inference textbooks. Currently, contains data sets for Huntington-Klein, Nick (2021) "The Effect" <https://theeffectbook.net>, first and second edition, Cunningham, Scott (2021, ISBN-13: 978-0-300-25168-5) "Causal Inference: The Mixtape", and Hernán, Miguel and James Robins (2020) "Causal Inference: What If" <https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/>.
Maintained by Nick Huntington-Klein. Last updated 4 months ago.
3.5 match 136 stars 7.43 score 144 scripts 1 dependent
jrnold
smss:Datasets for Agresti and Finlay's "Statistical Methods for the Social Sciences"
Datasets used in "Statistical Methods for the Social Sciences" (SMSS) by Alan Agresti and Barbara Finlay.
Maintained by Jeffrey B. Arnold. Last updated 9 years ago.
12.0 match 2.10 score 25 scripts
njtierney
maxcovr:A Set of Tools For Solving The Maximal Covering Location Problem
Solving the "maximal covering location problem" as described by Church can be difficult for users not familiar with linear programming. maxcovr provides functions to make it easy to solve this problem, and tools to calculate facility coverage.
Maintained by Nicholas Tierney. Last updated 4 months ago.
4.0 match 44 stars 6.06 score 43 scripts
kwstat
nipals:Principal Components Analysis using NIPALS or Weighted EMPCA, with Gram-Schmidt Orthogonalization
Principal Components Analysis of a matrix using Non-linear Iterative Partial Least Squares or weighted Expectation Maximization PCA with Gram-Schmidt orthogonalization of the scores and loadings. Optimized for speed. See Andrecut (2009) <doi:10.1089/cmb.2008.0221>.
Maintained by Kevin Wright. Last updated 4 months ago.
3.4 match 7 stars 7.13 score 40 scripts 4 dependents
svmiller
stevedata:Steve's Toy Data for Teaching About a Variety of Methodological, Social, and Political Topics
This is a collection of various kinds of data with broad uses for teaching. My students, and academics like me who teach the same topics I teach, should find this useful if their teaching workflow is also built around the R programming language. The applications are multiple but mostly cluster on topics of statistical methodology, international relations, and political economy.
Maintained by Steve Miller. Last updated 5 days ago.
3.8 match 8 stars 5.97 score 178 scripts
markusloecher
RgoogleMaps:Overlays on Static Maps
Serves two purposes: (i) Provide a comfortable R interface to query the Google server for static maps, and (ii) Use the map as a background image to overlay plots within R. This requires proper coordinate scaling.
Maintained by Markus Loecher. Last updated 1 year ago.
3.8 match 1 star 5.80 score 516 scripts 9 dependents
hanmingwu1103
dataSDA:Data Sets for Symbolic Data Analysis
Collects a diverse range of symbolic data and offers a comprehensive set of functions that facilitate the conversion of traditional data into the symbolic data format.
Maintained by Han-Ming Wu. Last updated 2 years ago.
7.6 match 2.70 score 2 scripts
stan-dev
loo:Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models
Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.
Maintained by Jonah Gabry. Last updated 4 days ago.
bayes bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics cross-validation information-criterion model-comparison stan
1.2 match 152 stars 17.30 score 2.6k scripts 297 dependents
da-wi
cartographr:Crafting Print-Ready Maps and Layered Visualizations
Simplifying the creation of print-ready maps, this package offers a user-friendly interface derived from 'ggplot2' for handling OpenStreetMap data. It streamlines the map-making process, allowing users to focus on the story their maps tell. Transforming raw geospatial data into informative visualizations is made easy with simple features 'sf' geometries. Whether for urban planning, environmental studies, or impactful public presentations, this tool facilitates straightforward and effective map creation. Enhance the dissemination of spatial information with high-quality, narrative-driven visualizations!
Maintained by David Willinger. Last updated 9 months ago.
4.0 match 1 star 4.48 score 2 scripts
jfrench
api2lm:Functions and Data Sets for the Book "A Progressive Introduction to Linear Models"
A lightweight package supporting aspects of linear regression analysis such as simultaneous inference and diagnostics. Additionally, supports "A Progressive Introduction to Linear Models" by Joshua French (<https://jfrench.github.io/LinearRegression/>).
Maintained by Joshua P. French. Last updated 3 months ago.
4.0 match 1 star 4.43 score 27 scripts
jverzani
UsingR:Data Sets, Etc. for the Text "Using R for Introductory Statistics", Second Edition
A collection of data sets to accompany the textbook "Using R for Introductory Statistics," second edition.
Maintained by John Verzani. Last updated 3 years ago.
3.5 match 1 star 4.97 score 1.4k scripts
cran
cluster.datasets:Cluster Analysis Data Sets
A collection of data sets for teaching cluster analysis.
Maintained by Frederick Novomestky. Last updated 11 years ago.
7.5 match 2.00 score
manalytics
stppSim:Spatiotemporal Point Patterns Simulation
Generates artificial point patterns marked by their spatial and temporal signatures. The resulting point cloud may exhibit inherent interactions between both signatures. The simulation integrates microsimulation (Holm, E., (2017)<doi:10.1002/9781118786352.wbieg0320>) and agent-based models (Bonabeau, E., (2002)<doi:10.1073/pnas.082080899>), beginning with the configuration of movement characteristics for the specified agents (referred to as 'walkers') and their interactions within the simulation environment. These interactions (Quaglietta, L. and Porto, M., (2019)<doi:10.1186/s40462-019-0154-8>) result in specific spatiotemporal patterns that can be visualized, analyzed, and used for various analytical purposes. Given the growing scarcity of detailed spatiotemporal data across many domains, this package provides an alternative data source for applications in social and life sciences.
Maintained by Monsuru Adepeju. Last updated 8 months ago.
3.2 match 4 stars 4.60 score 5 scripts
meadhbh-oneill
smoothic:Variable Selection Using a Smooth Information Criterion
Implementation of the SIC epsilon-telescope method, either using single or distributional (multiparameter) regression. Includes classical regression with normally distributed errors and robust regression, where the errors are from the Laplace distribution. The "smooth generalized normal distribution" is used, where the estimation of an additional shape parameter allows the user to move smoothly between both types of regression. See O'Neill and Burke (2022) "Robust Distributional Regression with Automatic Variable Selection" for more details. <arXiv:2212.07317>. This package also contains the data analyses from O'Neill and Burke (2023). "Variable selection using a smooth information criterion for distributional regression models". <doi:10.1007/s11222-023-10204-8>.
Maintained by Meadhbh ONeill. Last updated 2 years ago.
4.0 match 1 star 3.70 score 3 scripts
mariarizzo
RbyExample:Data for the Book "R by Example"
Data for the examples and exercises in the book "R by Example". Jim Albert and Maria Rizzo (2012, ISBN 978-1-4614-1365-3).
Maintained by Maria Rizzo. Last updated 8 months ago.
3.8 match 2 stars 3.94 score 22 scripts
lightbluetitan
educationR:A Comprehensive Collection of Educational Datasets
Provides a comprehensive collection of datasets related to education, covering topics such as student performance, learning methods, test scores, absenteeism, and other educational metrics. This package is designed as a resource for educational researchers, data analysts, and statisticians to explore and analyze data in the field of education.
Maintained by Renzo Caceres Rossi. Last updated 4 months ago.
3.4 match 4 stars 4.30 score 3 scripts
shwasoo
DIFM:Dynamic ICAR Spatiotemporal Factor Models
Bayesian factor models are effective tools for dimension reduction. This is especially applicable to multivariate large-scale datasets. It allows researchers to understand the latent factors of the data, which are linear or non-linear combinations of the variables. The Dynamic Intrinsic Conditional Autocorrelative Priors (ICAR) Spatiotemporal Factor Models 'DIFM' package provides functions to run Markov Chain Monte Carlo (MCMC), evaluation methods and visual plots from Shin and Ferreira (2023)<doi:10.1016/j.spasta.2023.100763>. Our method is a class of Bayesian factor model which can account for spatial and temporal correlations. By incorporating these correlations, the model can capture specific behaviors and provide predictions.
Maintained by Hwasoo Shin. Last updated 11 months ago.
7.3 match 2.00 score 2 scripts
american-institutes-for-research
EdSurvey:Analysis of NCES Education Survey and Assessment Data
Read in and analyze functions for education survey and assessment data from the National Center for Education Statistics (NCES) <https://nces.ed.gov/>, including National Assessment of Educational Progress (NAEP) data <https://nces.ed.gov/nationsreportcard/> and data from the International Assessment Database: Organisation for Economic Co-operation and Development (OECD) <https://www.oecd.org/en/about/directorates/directorate-for-education-and-skills.html>, including Programme for International Student Assessment (PISA), Teaching and Learning International Survey (TALIS), Programme for the International Assessment of Adult Competencies (PIAAC), and International Association for the Evaluation of Educational Achievement (IEA) <https://www.iea.nl/>, including Trends in International Mathematics and Science Study (TIMSS), TIMSS Advanced, Progress in International Reading Literacy Study (PIRLS), International Civic and Citizenship Study (ICCS), International Computer and Information Literacy Study (ICILS), and Civic Education Study (CivEd).
Maintained by Paul Bailey. Last updated 17 days ago.
1.8 match 10 stars 7.86 score 139 scripts 1 dependent
michaelwrobbins
microsynth:Synthetic Control Methods with Micro- And Meso-Level Data
A generalization of the 'Synth' package that is designed for data at a more granular level (e.g., micro-level). Provides functions to construct weights (including propensity score-type weights) and run analyses for synthetic control methods with micro- and meso-level data; see Robbins, Saunders, and Kilmer (2017) <doi:10.1080/01621459.2016.1213634> and Robbins and Davenport (2021) <doi:10.18637/jss.v097.i02>.
Maintained by Michael Robbins. Last updated 2 years ago.
3.6 match 1 star 3.71 score 34 scripts
cran
SmoothTensor:A Collection of Smooth Tensor Estimation Methods
A list of methods for estimating a smooth tensor with an unknown permutation. It also contains several multi-variate functions for generating permuted signal tensors and corresponding observed tensors. For a detailed introduction for the model and estimation techniques, see the paper by Chanwoo Lee and Miaoyan Wang (2021) "Smooth tensor estimation with unknown permutations" <arXiv:2111.04681>.
Maintained by Chanwoo Lee. Last updated 3 years ago.
7.2 match 1.70 score
fernandotusell
cat:Analysis and Imputation of Categorical-Variable Datasets with Missing Values
Performs analysis of categorical-variable datasets with missing values. Implements methods from Schafer, JL, Analysis of Incomplete Multivariate Data, Chapman and Hall.
Maintained by Fernando Tusell. Last updated 2 years ago.
3.6 match 3.27 score 52 scripts 2 dependents
tyee001
VGAMdata:Data Supporting the 'VGAM' Package
Mainly data sets to accompany the VGAM package and the book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7>. These are used to illustrate vector generalized linear and additive models (VGLMs/VGAMs), and associated models (Reduced-Rank VGLMs, Quadratic RR-VGLMs, Row-Column Interaction Models, and constrained and unconstrained ordination models in ecology). This package now contains some old VGAM family functions which have been replaced by newer ones (often because they are now special cases).
Maintained by Thomas Yee. Last updated 1 month ago.
3.8 match 1 star 2.94 score 95 scripts 1 dependent
cran
vannstats:Simplified Statistical Procedures for Social Sciences
Simplifies functions that assess normality for bivariate and multivariate statistical techniques. Includes functions designed to replicate plots and tables that would result from similar calls in 'SPSS', including hst(), box(), qq(), tab(), cormat(), and residplot(). Also includes simplified formulae, such as mode(), scatter(), p.corr(), ow.anova(), and rm.anova().
Maintained by Burrel Vann Jr. Last updated 2 months ago.
3.5 match 3.06 score
avdrark
cmm:Categorical Marginal Models
Quite extensive package for maximum likelihood estimation and weighted least squares estimation of categorical marginal models (CMMs; e.g., Bergsma and Rudas, 2002, <http://www.jstor.org/stable/2700006>; Bergsma, Croon and Hagenaars, 2009, <DOI:10.1007/b12532>).
Maintained by L. A. van der Ark. Last updated 2 years ago.
3.6 match 2.73 score 25 scripts 4 dependents
jacobkap
crimeutils:A Comprehensive Set of Functions to Clean, Analyze, and Present Crime Data
A collection of functions that make it easier to understand crime (or other) data, and assist others in understanding it. The package helps you read data from various sources, clean it, fix column names, and graph the data.
Maintained by Jacob Kaplan. Last updated 2 years ago.
3.3 match 1 star 2.78 score 12 scripts
guangbaog
COR:The COR for Optimal Subset Selection in Distributed Estimation
An algorithm of optimal subset selection, related to Covariance matrices, observation matrices and Response vectors (COR) to select the optimal subsets in distributed estimation. The philosophy of the package is described in Guo G. (2024) <doi:10.1007/s11222-024-10471-z>.
Maintained by Guangbao Guo. Last updated 3 months ago.
3.8 match 2.41 score 13 scripts
cran
SDAResources:Datasets and Functions for 'Sampling: Design and Analysis, 3rd Edition'
Includes all the datasets of 'Sampling: Design and Analysis' (3rd edition by Sharon Lohr) in R format and additional functions for analyzing and graphing probability samples.
Maintained by Yan Lu. Last updated 3 years ago.
4.5 match 2.00 score
prabhanjan-tattar
ACSWR:A Companion Package for the Book "A Course in Statistics with R"
A book designed to meet the requirements of masters students. Tattar, P.N., Suresh, R., and Manjunath, B.G. "A Course in Statistics with R", J. Wiley, ISBN 978-1-119-15272-9.
Maintained by Prabhanjan Tattar. Last updated 10 years ago.
4.0 match 2.03 score 106 scripts
williamqjw
lmreg:Data and Functions Used in Linear Models and Regression with R: An Integrated Approach
Data files and a few functions used in the book 'Linear Models and Regression with R: An Integrated Approach' by Debasis Sengupta and Sreenivas Rao Jammalamadaka (2019).
Maintained by Jinwen Qiu. Last updated 6 years ago.
3.8 match 2.06 score 116 scripts
elipousson
mapbaltimore:Make maps for Baltimore City with open data
This package provides data from Baltimore City, the state of Maryland, and other sources, functions to access additional data, and functions to create and modify simple maps of Baltimore neighborhoods using sf and ggplot2.
Maintained by Eli Pousson. Last updated 4 months ago.
1.8 match 17 stars 3.85 score 14 scripts
cran
addScales:Adds Labeled Center Line and Scale Lines/Regions to Trellis Plots
Modifies trellis objects by adding horizontal and/or vertical reference lines or shaded regions that provide visual scaling information. This is mostly useful in multi-panel plots that use the relation = 'free' option in their 'scales' argument list.
Maintained by Bert Gunter. Last updated 5 years ago.
3.4 match 2.00 score
cran
pdR:Threshold Model and Unit Root Tests in Cross-Section and Time Series Data
Threshold model, panel version of Hylleberg et al. (1990) <DOI:10.1016/0304-4076(90)90080-D> seasonal unit root tests, and panel unit root test of Chang (2002) <DOI:10.1016/S0304-4076(02)00095-7>.
Maintained by Ho Tsung-wu. Last updated 7 months ago.
3.6 match 4 stars 1.90 score
prabhanjan-tattar
gpk:100 Data Sets for Statistics Education
Collection of datasets as prepared by Profs. A.P. Gore, S.A. Paranjape, and M.B. Kulkarni of the Department of Statistics, Poona University, India. With their permission, the first letters of their names form the name of this package; the package has been built by me and made available for the benefit of R users. This collection requires a rich class of models and can be a very useful building block for a beginner.
Maintained by Prabhanjan Tattar. Last updated 12 years ago.
4.0 match 1.69 score 49 scripts
dcwheels
gwrr:Fits Geographically Weighted Regression Models with Diagnostic Tools
Fits geographically weighted regression (GWR) models and has tools to diagnose and remediate collinearity in the GWR models. Also fits geographically weighted ridge regression (GWRR) and geographically weighted lasso (GWL) models. See Wheeler (2009) <doi:10.1068/a40256> and Wheeler (2007) <doi:10.1068/a38325> for more details.
Maintained by David Wheeler. Last updated 3 years ago.
4.5 match 2 stars 1.41 score 13 scripts
nseg4
durhamSLR:The durhamSLR package
Data for Statistical Learning modules at Durham University.
Maintained by Sarah Heaps. Last updated 2 years ago.
3.5 match 1.70 score
kjetil1001
SenSrivastava:Datasets from Sen & Srivastava
Collection of datasets from Sen & Srivastava: "Regression Analysis, Theory, Methods and Applications", Springer. Sources for individual data files are more fully documented in the book.
Maintained by Kjetil B Halvorsen. Last updated 1 year ago.
3.3 match 1.76 score 57 scripts
cran
fairml:Fair Models in Machine Learning
Fair machine learning regression models which take sensitive attributes into account in model estimation. Currently implementing Komiyama et al. (2018) <http://proceedings.mlr.press/v80/komiyama18a/komiyama18a.pdf>, Zafar et al. (2019) <https://www.jmlr.org/papers/volume20/18-262/18-262.pdf> and my own approach from Scutari, Panero and Proissl (2022) <https://link.springer.com/content/pdf/10.1007/s11222-022-10143-w.pdf> that uses ridge regression to enforce fairness.
Maintained by Marco Scutari. Last updated 2 years ago.
3.8 match 1 star 1.52 score 1 dependent
gianmarcoalberti
caplot:Correspondence Analysis with Geometric Frequency Interpretation
Performs Correspondence Analysis on the given dataframe and plots the results in a scatterplot that emphasizes the geometric interpretation aspect of the analysis, following Borg-Groenen (2005) and Yelland (2010). It is particularly useful for highlighting the relationships between a selected row (or column) category and the column (or row) categories. See Borg-Groenen (2005, ISBN:978-0-387-28981-6); Yelland (2010) <doi:10.3888/tmj.12-4>.
Maintained by Gianmarco Alberti. Last updated 2 years ago.
3.3 match 1.70 score 1 script
igorlaltuf
ispdata:Access Data from the Public Security Institute of the State of Rio De Janeiro
Allows access to data from the Rio de Janeiro Public Security Institute (ISP), such as criminal statistics, data on gun seizures and femicide. The package also contains the spatial data of Pacifying Police Units (UPPs) and Integrated Public Safety Regions, Areas and Circumscriptions.
Maintained by Igor Laltuf. Last updated 1 month ago.
1.7 match 4 stars 3.30 score 4 scripts
jacobkap
UCR.ColumnNames:Fixes Column Names for Uniform Crime Report "Offenses Known and Clearance by Arrest" Datasets
Changes the column names of the inputted dataset to the correct names from the Uniform Crime Report codebook for the "Offenses Known and Clearance by Arrest" datasets from 1998-2014.
Maintained by Jacob Kaplan. Last updated 8 years ago.
5.0 match 1.00 score 1 script
xuwenzhu20
MatTransMix:Clustering with Matrix Gaussian and Matrix Transformation Mixture Models
Provides matrix Gaussian mixture models, matrix transformation mixture models and their model-based clustering results. The parsimonious models of the mean matrices and variance covariance matrices are implemented with a total of 196 variations. For more information, please check: Xuwen Zhu, Shuchismita Sarkar, and Volodymyr Melnykov (2021), "MatTransMix: an R package for matrix model-based clustering and parsimonious mixture modeling", <doi:10.1007/s00357-021-09401-9>.
Maintained by Xuwen Zhu. Last updated 2 months ago.
4.5 match 1.00 score
cran
pointdensityP:Point Density for Geospatial Data
The function pointdensity returns a density count and the temporal average for every point in the original list. The dataframe returned includes four columns: lat, lon, count, and date_avg. The "lat" column is the original latitude data; the "lon" column is the original longitude data; the "count" is the density count of the number of points within a radius of radius*grid_size (the neighborhood); and the date_avg column includes the average date of each point in the neighborhood.
Maintained by Paul Evangelista. Last updated 4 years ago.
4.0 match 1 star 1.00 score
cran
mogavs:Multiobjective Genetic Algorithm for Variable Selection in Regression
Functions for exploring the best subsets in regression with a genetic algorithm. The package is much faster than methods relying on complete enumeration, and is suitable for data sets with a large number of variables. For more information, see Sinha, Malo & Kuosmanen (2015) <doi:10.1080/10618600.2014.899236>.
Maintained by Tommi Pajala. Last updated 7 years ago.
3.6 match 1.00 score
mpjashby
sfhotspot:Hot-Spot Analysis with Simple Features
Identify and understand clusters of points (typically representing the locations of places or events) stored in simple-features (SF) objects. This is useful for analysing, for example, hot-spots of crime events. The package emphasises producing results from point SF data in a single step using reasonable default values for all other arguments, to aid rapid data analysis by users who are starting out. Functions available include kernel density estimation (for details, see Yip (2020) <doi:10.22224/gistbok/2020.1.12>), analysis of spatial association (Getis and Ord (1992) <doi:10.1111/j.1538-4632.1992.tb00261.x>) and hot-spot classification (Chainey (2020) ISBN:158948584X).
Maintained by Matt Ashby. Last updated 25 days ago.
hotspot hotspots hotspots-analysis mapping mapping-tools
0.5 match 12 stars 5.56 score 30 scripts
imranshakoor
countDM:Estimation of Count Data Models
Maximum likelihood estimation (MLE) of count data models is provided, along with standard errors of the estimates and the Akaike information model selection criterion. The functions compute the MLE for the following distributions: the Bell distribution, the Borel distribution, the Poisson distribution, zero inflated Bell distribution, zero inflated Bell Touchard distribution, zero inflated Poisson distribution, zero one inflated Bell distribution and zero one inflated Poisson distribution. Moreover, the probability mass function (PMF), distribution function (CDF), quantile function (QF) and random number generation of the Bell Touchard and zero inflated Bell Touchard distributions are also provided.
Maintained by Muhammad Imran. Last updated 2 years ago.
1.7 match 1 star 1.00 score
imranshakoor
DDPM:Data Sets for Discrete Probability Models
A wide collection of univariate discrete data sets from various applied domains related to distribution theory. The functions allow quick, easy, and efficient access to 100 univariate discrete data sets. The data are related to different applied domains, including medical, reliability analysis, engineering, manufacturing, occupational safety, geological sciences, terrorism, psychology, agriculture, environmental sciences, road traffic accidents, demography, actuarial science, law, and justice. The documentation, along with associated references for further details and uses, is presented.
Maintained by Muhammad Imran. Last updated 2 years ago.
1.7 match 1.00 score
ropengov
oldbailey:For Accessing the Old Bailey Open Data
Fetch trial data from the Old Bailey Online API <https://www.oldbaileyonline.org/static/DocAPI.jsp>. Data is returned in an analysis-ready data frame with fields for metadata including (but not limited to) the names of the first person speakers, defendants, victims, their recorded genders, verdicts, punishments, crime locations, and dates. Optional parameters allow users to specify the number of results, whether these results contain key terms, and trial dates.
Maintained by Steph Buongiorno. Last updated 2 years ago.
0.5 match 3 stars 2.18 score