Showing 47 of 47 results
s87jackson
rfars:Download and Analyze Crash Data
Download crash data from the National Highway Traffic Safety Administration and prepare it for research.
Maintained by Steve Jackson. Last updated 12 months ago.
crash fatalities official-statistics transportation
44.7 match 10 stars 5.35 score 15 scripts
ropensci
stats19:Work with Open Road Traffic Casualty Data from Great Britain
Tools to help download, process and analyse the UK road collision data collected using the 'STATS19' form. The datasets are provided as 'CSV' files with detailed road safety information about the circumstances of car crashes and other incidents on the roads resulting in casualties in Great Britain from 1979 to present. Tables are available on 'collisions' with the circumstances (e.g. speed limit of road), information about 'vehicles' involved (e.g. type of vehicle), and 'casualties' (e.g. age). The statistics relate only to events on public roads that were reported to the police, and subsequently recorded, using the 'STATS19' collision reporting form. See the Department for Transport website <https://www.data.gov.uk/dataset/cb7ae6f0-4be6-4935-9277-47e5ce24a11f/road-accidents-safety-data> for more information on these datasets. The package is described in a paper in the Journal of Open Source Software (Lovelace et al. 2019) <doi:10.21105/joss.01181>. See Gilardi et al. (2022) <doi:10.1111/rssa.12823>, Vidal-Tortosa et al. (2021) <doi:10.1016/j.jth.2021.101291>, and Tait et al. (2023) <doi:10.1016/j.aap.2022.106895> for examples of how the data can be used for methodological and empirical road safety research.
Maintained by Robin Lovelace. Last updated 2 months ago.
stats19 road-safety transport car-crashes ropensci data
17.0 match 64 stars 9.20 score 193 scripts
gavinrozzi
njtr1:Download, Analyze & Clean New Jersey Car Crash Data
Download and analyze motor vehicle crash data released by the New Jersey Department of Transportation (NJDOT). The data in this package is collected through the filing of the NJTR-1 form by police officers, which provides a standardized way of documenting a motor vehicle crash that occurred in New Jersey. Three different data tables containing data on crashes, vehicles & pedestrians released from 2001 to the present can be downloaded & cleaned using this package.
Maintained by Gavin Rozzi. Last updated 1 year ago.
njtr1 new-jersey road-safety car-crashes car-accidents data
12.8 match 5 stars 4.40 score 7 scripts
bioc
ORFik:Open Reading Frames in Genomics
R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed; as the name hints, it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package allows reassigning the starts of transcripts using CAGE-Seq data, automatic shifting of Ribo-Seq reads, finding Open Reading Frames for whole genomes, and much more.
Maintained by Haakon Tjeldnes. Last updated 28 days ago.
immunooncology software sequencing riboseq rnaseq functionalgenomics coverage alignment dataimport cpp
5.0 match 33 stars 10.63 score 115 scripts 2 dependents
njtierney
ozroaddeaths:Pulls Data From Australian Road Deaths Database
ozroaddeaths is a package that pulls data from the Australian Road Deaths Database, run by the Bureau of Infrastructure, Transport and Regional Economics (BITRE). This provides basic details of road transport crash fatalities in Australia as reported by the police each month to the State and Territory road safety authorities. The details provided in the database fall into two groups: 1) the circumstances of the crash, for example, date, location, crash type; 2) some details regarding the persons killed, for example, age, gender and road user group.
Maintained by Nicholas Tierney. Last updated 5 months ago.
10.9 match 6 stars 4.62 score 7 scripts
friendly
vcdExtra:'vcd' Extensions and Additions
Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.
Maintained by Michael Friendly. Last updated 5 months ago.
categorical-data-visualization generalized-linear-models mosaic-plots
4.0 match 24 stars 10.34 score 472 scripts 3 dependents
elipousson
crashapi:CrashAPI
Get Fatality Analysis Reporting System (FARS) data with the FARS API from the U.S. National Highway Traffic Safety Administration (NHTSA).
Maintained by Eli Pousson. Last updated 6 months ago.
12.7 match 17 stars 3.23 score 7 scripts
wlandau
crew:A Distributed Worker Launcher Framework
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The 'NNG'-powered 'mirai' R package by Gao (2023) <doi:10.5281/zenodo.7912722> is a sleek and sophisticated scheduler that efficiently processes these intense workloads. The 'crew' package extends 'mirai' with a unifying interface for third-party worker launchers. Inspiration also comes from the packages 'future' by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, 'rrq' by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, 'clustermq' by Schubert (2019) <doi:10.1093/bioinformatics/btz284>, and 'batchtools' by Lang, Bischl, and Surmann (2017) <doi:10.21105/joss.00135>.
Maintained by William Michael Landau. Last updated 2 days ago.
3.5 match 136 stars 11.19 score 243 scripts 2 dependents
rudeboybert
fivethirtyeight:Data and Code Behind the Stories and Interactives at 'FiveThirtyEight'
Datasets and code published by the data journalism website 'FiveThirtyEight' available at <https://github.com/fivethirtyeight/data>. Note that while we received guidance from editors at 'FiveThirtyEight', this package is not officially published by 'FiveThirtyEight'.
Maintained by Albert Y. Kim. Last updated 2 years ago.
data-science datajournalism fivethirtyeight statistics
3.5 match 453 stars 10.98 score 1.7k scripts
moderndive
moderndive:Tidyverse-Friendly Introductory Linear Regression
Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.
Maintained by Albert Y. Kim. Last updated 3 months ago.
3.3 match 88 stars 11.35 score 1.8k scripts
schochastics
networkdata:Repository of Network Datasets
The package contains a large collection of network datasets from different contexts. This includes social networks, animal networks and movie networks. All datasets are in 'igraph' format.
Maintained by David Schoch. Last updated 12 months ago.
6.6 match 143 stars 5.01 score 143 scripts
alanarnholt
BSDA:Basic Statistics and Data Analysis
Data sets for the book "Basic Statistics and Data Analysis" by Larry J. Kitchens.
Maintained by Alan T. Arnholt. Last updated 2 years ago.
3.3 match 7 stars 9.11 score 1.3k scripts 6 dependents
tidyverse
lubridate:Make Dealing with Dates a Little Easier
Functions to work with date-times and time-spans: fast and user friendly parsing of date-time data, extraction and updating of components of a date-time (years, months, days, hours, minutes, and seconds), algebraic manipulation on date-time and time-span objects. The 'lubridate' package has a consistent and memorable syntax that makes working with dates easy and fun.
Maintained by Vitalie Spinu. Last updated 3 months ago.
1.2 match 757 stars 20.95 score 135k scripts 1.9k dependents
jhmaindonald
gamclass:Functions and Data for a Course on Modern Regression and Classification
Functions and data are provided that support a course that emphasizes statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.
Maintained by John Maindonald. Last updated 2 years ago.
5.2 match 4.82 score 44 scripts
r-forge
Sleuth3:Data Sets from Ramsey and Schafer's "Statistical Sleuth (3rd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2013), "The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)", Cengage Learning.
Maintained by Berwin A Turlach. Last updated 1 year ago.
3.6 match 6.38 score 522 scripts
pachadotdev
gravity:Estimation Methods for Gravity Models
A wrapper of different standard estimation methods for gravity models. This package provides estimation methods for log-log models and multiplicative models.
Maintained by Mauricio Vargas. Last updated 4 months ago.
bvu bvw ddm econometrics glm gpml gravity international-trade lm maximum-likelihood nbpml nls ols ppml sils tobit trade
3.1 match 35 stars 6.98 score 55 scripts
r-forge
Sleuth2:Data Sets from Ramsey and Schafer's "Statistical Sleuth (2nd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2002), "The Statistical Sleuth: A Course in Methods of Data Analysis (2nd ed)", Duxbury.
Maintained by Berwin A Turlach. Last updated 1 year ago.
3.6 match 5.70 score 191 scripts
magnusdv
dvir:Disaster Victim Identification
Joint DNA-based disaster victim identification (DVI), as described in Vigeland and Egeland (2021) <doi:10.21203/rs.3.rs-296414/v1>. Identification is performed by optimising the joint likelihood of all victim samples and reference individuals. Individual identification probabilities, conditional on all available information, are derived from the joint solution in the form of posterior pairing probabilities. 'dvir' is part of the 'pedsuite' collection of packages for pedigree analysis.
Maintained by Magnus Dehli Vigeland. Last updated 3 months ago.
3.6 match 3 stars 5.05 score 21 scripts 1 dependents
goepp
aspline:Spline Regression with Adaptive Knot Selection
Perform one-dimensional spline regression with automatic knot selection. This package uses a penalized approach to select the most relevant knots. B-splines of any degree can be fitted. More details in 'Goepp et al. (2018)', "Spline Regression with Automatic Knot Selection", <arXiv:1808.01770>.
Maintained by Vivien Goepp. Last updated 3 years ago.
adaptive-splines fitting-splines knots regression-spline cpp
4.0 match 6 stars 4.52 score 11 scripts
vathymut
dsos:Dataset Shift with Outlier Scores
Test for no adverse shift in two-sample comparison when we have a training set, the reference distribution, and a test set. The approach is flexible and relies on a robust and powerful test statistic, the weighted AUC. Technical details are in Kamulete, V. M. (2021) <arXiv:1908.04000>. Modern notions of outlyingness such as trust scores and prediction uncertainty can be used as the underlying scores for example.
Maintained by Vathy M. Kamulete. Last updated 2 years ago.
data-drift data-validation dataset-shifts drift-detection machine-learning mlops model-monitoring model-validation performance-monitoring statistical-process-control statistical-tests
3.1 match 2 stars 5.08 score 40 scripts
immunomind
immunarch:Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires
A comprehensive framework for bioinformatics exploratory analysis of bulk and single-cell T-cell receptor and antibody repertoires. It provides seamless data loading, analysis and visualisation for AIRR (Adaptive Immune Receptor Repertoire) data, both bulk immunosequencing (RepSeq) and single-cell sequencing (scRNAseq). Immunarch implements most of the widely used AIRR analysis methods, such as: clonality analysis, estimation of repertoire similarities in distribution of clonotypes and gene segments, repertoire diversity analysis, annotation of clonotypes using external immune receptor databases and clonotype tracking in vaccination and cancer studies. A successor to our previously published 'tcR' immunoinformatics package (Nazarov 2015) <doi:10.1186/s12859-015-0613-1>.
Maintained by Vadim I. Nazarov. Last updated 12 months ago.
airr-analysis b-cell-receptor bcr bcr-repertoire bioinformatics ig ig-repertoire immune-repertoire immune-repertoire-analysis immune-repertoire-data immunoglobulin immunoinformatics immunology rep-seq repertoire-analysis single-cell single-cell-analysis t-cell-receptor tcr tcr-repertoire cpp
1.6 match 315 stars 9.49 score 203 scripts
oscarkjell
text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning
Link R with Transformers from Hugging Face to transform text variables to word embeddings, where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions, etc. For more information see <https://www.r-text.org>.
Maintained by Oscar Kjell. Last updated 4 days ago.
deep-learning machine-learning nlp transformers openjdk
1.2 match 146 stars 13.16 score 436 scripts 1 dependents
cran
elrm:Exact Logistic Regression via MCMC
Implements a Markov Chain Monte Carlo algorithm to approximate exact conditional inference for logistic regression models. Exact conditional inference is based on the distribution of the sufficient statistics for the parameters of interest given the sufficient statistics for the remaining nuisance parameters. Using model formula notation, users specify a logistic model and model terms of interest for exact inference. See Zamar et al. (2007) <doi:10.18637/jss.v021.i03> for more details.
Maintained by David Zamar. Last updated 3 months ago.
5.6 match 2.56 score 12 scripts 1 dependents
gbasulto
cureplots:CURE (Cumulative Residual) Plots
Creates 'ggplot2' Cumulative Residual (CURE) plots to check the goodness-of-fit of a count model; or the tables to create a customized version. A dataset of crashes in Washington state is available for illustrative purposes.
Maintained by Guillermo Basulto-Elias. Last updated 2 months ago.
4.5 match 1 stars 3.08 score 12 scripts
hugomflavio
actel:Acoustic Telemetry Data Analysis
Designed for studies where animals tagged with acoustic tags are expected to move through receiver arrays. This package combines the advantages of automatic sorting and checking of animal movements with the possibility for user intervention on tags that deviate from expected behaviour. The three analysis functions (explore(), migration() and residency()) allow the users to analyse their data in a systematic way, making it easy to compare results from different studies. CJS calculations are based on Perry et al. (2012) <https://www.researchgate.net/publication/256443823_Using_mark-recapture_models_to_estimate_survival_from_telemetry_data>.
Maintained by Hugo Flávio. Last updated 24 days ago.
acoustic-telemetry fish-packages telemetry-data
1.8 match 26 stars 6.49 score 16 scripts
tyee001
VGAMdata:Data Supporting the 'VGAM' Package
Mainly data sets to accompany the VGAM package and the book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7>. These are used to illustrate vector generalized linear and additive models (VGLMs/VGAMs), and associated models (Reduced-Rank VGLMs, Quadratic RR-VGLMs, Row-Column Interaction Models, and constrained and unconstrained ordination models in ecology). This package now contains some old VGAM family functions which have been replaced by newer ones (often because they are now special cases).
Maintained by Thomas Yee. Last updated 1 month ago.
3.5 match 1 stars 2.94 score 95 scripts 1 dependents
cran
fipp:Induced Priors in Bayesian Mixture Models
Computes implicitly induced quantities from prior/hyperparameter specifications of three Mixtures of Finite Mixtures models: Dirichlet Process Mixtures (DPMs; Escobar and West (1995) <doi:10.1080/01621459.1995.10476550>), Static Mixtures of Finite Mixtures (Static MFMs; Miller and Harrison (2018) <doi:10.1080/01621459.2016.1255636>), and Dynamic Mixtures of Finite Mixtures (Dynamic MFMs; Frühwirth-Schnatter, Malsiner-Walli and Grün (2020) <arXiv:2005.09918>). For methodological details, please refer to Greve, Grün, Malsiner-Walli and Frühwirth-Schnatter (2020) <arXiv:2012.12337> as well as the package vignette.
Maintained by Jan Greve. Last updated 4 years ago.
3.3 match 2.70 score
brodieg
unitizer:Interactive R Unit Tests
Simplifies regression tests by comparing objects produced by test code with earlier versions of those same objects. If objects are unchanged the tests pass, otherwise execution stops with error details. If in interactive mode, tests can be reviewed through the provided interactive environment.
Maintained by Brodie Gaslam. Last updated 10 months ago.
1.3 match 39 stars 7.18 score 84 scripts
cran
DOS2:Design of Observational Studies, Companion to the Second Edition
Contains data sets, examples and software from the Second Edition of "Design of Observational Studies"; see Rosenbaum, P.R. (2010) <doi:10.1007/978-1-4419-1213-8>.
Maintained by Paul Rosenbaum. Last updated 6 years ago.
3.8 match 2 stars 2.24 score 29 scripts 1 dependents
cjgeyer
potts:Markov Chain Monte Carlo for Potts Models
Do Markov chain Monte Carlo (MCMC) simulation of Potts models (Potts, 1952, <doi:10.1017/S0305004100027419>), which are the multi-color generalization of Ising models (so, as a special case, also simulates Ising models). Use the Swendsen-Wang algorithm (Swendsen and Wang, 1987, <doi:10.1103/PhysRevLett.58.86>) so MCMC is fast. Do maximum composite likelihood estimation of parameters (Besag, 1975, <doi:10.2307/2987782>, Lindsay, 1988, <doi:10.1090/conm/080>).
Maintained by Charles J. Geyer. Last updated 3 years ago.
3.3 match 2.48 score 30 scripts
cran
discSurv:Discrete Time Survival Analysis
Provides data transformations, estimation utilities, predictive evaluation measures and simulation functions for discrete time survival analysis.
Maintained by Thomas Welchowski. Last updated 3 years ago.
3.6 match 2 stars 2.11 score 64 scripts
stan-dev
cmdstanr:R Interface to 'CmdStan'
A lightweight interface to 'Stan' <https://mc-stan.org>. The 'CmdStanR' interface is an alternative to 'RStan' that calls the command line interface for compilation and running algorithms instead of interfacing with C++ via 'Rcpp'. This has many benefits including always being compatible with the latest version of Stan, fewer installation errors, fewer unexpected crashes in RStudio, and a more permissive license.
Maintained by Andrew Johnson. Last updated 9 months ago.
bayes bayesian markov-chain-monte-carlo maximum-likelihood mcmc stan variational-inference
0.5 match 145 stars 12.27 score 5.2k scripts 9 dependents
cdueben
cppcontainers:'C++' Standard Template Library Containers
Use 'C++' Standard Template Library containers interactively in R. Includes sets, unordered sets, multisets, unordered multisets, maps, unordered maps, multimaps, unordered multimaps, stacks, queues, priority queues, vectors, deques, forward lists, and lists.
Maintained by Christian Düben. Last updated 2 months ago.
1.3 match 4.70 score 1 scripts
bioc
combi:Compositional omics model based visual integration
This explorative ordination method combines quasi-likelihood estimation, compositional regression models and latent variable models for integrative visualization of several omics datasets. Both unconstrained and constrained integration are available. The results are shown as interpretable, compositional multiplots.
Maintained by Stijn Hawinkel. Last updated 5 months ago.
metagenomics dimensionreduction microbiome visualization metabolomics
1.3 match 1 stars 4.48 score 7 scripts
djnavarro
queue:Simple Multi-Threaded Task Queuing
Implements a simple multi-threaded task queue using R6 classes.
Maintained by Danielle Navarro. Last updated 2 years ago.
1.5 match 6 stars 3.97 score 31 scripts
r-lib
liteq:Lightweight Portable Message Queue Using 'SQLite'
Temporary and permanent message queues for R. Built on top of 'SQLite' databases. 'SQLite' provides locking, and makes it possible to detect crashed consumers. Crashed jobs can be automatically marked as "failed", or put in the queue again, potentially a limited number of times.
Maintained by Gábor Csárdi. Last updated 4 months ago.
0.8 match 57 stars 5.91 score 19 scripts 1 dependents
matthewblackwell
Amelia:A Program for Missing Data
A tool that "multiply imputes" missing data in a single cross-section (such as a survey), from a time series (like variables collected for each year in a country), or from a time-series-cross-sectional data set (such as collected by years for each of several countries). Amelia II implements our bootstrapping-based algorithm that gives essentially the same answers as the standard IP or EMis approaches, is usually considerably faster than existing approaches and can handle many more variables. Unlike Amelia I and other statistically rigorous imputation software, it virtually never crashes (but please let us know if you find to the contrary!). The program also generalizes existing approaches by allowing for trends in time series across observations within a cross-sectional unit, as well as priors that allow experts to incorporate beliefs they have about the values of missing cells in their data. Amelia II also includes useful diagnostics of the fit of multiple imputation models. The program works from the R command line or via a graphical user interface that does not require users to know R.
Maintained by Matthew Blackwell. Last updated 4 months ago.
0.5 match 1 stars 9.06 score 1.4k scripts 7 dependents
elipousson
mapmaryland:Easy Access to Maryland Spatial Data
A small collection of data sources and utility functions for working with state and county data sources in Maryland.
Maintained by Eli Pousson. Last updated 5 months ago.
1.8 match 3 stars 2.48 score 4 scripts
cedricbriandgithub
stacomiR:Fish Migration Monitoring
Graphical outputs and treatment for a database of fish pass monitoring. It is a part of the 'STACOMI' open source project developed in France by the French Office for Biodiversity institute to centralize data obtained by fish pass monitoring. This version is available in French and English. See <http://stacomir.r-forge.r-project.org/> for more information on 'STACOMI'.
Maintained by Cedric Briand. Last updated 1 year ago.
1.6 match 1 stars 2.43 score 27 scripts
dime-worldbank
ulex:Unique Location Extractor
Extracts coordinates of an event location from text based on dictionaries of landmarks, roads, and areas. Only returns the location of an event of interest and ignores other location references; for example, if determining the location of a road traffic crash from the text "crash near [location 1] heading towards [location 2]", only the coordinates of "location 1" would be returned. Moreover, accounts for differences in spelling between how a user references a location and how a location is captured in location dictionaries. For more information on the algorithm, see Milusheva et al. (2021) <doi:10.1371/journal.pone.0244317>.
Maintained by Robert Marty. Last updated 9 months ago.
0.8 match 1 stars 3.18 score 8 scripts
wlandau
autometric:Background Resource Logging
Intense parallel workloads can be difficult to monitor. Packages 'crew.cluster', 'clustermq', and 'future.batchtools' distribute hundreds of worker processes over multiple computers. If a worker process exhausts its available memory, it may terminate silently, leaving the underlying problem difficult to detect or troubleshoot. Using the 'autometric' package, a worker can proactively monitor itself in a detached background thread. The worker process itself runs normally, and the thread writes to a log every few seconds. If the worker terminates unexpectedly, 'autometric' can read and visualize the log file to reveal potential resource-related reasons for the crash. The 'autometric' package borrows heavily from the methods of packages 'ps' <doi:10.32614/CRAN.package.ps> and 'psutil'.
Maintained by William Michael Landau. Last updated 4 months ago.
0.5 match 7 stars 4.38 score 9 scripts
gitumino
fitPoly:Genotype Calling for Bi-Allelic Marker Assays
Genotyping assays for bi-allelic markers (e.g. SNPs) produce signal intensities for the two alleles. 'fitPoly' assigns genotypes (allele dosages) to a collection of polyploid samples based on these signal intensities. 'fitPoly' replaces the older package 'fitTetra' that was limited (a.o.) to only tetraploid populations whereas 'fitPoly' accepts any ploidy level. Reference: Voorrips RE, Gort G, Vosman B (2011) <doi:10.1186/1471-2105-12-172>. New functions added on conversion of data from SNP array software formats, drawing of XY-scatterplots with or without genotype colors, checking against expected F1 segregation patterns, comparing results from two different assays (probes) for the same SNP, recovery from a saveMarkerModels() crash.
Maintained by Giorgio Tumino. Last updated 1 month ago.
0.5 match 4.13 score 15 scripts