Showing 126 of 126 results
gesistsa
rio: A Swiss-Army Knife for Data I/O
Streamlined data import and export by making assumptions that the user is probably willing to make: 'import()' and 'export()' determine the data format from the file extension, reasonable defaults are used for data import and export, web-based import is natively supported (including from SSL/HTTPS), compressed files can be read directly, and fast import packages are used where appropriate. An additional convenience function, 'convert()', provides a simple method for converting between file types.
Maintained by Chung-hong Chan. Last updated 3 months ago.
csv, csvy, data, data-science, excel, io, rio, sas, spss, stata
610 stars 17.10 score 7.8k scripts 74 dependents
easystats
datawizard: Easy Data Wrangling and Statistical Transformations
A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.
Maintained by Etienne Bacher. Last updated 3 days ago.
data, dplyr, hacktoberfest, janitor, manipulation, reshape, tidyr, wrangling
223 stars 14.77 score 436 scripts 120 dependents
ropensci
taxize: Taxonomic Information from Around the Web
Interacts with a suite of web application programming interfaces (API) for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more. Some of the services supported include 'NCBI E-utilities' (<https://www.ncbi.nlm.nih.gov/books/NBK25501/>), 'Encyclopedia of Life' (<https://eol.org/docs/what-is-eol/data-services>), 'Global Biodiversity Information Facility' (<https://techdocs.gbif.org/en/openapi/>), and many more. Links to the API documentation for other supported services are available in the documentation for their respective functions in this package.
Maintained by Zachary Foster. Last updated 25 days ago.
taxonomy, biology, nomenclature, json, api, web, api-client, identifiers, species, names, api-wrapper, biodiversity, darwincore, data, taxize
274 stars 13.63 score 1.6k scripts 23 dependents
ropensci
rgbif: Interface to the Global Biodiversity Information Facility API
A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/developer/summary>). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.
Maintained by John Waller. Last updated 16 days ago.
gbif, specimens, api, web-services, occurrences, species, taxonomy, biodiversity, data, lifewatch, oscibio, spocc
161 stars 13.26 score 2.1k scripts 20 dependents
openintrostat
openintro: Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs
Supplemental functions and data for 'OpenIntro' resources, which include open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of the Windows operating system.
Maintained by Mine Çetinkaya-Rundel. Last updated 3 months ago.
240 stars 11.39 score 6.0k scripts
pdil
usmap: US Maps Including Alaska and Hawaii
Obtain United States map data frames of varying region types (e.g. county, state). The map data frames include Alaska and Hawaii conveniently placed to the bottom left, as they appear in most maps of the US. Convenience functions for plotting choropleths, visualizing spatial data, and working with FIPS codes are also provided.
Maintained by Paolo Di Lorenzo. Last updated 3 months ago.
counties, data, fips, geodata, mapping, states, usa
75 stars 10.89 score 1.7k scripts 2 dependents
ropensci
geojsonio: Convert Data from and to 'GeoJSON' or 'TopoJSON'
Convert data to 'GeoJSON' or 'TopoJSON' from various R classes, including vectors, lists, data frames, shape files, and spatial classes. 'geojsonio' does not aim to replace packages like 'sp', 'rgdal', 'rgeos', but rather aims to be a high level client to simplify conversions of data from and to 'GeoJSON' and 'TopoJSON'.
Maintained by Michael Mahoney. Last updated 1 year ago.
geojson, topojson, geospatial, conversion, data, input-output, io
151 stars 10.83 score 2.9k scripts 13 dependents
kwstat
agridat: Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 1 month ago.
126 stars 10.78 score 1.7k scripts 1 dependent
ropensci
geojson: Classes for 'GeoJSON'
Classes for 'GeoJSON' to make working with 'GeoJSON' easier. Includes S3 classes for 'GeoJSON' classes with brief summary output, and a few methods such as extracting and adding bounding boxes, properties, and coordinate reference systems; working with newline delimited 'GeoJSON'; and serializing to/from 'Geobuf' binary 'GeoJSON' format.
Maintained by Michael Sumner. Last updated 2 years ago.
geojson, geospatial, conversion, data, input-output, bbox, polygon, geobuf, crs, ndgeojson, spatial
32 stars 10.56 score 166 scripts 14 dependents
ropensci
rebird: R Client for the eBird Database of Bird Observations
A programmatic client for the eBird database (<https://ebird.org/home>), including functions for searching for bird observations by geographic location (latitude, longitude), eBird hotspots, location identifiers, by notable sightings, by region, and by taxonomic name.
Maintained by Sebastian Pardo. Last updated 2 months ago.
birds, birding, ebird, database, data, biology, observations, sightings, ornithology, ebird-api, ebird-webservices, spocc
90 stars 10.43 score 73 scripts 6 dependents
ropensci
spocc: Interface to Species Occurrence Data Sources
A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility ('GBIF'), 'iNaturalist', 'eBird', Integrated Digitized 'Biocollections' ('iDigBio'), 'VertNet', Ocean 'Biogeographic' Information System ('OBIS'), and Atlas of Living Australia ('ALA'). Includes functionality for retrieving species occurrence data, and combining those data.
Maintained by Hannah Owens. Last updated 2 months ago.
specimens, api, web-services, occurrences, species, taxonomy, gbif, inat, vertnet, ebird, idigbio, obis, ala, antweb, bison, data, ecoengine, inaturalist, occurrence, species-occurrence, spocc
118 stars 10.09 score 552 scripts 5 dependents
ropensci
charlatan: Make Fake Data
Make fake data that looks realistic, supporting addresses, person names, dates, times, colors, coordinates, currencies, digital object identifiers ('DOIs'), jobs, phone numbers, 'DNA' sequences, doubles and integers from distributions and within a range.
Maintained by Roel M. Hogervorst. Last updated 2 months ago.
data, dataset, fake-data, faker, peer-reviewed
296 stars 10.06 score 180 scripts 1 dependent
iqss
dataverse: Client for Dataverse 4+ Repositories
Provides access to Dataverse APIs <https://dataverse.org/> (versions 4-5), enabling data search, retrieval, and deposit. For Dataverse versions <= 3.0, use the archived 'dvn' package <https://cran.r-project.org/package=dvn>.
Maintained by Shiro Kuriwaki. Last updated 5 months ago.
data, data-deposit, dataverse, dataverse-api, sword
61 stars 9.98 score 217 scripts 4 dependents
hadley
babynames: US Baby Names 1880-2017
US baby names provided by the SSA. This package contains all names used for at least 5 children of either sex.
Maintained by Hadley Wickham. Last updated 4 years ago.
134 stars 9.82 score 1.9k scripts 4 dependents
julianfaraway
faraway: Datasets and Functions for Books by Julian Faraway
The books are "Linear Models with R" (CRC Press; 1st ed. August 2004, 2nd ed. July 2014, 3rd ed. February 2025; ISBN 9781439887332), "Extending the Linear Model with R" (CRC Press; 1st ed. December 2005, 2nd ed. March 2016; ISBN 9781584884248), and "Practical Regression and ANOVA in R", contributed documentation on CRAN (now very dated).
Maintained by Julian Faraway. Last updated 2 months ago.
29 stars 9.43 score 1.7k scripts 1 dependent
ropensci
stats19: Work with Open Road Traffic Casualty Data from Great Britain
Tools to help download, process and analyse the UK road collision data collected using the 'STATS19' form. The datasets are provided as 'CSV' files with detailed road safety information about the circumstances of car crashes and other incidents on the roads resulting in casualties in Great Britain from 1979 to present. Tables are available on 'collisions' with the circumstances (e.g. speed limit of road), information about 'vehicles' involved (e.g. type of vehicle), and 'casualties' (e.g. age). The statistics relate only to events on public roads that were reported to the police, and subsequently recorded, using the 'STATS19' collision reporting form. See the Department for Transport website <https://www.data.gov.uk/dataset/cb7ae6f0-4be6-4935-9277-47e5ce24a11f/road-accidents-safety-data> for more information on these datasets. The package is described in a paper in the Journal of Open Source Software (Lovelace et al. 2019) <doi:10.21105/joss.01181>. See Gilardi et al. (2022) <doi:10.1111/rssa.12823>, Vidal-Tortosa et al. (2021) <doi:10.1016/j.jth.2021.101291>, and Tait et al. (2023) <doi:10.1016/j.aap.2022.106895> for examples of how the data can be used for methodological and empirical road safety research.
Maintained by Robin Lovelace. Last updated 2 months ago.
stats19, road-safety, transport, car-crashes, ropensci, data
64 stars 9.20 score 193 scripts
debruine
faux: Simulation for Factorial Designs
Create datasets with factorial structure through simulation by specifying variable parameters. Extended documentation at <https://debruine.github.io/faux/>. Described in DeBruine (2020) <doi:10.5281/zenodo.2669586>.
Maintained by Lisa DeBruine. Last updated 2 months ago.
98 stars 9.14 score 716 scripts 1 dependent
ludvigolsen
groupdata2: Creating Groups from Data
Methods for dividing data into groups. Create balanced partitions and cross-validation folds. Perform time series windowing and general grouping and splitting of data. Balance existing groups with up- and downsampling or collapse them to fewer groups.
Maintained by Ludvig Renbo Olsen. Last updated 3 months ago.
balance, cross-validation, data, data-frame, fold, group-factor, groups, participants, partition, split, staircase
27 stars 9.04 score 338 scripts 7 dependents
sboysel
fredr: An R Client for the 'FRED' API
An R client for the 'Federal Reserve Economic Data' ('FRED') API <https://research.stlouisfed.org/docs/api/>. Functions to retrieve economic time series and other data from 'FRED'.
Maintained by Sam Boysel. Last updated 4 years ago.
api, client, data, economic, federal-reserve, fred, fred-api, fred-series, fredr
93 stars 8.95 score 700 scripts
ropensci
opentripplanner: Set up and connect to 'OpenTripPlanner'
Set up and connect to 'OpenTripPlanner' (OTP) <http://www.opentripplanner.org/>. OTP is an open source platform for multi-modal and multi-agency journey planning written in 'Java'. The package allows you to manage a local version or connect to a remote OTP server to find walking, cycling, driving, or transit routes. This package has been peer-reviewed by rOpenSci (v. 0.2.0.0).
Maintained by Malcolm Morgan. Last updated 3 months ago.
data, isochrones, java, opentripplanner, otp, public-transport, routing, transport, transportation-planning
83 stars 8.94 score 147 scripts
usa-npn
rnpn: Interface to the National 'Phenology' Network 'API'
Programmatic interface to the Web Service methods provided by the National 'Phenology' Network (<https://usanpn.org/>), which includes data on various life history events that occur at specific times.
Maintained by Jeff Switzer. Last updated 3 days ago.
data, national-phenology-network, phenology, species, web-api
21 stars 8.91 score 109 scripts
wjakethompson
taylor: Lyrics and Song Data for Taylor Swift's Discography
A comprehensive resource for data on Taylor Swift songs. Data is included for all officially released studio albums, extended plays (EPs), and individual singles. Data comes from 'Genius' (lyrics) and 'Spotify' (song characteristics). Additional functions are included for easily creating data visualizations with color palettes inspired by Taylor Swift's album covers.
Maintained by W. Jake Thompson. Last updated 2 months ago.
color-palettes, data, genius-lyrics, ggplot2-themes, lyrics, spotify, spotify-api, taylor-swift
45 stars 8.79 score 105 scripts
epiverse-trace
linelist: Tagging and Validating Epidemiological Data
Provides tools to help store and handle case line list data. The 'linelist' class adds a tagging system to classical 'data.frame' objects to identify key epidemiological data such as dates of symptom onset, epidemiological case definition, age, gender or disease outcome. Once tagged, these variables can be seamlessly used in downstream analyses, making data pipelines more robust and reliable.
Maintained by Hugo Gruson. Last updated 3 days ago.
data, data-structures, epidemiology, epiverse, outbreaks, sdg-3, structured-data
7 stars 8.69 score 61 scripts 2 dependents
ropensci
ckanr: Client for the Comprehensive Knowledge Archive Network ('CKAN') API
Client for 'CKAN' API (<https://ckan.org/>). Includes interface to 'CKAN' 'APIs' for search, list, show for packages, organizations, and resources. In addition, provides an interface to the 'datastore' API.
Maintained by Francisco Alves. Last updated 2 years ago.
database, open-data, ckan, api, data, dataset, api-wrapper, ckan-api
98 stars 8.60 score 448 scripts 4 dependents
robjhyndman
fpp2: Data for "Forecasting: Principles and Practice" (2nd Edition)
All data sets required for the examples and exercises in the book "Forecasting: principles and practice" (2nd ed, 2018) by Rob J Hyndman and George Athanasopoulos <https://otexts.com/fpp2/>. All packages required to run the examples are also loaded.
Maintained by Rob Hyndman. Last updated 2 years ago.
106 stars 8.57 score 1.8k scripts 1 dependent
robjhyndman
fpp3: Data for "Forecasting: Principles and Practice" (3rd Edition)
All data sets required for the examples and exercises in the book "Forecasting: principles and practice" by Rob J Hyndman and George Athanasopoulos <https://OTexts.com/fpp3/>. All packages required to run the examples are also loaded. Additional data sets not used in the book are also included.
Maintained by Rob Hyndman. Last updated 6 months ago.
142 stars 8.54 score 2.5k scripts
ropensci
weatherOz: An API Client for Australian Weather and Climate Data Resources
Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). Also supports the Australian Government Bureau of Meteorology ('BOM') precis and coastal forecasts, and downloading and importing radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology ('BOM') and accessed through 'SILO' (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.
Maintained by Rodrigo Pires. Last updated 1 month ago.
dpird, bom, meteorological-data, weather-forecast, australia, weather, weather-data, meteorology, western-australia, australia-bureau-of-meteorology, western-australia-agriculture, australia-agriculture, australia-climate, australia-weather, api-client, climate, data, rainfall, weather-api
31 stars 8.47 score 40 scripts
ropensci
parzer: Parse Messy Geographic Coordinates
Parse messy geographic coordinates from various character formats to decimal degree numeric values. Parse coordinates into their parts (degree, minutes, seconds); calculate hemisphere from coordinates; pull out individually degrees, minutes, or seconds; add and subtract degrees, minutes, and seconds. C++ code herein originally inspired by code written by Jeffrey D. Bogan, but then completely re-written.
Maintained by Alban Sagouis. Last updated 2 months ago.
geospatial, data, latitude, longitude, parser, coordinates, geo, cpp
65 stars 8.45 score 162 scripts 3 dependents
ropensci
hoardr: Manage Cached Files
Suite of tools for managing cached files, targeting use in other R packages. Uses 'rappdirs' for cross-platform paths. Provides utilities to manage cache directories, including targeting files by path or by key; cached directories can be compressed and uncompressed easily to save disk space.
Maintained by Tamás Stirling. Last updated 2 months ago.
25 stars 8.38 score 6 scripts 37 dependents
chuhousen
amerifluxr: Interface to 'AmeriFlux' Data Services
Programmatic interface to the 'AmeriFlux' database (<https://ameriflux.lbl.gov/>). Provides query, download, and data summary tools.
Maintained by Housen Chu. Last updated 3 months ago.
ameriflux, api, carbon-flux, data, time-series
22 stars 8.36 score 29 scripts 15 dependents
oobianom
shinyStorePlus: Secure in-Browser and Database Storage for 'shiny' Inputs, Outputs, Views and User Likes
Store persistent and synchronized data from 'shiny' inputs within the browser. Refresh 'shiny' applications and preserve user-inputs over multiple sessions. A database-like storage format is implemented using 'Dexie.js' <https://dexie.org>, a minimal wrapper for 'IndexedDB'. Transfer browser link parameters to 'shiny' input or output values. Store app visitor views, likes and followers.
Maintained by Obinna Obianom. Last updated 1 month ago.
28 stars 8.29 score 93 scripts 1 dependent
ropenspain
climaemet: Climate AEMET Tools
Tools to download the climatic data of the Spanish Meteorological Agency (AEMET) directly from R using their API and create scientific graphs (climate charts, trend analysis of climate time series, temperature and precipitation anomalies maps, warming stripes graphics, climatograms, etc.).
Maintained by Diego Hernangómez. Last updated 4 days ago.
aemet, climate, data, forecast-api, ropenspain, science, spain, weather-api
42 stars 8.25 score 59 scripts
nceas
metajam: Easily Download Data and Metadata from 'DataONE'
A set of tools to foster the development of reproducible analytical workflows by simplifying the download of data and metadata from 'DataONE' (<https://www.dataone.org>) and easily importing this information into R.
Maintained by Julien Brun. Last updated 7 months ago.
data, data-analysis, metadata, repositories
16 stars 8.21 score 75 scripts
ropensci
ghql: General Purpose 'GraphQL' Client
A 'GraphQL' client, with an R6 interface for initializing a connection to a 'GraphQL' instance, and methods for constructing queries, including fragments and parameterized queries. Queries are checked with the 'libgraphqlparser' C++ parser via the 'graphql' package.
Maintained by Mark Padgham. Last updated 2 years ago.
http, api, web-services, curl, data, graphql, graphql-api, graphql-client
148 stars 8.12 score 111 scripts 5 dependents
bayes-rules
bayesrules: Datasets and Supplemental Functions from Bayes Rules! Book
Provides datasets and functions used for analysis and visualizations in the Bayes Rules! book (<https://www.bayesrulesbook.com>). The package contains a set of functions that summarize and plot Bayesian models from some conjugate families and another set of functions for evaluation of some Bayesian models.
Maintained by Mine Dogucu. Last updated 3 years ago.
72 stars 8.06 score 466 scripts
jennybc
repurrrsive: Examples of Recursive Lists and Nested or Split Data Frames
Recursive lists in the form of R objects, 'JSON', and 'XML', for use in teaching and examples. Examples include color palettes, Game of Thrones characters, 'GitHub' users and repositories, music collections, and entities from the Star Wars universe. Data from the 'gapminder' package is also included, as a simple data frame and in nested and split forms.
Maintained by Jennifer Bryan. Last updated 2 years ago.
134 stars 8.04 score 612 scripts
ropenspain
spanishoddata: Get Spanish Origin-Destination Data
Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the 'duckdb' database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.
Maintained by Egor Kotov. Last updated 8 days ago.
cdr, data, data-package, mobile-telephone-data, mobility, origin-destination
35 stars 7.92 score 14 scripts
oobianom
quickcode: Quick and Essential 'R' Tricks for Better Scripts
The NOT functions, 'R' tricks, and a compilation of simple, quick, and often-used 'R' code snippets to improve your scripts. Improve the quality and reproducibility of 'R' scripts.
Maintained by Obinna Obianom. Last updated 26 days ago.
5 stars 7.76 score 7 scripts 6 dependents
ropensci
ruODK: An R Client for the ODK Central API
Access and tidy up data from the 'ODK Central' API. 'ODK Central' is a clearinghouse for digitally captured data using ODK <https://docs.getodk.org/central-intro/>. It manages user accounts and permissions, stores form definitions, and allows data collection clients like 'ODK Collect' to connect to it for form download and submission upload. The 'ODK Central' API is documented at <https://docs.getodk.org/central-api/>.
Maintained by Florian W. Mayer. Last updated 5 months ago.
database, open-data, odk, api, data, dataset, odata, odata-client, odk-central, opendatakit
42 stars 7.73 score 57 scripts 1 dependent
ropensci
rdataretriever: R Interface to the Data Retriever
Provides an R interface to the Data Retriever <https://retriever.readthedocs.io/en/latest/> via the Data Retriever's command line interface. The Data Retriever automates the tasks of finding, downloading, and cleaning public datasets, and then stores them in a local database.
Maintained by Henry Senyondo. Last updated 8 months ago.
data, data-science, database, datasets, science
46 stars 7.70 score 36 scripts
erictleung
pixarfilms: Pixar Films and Achievements
Data about Disney Pixar films provided by Wikipedia. This package contains data about the films, the people involved, and their awards.
Maintained by Eric Leung. Last updated 5 days ago.
data, data-science, datapackage, disney, imdb, imdb-dataset, pixar, pixar-films, web-scraping, wikipedia
20 stars 7.48 score 23 scripts 1 dependent
ropensci
dataspice: Create Lightweight Schema.org Descriptions of Data
The goal of 'dataspice' is to make it easier for researchers to create basic, lightweight, and concise metadata files for their datasets. These basic files can then be used to make useful information available during analysis, create a helpful dataset "README" webpage, and produce more complex metadata formats to aid dataset discovery. Metadata fields are based on the 'Schema.org' and 'Ecological Metadata Language' standards.
Maintained by Bryce Mecum. Last updated 4 years ago.
data, dataset, metadata, schema-org, unconf, unconf18
162 stars 7.45 score 25 scripts
capitalone
dataCompareR: Compare Two Data Frames and Summarise the Difference
Easy comparison of two tabular data objects in R. Specifically designed to show differences between two sets of data in a useful way that should make it easier to understand the differences, and if necessary, help you work out how to remedy them. Aims to offer a more useful output than all.equal() when your two data sets do not match, but isn't intended to replace all.equal() as a way to test for equality.
Maintained by Sarah Johnston. Last updated 2 years ago.
compare-data, data, data-analysis, data-science
76 stars 7.24 score 76 scripts
ropensci
bowerbird: Keep a Collection of Sparkly Data Resources
Tools to get and maintain a data repository from third-party data providers.
Maintained by Ben Raymond. Last updated 17 days ago.
ropensci, antarctic, southern ocean, data, environmental, satellite, climate, peer-reviewed
50 stars 7.16 score 16 scripts 1 dependent
ropensci
rsnps: Get 'SNP' ('Single-Nucleotide' 'Polymorphism') Data on the Web
A programmatic interface to various 'SNP' 'datasets' on the web: 'OpenSNP' (<https://opensnp.org>), and 'NCBI's 'dbSNP' database (<https://www.ncbi.nlm.nih.gov/projects/SNP/>). Functions are included for searching 'NCBI'. For 'OpenSNP', functions are included for getting 'SNPs', and data for 'genotypes', 'phenotypes', annotations, and bulk downloads of data by user.
Maintained by Julia Gustavsen. Last updated 2 years ago.
gene, snp, sequence, api, web, api-client, species, dbsnp, opensnp, ncbi, genotype, data, snps, web-api
53 stars 7.08 score 63 scripts 1 dependent
edwindj
daff: Diff, Patch and Merge for Data.frames
Diff, patch and merge for data frames. Document changes in data sets and use them to apply patches. Changes to data can be made visible by using render_diff(). The 'V8' package is used to wrap the 'daff.js' 'JavaScript' library which is included in the package.
Maintained by Edwin de Jonge. Last updated 1 year ago.
152 stars 6.99 score 133 scripts
ipeagit
flightsbr: Download Flight and Airport Data from Brazil
Download flight and airport data from Brazil’s Civil Aviation Agency (ANAC) <https://www.gov.br/anac/pt-br>. The data covers detailed information on aircraft, airports, and airport operations registered with ANAC. It also includes data on airfares, all international flights to and from Brazil, and domestic flights within the country.
Maintained by Rafael H. M. Pereira. Last updated 2 months ago.
41 stars 6.94 score 20 scripts
openintrostat
usdata: Data on the States and Counties of the United States
Demographic data on the United States at the county and state levels spanning multiple years.
Maintained by Mine Çetinkaya-Rundel. Last updated 10 months ago.
9 stars 6.89 score 294 scripts 1 dependent
appsilon
data.validator: Automatic Data Validation and Reporting
Validate a dataset by columns and rows using convenient predicates inspired by the 'assertr' package. Generate a good-looking HTML report or print console output to display in the logs of your data processing pipeline.
Maintained by Marcin Dubel. Last updated 12 months ago.
data, reporting, rhinoverse, rstudio, validation
147 stars 6.67 score 40 scripts
rsquaredacademy
xplorerr: Tools for Interactive Data Exploration
Tools for interactive data exploration built using 'shiny'. Includes apps for descriptive statistics, visualizing probability distributions, inferential statistics, linear regression, logistic regression and RFM analysis.
Maintained by Aravind Hebbali. Last updated 5 months ago.
data, exploration, shiny-apps, statistics, visualization, cpp
38 stars 6.62 score 11 scripts 6 dependents
pdil
usmapdata: Mapping Data for 'usmap' Package
Provides a container for data used by the 'usmap' package. The data used by 'usmap' has been extracted into this package so that the file size of the 'usmap' package can be reduced greatly. The data in this package will be updated roughly once per year as new map data files are provided by the US Census Bureau.
Maintained by Paolo Di Lorenzo. Last updated 23 days ago.
counties, data, fips, mapping, states, usa
5 stars 6.59 score 35 scripts 3 dependents
politicaargentina
geoAr: Argentina's Spatial Data Toolbox
Collection of tools that facilitates data access and workflow for spatial analysis of Argentina. Includes historical information from censuses, administrative limits at different levels of aggregation, location of human settlements, among others. Since it is expected that the majority of users will be Spanish-speaking, the documentation of the package prioritizes this language, although an effort is made to also offer annotations in English.
Maintained by Juan Pablo Ruiz Nicolini. Last updated 1 year ago.
15 stars 6.55 score 78 scripts
spsanderson
healthyR.data: Data Only Package to 'healthyR'
Provides data for functions typically used in the 'healthyR' package.
Maintained by Steven Sanderson. Last updated 2 months ago.
data, data-science, data-sets, healthcare, healthcare-analysis, healthcare-application, healthcare-datasets
10 stars 6.52 score 105 scripts 1 dependent
katilingban
ppitables: Lookup Tables to Generate Poverty Likelihoods and Rates using the Poverty Probability Index (PPI)
The Poverty Probability Index (PPI) is a poverty measurement tool for organizations and businesses with a mission to serve the poor. The PPI is statistically-sound, yet simple to use: the answers to 10 questions about a household's characteristics and asset ownership are scored to compute the likelihood that the household is living below the poverty line - or above by only a narrow margin. This package contains country-specific lookup data tables used as reference to determine the poverty likelihood of a household based on their score from the country-specific PPI questionnaire. These lookup tables have been extracted from documentation of the PPI found at <https://www.povertyindex.org> and managed by Innovations for Poverty Action <https://poverty-action.org/>.
Maintained by Ernest Guevarra. Last updated 16 days ago.
data, poverty, poverty-likelihoods, poverty-probability, ppi
6 stars 6.43 score 89 scripts
cmstatr
cmstatr: Statistical Methods for Composite Material Data
An implementation of the statistical methods commonly used for advanced composite materials in aerospace applications. This package focuses on calculating basis values (lower tolerance bounds) for material strength properties, as well as performing the associated diagnostic tests. This package provides functions for calculating basis values assuming several different distributions, as well as providing functions for non-parametric methods of computing basis values. Functions are also provided for testing the hypothesis that there is no difference between strength and modulus data from an alternate sample and that from a "qualification" or "baseline" sample. For a discussion of these statistical methods and their use, see the Composite Materials Handbook, Volume 1 (2012, ISBN: 978-0-7680-7811-4). Additional details about this package are available in the paper by Kloppenborg (2020, <doi:10.21105/joss.02265>).
Maintained by Stefan Kloppenborg. Last updated 10 days ago.
composite-material-data, data, materials-science, statistical-analysis, statistics
4 stars 6.36 score 23 scripts
ropensci
pangaear: Client for the 'Pangaea' Database
Tools to interact with the 'Pangaea' Database (<https://www.pangaea.de>), including functions for searching for data, fetching 'datasets' by 'dataset' 'ID', and working with the 'Pangaea' 'OAI-PMH' service.
Maintained by Scott Chamberlain. Last updated 2 years ago.
pangaea, environmental science, earth science, archive, paleontology, ecology, chemistry, atmosphere, api-client, data, paleobiology, scientific, webservice-client
21 stars 6.27 score 29 scripts
saschagobel
legislatoR:Interface to the Comparative Legislators Database
Facilitates access to the Comparative Legislators Database (CLD). The CLD includes political, sociodemographic, career, online presence, public attention, and visual information for over 67,000 contemporary and historical politicians from 16 countries.
Maintained by Sascha Goebel. Last updated 1 year ago.
data, dataset, legislators, parliament, political-science, politicians, politics, wikipedia
96 stars 6.24 score 36 scripts
bioc
rpx:R Interface to the ProteomeXchange Repository
The rpx package implements an interface to proteomics data submitted to the ProteomeXchange consortium.
Maintained by Laurent Gatto. Last updated 2 months ago.
immunooncology, proteomics, massspectrometry, dataimport, thirdpartyclient, bioconductor, data, mass-spectrometry, proteomexchange
5 stars 6.20 score 21 scripts
jrdnbradford
readMDTable:Read Markdown Tables into Tibbles
Efficient reading of raw markdown tables into tibbles. Designed to accept content from strings, files, and URLs with the ability to extract and read multiple tables from markdown for analysis.
Maintained by Jordan Bradford. Last updated 2 months ago.
data, data-analysis, data-analytics, data-extraction, data-mining, data-science, markdown, markdown-parser, markdown-table, r-programming
7 stars 6.10 score 3 scripts 1 dependent
ivan-rivera
RedditExtractoR:Reddit Data Extraction Toolkit
A collection of tools for extracting structured data from <https://www.reddit.com/>.
Maintained by Ivan Rivera. Last updated 2 years ago.
93 stars 6.02 score 153 scripts
pepijn-devries
CopernicusMarine:Search Download and Handle Data from Copernicus Marine Service Information
Subset and download data from EU Copernicus Marine Service Information: <https://data.marine.copernicus.eu>. Import data on the oceans' physical and biogeochemical state from Copernicus into R without the need for external software.
Maintained by Pepijn de Vries. Last updated 3 months ago.
29 stars 5.94 score 20 scripts 2 dependents
dickoa
robotoolbox:Client for the 'KoboToolbox' API
Suite of utilities for accessing and manipulating data from the 'KoboToolbox' API. 'KoboToolbox' is a robust platform designed for field data collection in various disciplines. This package aims to simplify the process of fetching and handling data from the API. Detailed documentation for the 'KoboToolbox' API can be found at <https://support.kobotoolbox.org/api.html>.
Maintained by Ahmadou Dicko. Last updated 3 months ago.
open-data, kobotoolbox, odk, kpi, api, data, dataset
5.86 score 48 scripts
ineelhere
clintrialx:Connect and Work with Clinical Trials Data Sources
Are you spending too much time fetching and managing clinical trial data? Struggling with complex queries and bulk data extraction? What if you could simplify this process with just a few lines of code? Introducing 'clintrialx' - Fetch clinical trial data from sources like 'ClinicalTrials.gov' <https://clinicaltrials.gov/> and the 'Clinical Trials Transformation Initiative - Access to Aggregate Content of ClinicalTrials.gov' database <https://aact.ctti-clinicaltrials.org/>, supporting pagination and bulk downloads. Also, you can generate HTML reports based on the data obtained from the sources!
Maintained by Indraneel Chakraborty. Last updated 17 days ago.
aact, bioinformatics, clinical-data, clinical-trials, clinicaltrialsgov, ctti, data, data-management, medical-informatics, r-language, trials
15 stars 5.76 score 11 scripts
bradleyboehmke
completejourney:Retail Shopping Data
Retail shopping transactions for 2,469 households over one year. Originates from the 84.51° Complete Journey 2.0 source files <https://www.8451.com/area51>, which also include useful metadata on products, coupons, campaigns, and promotions.
Maintained by Brad Boehmke. Last updated 5 years ago.
21 stars 5.70 score 42 scripts
epiforecasts
covidregionaldata:Subnational Data for COVID-19 Epidemiology
An interface to subnational and national level COVID-19 data sourced from both official sources, such as Public Health England in the UK, and from other COVID-19 data collections, including the World Health Organisation (WHO), European Centre for Disease Prevention and Control (ECDC), Johns Hopkins University (JHU), Google Open Data and others. Designed to streamline COVID-19 data extraction, cleaning, and processing from a range of data sources in an open and transparent way. This allows users to inspect and scrutinise the data, and tools used to process it, at every step. For all countries supported, data includes a daily time-series of cases. Wherever available, data are also provided for deaths, hospitalisations, and tests. National level data are also supported using a range of sources.
Maintained by Sam Abbott. Last updated 3 years ago.
covid-19, data, open-science, r6, regional-data
37 stars 5.67 score 121 scripts
ropensci
rdryad:Access for Dryad Web Services
Interface to the Dryad "Solr" API and their "OAI-PMH" service, with functions to fetch datasets. Dryad (<https://datadryad.org/>) is a curated host of data underlying scientific publications.
Maintained by Scott Chamberlain. Last updated 2 years ago.
26 stars 5.64 score 48 scripts
ramikrispin
USgas:The Demand for Natural Gas in the US
Provides an overview of the demand for natural gas in the US by state and country level. Data source: US Energy Information Administration <https://www.eia.gov/>.
Maintained by Rami Krispin. Last updated 2 years ago.
9 stars 5.56 score 41 scripts
piersyork
owidR:Import Data from Our World in Data
Import data from 'Our World in Data', an organisation which publishes research and data on global economic and social issues.
Maintained by Piers York. Last updated 1 year ago.
data, data-visualisation, economics
117 stars 5.49 score 53 scripts
leonawicz
rtrek:Data Analysis Relating to Star Trek
Provides datasets related to the Star Trek fictional universe and functions for working with the data. The package also provides access to real world datasets based on the televised series and other related licensed media productions. It interfaces with the Star Trek API (STAPI) (<http://stapi.co/>), Memory Alpha (<https://memory-alpha.fandom.com/wiki/Portal:Main>), and Memory Beta (<https://memory-beta.fandom.com/wiki/Main_Page>) to retrieve data, metadata and other information relating to Star Trek. It also contains several local datasets covering a variety of topics. The package also provides functions for working with data from other Star Trek-related R data packages containing larger datasets not stored in 'rtrek'.
Maintained by Matthew Leonawicz. Last updated 7 months ago.
54 stars 5.42 score 49 scripts
nareal
frenchdata:Download Data Sets from Kenneth's French Finance Data Library Site
Downloads data sets from Kenneth French's finance data library site <http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html> and reads all the data subsets from the file. Allows R users to collect the data as 'tidyverse'-ready data frames.
Maintained by Nelson Areal. Last updated 1 year ago.
12 stars 5.40 score 42 scripts
mlr-org
mlr3oml:Connector Between 'mlr3' and 'OpenML'
Provides an interface to 'OpenML.org' to list and download machine learning data, tasks and experiments. The 'OpenML' objects can be automatically converted to 'mlr3' objects. For a more sophisticated interface with more upload options, see the 'OpenML' package.
Maintained by Sebastian Fischer. Last updated 10 months ago.
data, data-science, datasets, machine-learning, mlr3, openml, cpp
7 stars 5.37 score 105 scripts
mlr-org
mlr3data:Collection of Machine Learning Data Sets for 'mlr3'
A small collection of interesting and educational machine learning data sets which are used as examples in the 'mlr3' book (<https://mlr3book.mlr-org.com>), the use case gallery (<https://mlr3gallery.mlr-org.com>), or in other examples. All data sets are properly preprocessed and ready to be analyzed by most machine learning algorithms. Data sets are automatically added to the dictionary of tasks if 'mlr3' is loaded.
Maintained by Marc Becker. Last updated 5 months ago.
data, data-science, data-sets, machine-learning, mlr3
2 stars 5.28 score 18 scripts 2 dependents
ropensci
dataaimsr:AIMS Data Platform API Client
AIMS Data Platform API Client which provides easy access to AIMS Data Platform scientific data and information.
Maintained by Diego R. Barneche. Last updated 2 years ago.
aims, australia, data, marine, monitoring, sst, weather
4 stars 5.11 score 54 scripts
openintrostat
cherryblossom:Cherry Blossom Run Race Results
Race results of the Cherry Blossom Run, which is an annual road race that takes place in Washington, DC.
Maintained by Mine Çetinkaya-Rundel. Last updated 1 year ago.
6 stars 5.11 score 28 scripts 1 dependent
mine-cetinkaya-rundel
ukbabynames:UK Baby Names Data
Full listing of UK baby names occurring more than three times per year between 1974 and 2020, and rankings of baby name popularity by decade from 1904 to 1994.
Maintained by Mine Çetinkaya-Rundel. Last updated 3 years ago.
21 stars 5.03 score 34 scripts
dalekube
hR:Better Data Engineering in Human Resources
Methods for data engineering in the human resources (HR) corporate domain. Designed for HR analytics practitioners and workforce-oriented data sets.
Maintained by Dale Kube. Last updated 13 days ago.
analytics, data, data-engineering, data-science, human-resources
21 stars 5.02 score 8 scripts
panukatan
bagyo:Philippine Tropical Cyclones Data
The Philippines frequently experiences tropical cyclones (called 'bagyo' in the Filipino language) because of its geographical position. These cyclones typically bring heavy rainfall, leading to widespread flooding, as well as strong winds that cause significant damage to human life, crops, and property. Data on cyclones are collected and curated by the Philippine Atmospheric, Geophysical, and Astronomical Services Administration or 'PAGASA' and made available through its website <https://bagong.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report>. This package contains Philippine tropical cyclones data in a machine-readable format. It is hoped that this data package provides an interesting and unique dataset for data exploration and visualisation.
Maintained by Ernest Guevarra. Last updated 8 months ago.
2 stars 5.00 score 6 scripts
ropensci
rdatacite:Client for the 'DataCite' API
Client for the web service methods provided by 'DataCite' (<https://www.datacite.org/>), including functions to interface with their 'RESTful' search API. The API is backed by 'Elasticsearch', allowing expressive queries, including faceting.
Maintained by Bianca Kramer. Last updated 2 years ago.
data, scholarly, dataset, https, api, web-services, api-wrapper, datacite, identifier, metadata, oai-pmh, solr
25 stars 4.99 score 26 scripts
denironyx
overturemapsr:Download Overture Maps Data in R
Overture Maps offers free and open geospatial map data sourced from various providers and standardized to a common schema. This tool allows you to download Overture Maps data for a specific region of interest and convert it to several different file formats. For more information, visit <https://overturemaps.org/download/>.
Maintained by Dennis Irorere. Last updated 1 month ago.
data, geospatial, location, opendata, osm, osmdata, overturemaps
16 stars 4.83 score 14 scripts
leeper
csvy:Import and Export CSV Data with a YAML Metadata Header
Support for import from and export to the CSVY file format. CSVY is a file format that combines the simplicity of CSV (comma-separated values) with the metadata of other plain text and binary formats (JSON, XML, Stata, etc.) by placing a YAML header on top of a regular CSV.
Maintained by Thomas J. Leeper. Last updated 7 years ago.
59 stars 4.77 score 7 scripts
valerivoev
danstat:R Client for the Statistics Denmark Databank API
The purpose of the package is to provide an R function interface to the Statistics Denmark Databank API, mainly for research purposes. The Statistics Denmark Databank API has four endpoints; for more information, and to test the API in their console, see <https://www.dst.dk/en/Statistik/brug-statistikken/muligheder-i-statistikbanken/api>. This package mimics the structure of the API and provides four main functions to match the functionality of the API endpoints.
Maintained by Valeri Voev. Last updated 3 years ago.
6 stars 4.73 score 18 scripts
aberhrml
metaboData:Example Metabolomics Data Sets
Data sets from a variety of biological sample matrices, analysed using a number of mass spectrometry based metabolomic analytical techniques. The example data sets are stored remotely using GitHub releases <https://github.com/aberHRML/metaboData/releases> which can be accessed from R using the package. The package also includes the 'abr1' FIE-MS data set from the 'FIEmspro' package <https://users.aber.ac.uk/jhd/> <doi:10.1038/nprot.2007.511>.
Maintained by Jasen Finch. Last updated 3 years ago.
data, datasets, fie-hrms, fie-ms, mass-spectrometry, metabolomics
1 star 4.72 score 104 scripts
robjhyndman
tscompdata:Time series data from various forecasting competitions
Time series data from the following forecasting competitions are provided: M, M3, NN3, NN5, NNGC1, Tourism, and GEFCom2012.
Maintained by Rob Hyndman. Last updated 2 years ago.
18 stars 4.69 score 18 scripts
globalgov
manydata:A Portal for Global Governance Data
This is the core package for the many packages universe. It includes functions to help researchers work with and contribute to event datasets on global governance.
Maintained by James Hollway. Last updated 8 days ago.
9 stars 4.65 score
mps9506
rATTAINS:Access EPA 'ATTAINS' Data
An R interface to the United States Environmental Protection Agency (EPA) Assessment, Total Maximum Daily Load (TMDL) Tracking and Implementation System ('ATTAINS') data. 'ATTAINS' is the EPA database used to track information provided by states about water quality assessments conducted under federal Clean Water Act requirements. ATTAINS information and API information are available at <https://www.epa.gov/waterdata/attains>.
Maintained by Michael Schramm. Last updated 2 years ago.
8 stars 4.60 score 10 scripts
nik01010
openbankeR:R Client for Querying the UK 'Open Banking' ('Open Data') API
Creates a client with queries for the UK 'Open Banking' ('Open Data') API. API wrapper around <https://openbankinguk.github.io/opendata-api-docs-pub>.
Maintained by Nik Lilovski. Last updated 3 years ago.
bank, client, data, openbanking, openbanking-api, opendata, opendata-api
6 stars 4.48 score 6 scripts
gavinrozzi
njtr1:Download, Analyze & Clean New Jersey Car Crash Data
Download and analyze motor vehicle crash data released by the New Jersey Department of Transportation (NJDOT). The data in this package are collected through the filing of NJTR-1 forms by police officers, which provide a standardized way of documenting a motor vehicle crash that occurred in New Jersey. Three different data tables containing data on crashes, vehicles & pedestrians released from 2001 to the present can be downloaded & cleaned using this package.
Maintained by Gavin Rozzi. Last updated 1 year ago.
njtr1, new-jersey, road-safety, car-crashes, car-accidents, data
5 stars 4.40 score 7 scripts
openintrostat
airports:Data on Airports
Geographic, use, and property-related data on airports.
Maintained by Mine Çetinkaya-Rundel. Last updated 3 years ago.
2 stars 4.39 score 15 scripts 1 dependent
mhashemihsmw
MLMOI:Estimating Frequencies, Prevalence and Multiplicity of Infection
The implemented methods are aimed at scientists who seek to estimate multiplicity of infection (MOI) and lineage (allele) frequencies and prevalences at molecular markers using the maximum-likelihood method described in Schneider (2018) <doi:10.1371/journal.pone.0194148>, and Schneider and Escalante (2014) <doi:10.1371/journal.pone.0097899>. Users can import data from Excel files in various formats, and perform maximum-likelihood estimation on the imported data with the package's moimle() function.
Maintained by Meraj Hashemi. Last updated 1 year ago.
data, data-visualization, dataanalysis, datapreprocessing, datawrangling, statistical-models
4.30 score 2 scripts
tmsalab
edmdata:Data Sets for Psychometric Modeling
Collection of data sets from various assessments that can be used to evaluate psychometric models. These data sets have been analyzed in the following papers that introduced new methodology as part of the application section: Jimenez, A., Balamuta, J. J., & Culpepper, S. A. (2023) <doi:10.1111/bmsp.12307>, Culpepper, S. A., & Balamuta, J. J. (2021) <doi:10.1080/00273171.2021.1985949>, Yinghan Chen et al. (2021) <doi:10.1007/s11336-021-09750-9>, Yinyin Chen et al. (2020) <doi:10.1007/s11336-019-09693-2>, Culpepper, S. A. (2019a) <doi:10.1007/s11336-019-09683-4>, Culpepper, S. A. (2019b) <doi:10.1007/s11336-018-9643-8>, Culpepper, S. A., & Chen, Y. (2019) <doi:10.3102/1076998618791306>, Culpepper, S. A., & Balamuta, J. J. (2017) <doi:10.1007/s11336-015-9484-7>, and Culpepper, S. A. (2015) <doi:10.3102/1076998615595403>.
Maintained by James Joseph Balamuta. Last updated 6 months ago.
cognitive-diagnostic-models, data, edm
5 stars 4.18 score 7 scripts 1 dependent
cleanzr
cd:CD Data for Entity Resolution
Duplicated music data (pre-processed and formatted) for entity resolution. The total size of the data set is 9763. There are respective gold standard records that are labeled and can be considered as a unique identifier.
Maintained by Rebecca Steorts. Last updated 7 years ago.
4.16 score 29 scripts
flavioleccese92
euroleaguer:Euroleague and Eurocup basketball API
Unofficial API wrapper for the 'Euroleague' and 'Eurocup' basketball API (<https://www.euroleaguebasketball.net/en/euroleague/>). It allows retrieval of real-time and historical standard and advanced statistics about competitions, teams, players and games.
Maintained by Flavio Leccese. Last updated 3 months ago.
analytics, basketball, data, data-science, library
7 stars 4.15 score 7 scripts
globeandmail
upstartr:Utilities Powering the Globe and Mail's Data Journalism Template
Core functions necessary for using The Globe and Mail's R data journalism template, 'startr', along with utilities for day-to-day data journalism tasks, such as reading and writing files, producing graphics and cleaning up datasets.
Maintained by Tom Cardoso. Last updated 1 year ago.
data, data-analysis, data-journalism, data-visualization, journalism, news
6 stars 4.14 score 46 scripts
ryapric
readit:Effortlessly Read Any Rectangular Data
Providing just one primary function, 'readit' uses a set of reasonable heuristics to apply the appropriate reader function to the given file path. As long as the data file has an extension, and the data is (or can be coerced to be) rectangular, readit() can probably read it.
Maintained by Ryan Price. Last updated 5 years ago.
24 stars 4.12 score 11 scripts
denchpokepon
fedstatAPIr:Unofficial API for Fedstat (Rosstat EMISS System) for Automatic and Efficient Data Queries
An API for automatic data queries to fedstat <https://www.fedstat.ru>, using a small set of functions with a common interface.
Maintained by Denis Krylov. Last updated 5 months ago.
23 stars 4.06 score 5 scripts
politicaargentina
opinAr:Argentina's Public Opinion Toolbox
A toolbox for working with public opinion data from Argentina. It facilitates access to microdata and the calculation of indicators of the Trust in Government Index (ICG), prepared by the Torcuato Di Tella University. Although we will try to document everything possible in English, by its very nature Spanish will be the main language.
Maintained by Juan Pablo Ruiz Nicolini. Last updated 1 year ago.
argentina, data, political-science, politics, public-opinion
4.00 score
ropenspain
infoelectoral:Download Spanish Election Results
Download official election results for Spain at polling station, municipality and province level from the Ministry of Interior (<https://infoelectoral.interior.gob.es/es/elecciones-celebradas/area-de-descargas/>), format them and import them to the R environment.
Maintained by Héctor Meleiro. Last updated 7 months ago.
data, elecciones, elections, electoral, infoelectoral, spain
31 stars 3.97 score 9 scripts
coatless-rpkg
ucimlrepo:Explore UCI ML Repository Datasets
Find and import datasets from the University of California Irvine Machine Learning (UCI ML) Repository into R. Supports working with data from UCI ML repository inside of R scripts, notebooks, and 'Quarto'/'RMarkdown' documents. Access the UCI ML repository directly at <https://archive.ics.uci.edu/>.
Maintained by James Joseph Balamuta. Last updated 7 months ago.
data, data-science, machine-learning, statistics, uci-machine-learning, uci-machine-learning-repository, web-api
6 stars 3.95 score 4 scripts
danilofreire
prisonbrief:Downloads and Parses World Prison Brief Data
Downloads, parses and tidies information from the World Prison Brief project <http://www.prisonstudies.org/>.
Maintained by Danilo Freire. Last updated 4 years ago.
18 stars 3.95 score 8 scripts
globeandmail
tgamtheme:Globe and Mail Graphics Theme for 'ggplot2'
Theme and colour palettes for The Globe and Mail's graphics. Includes colour and fill scale functions, colour palette helpers and a Globe-styled 'ggplot2' theme object.
Maintained by Tom Cardoso. Last updated 4 years ago.
data, data-journalism, data-visualization, journalism, news
6 stars 3.95 score 2 scripts 1 dependent
yelleknek
AMCP:A Model Comparison Perspective
Accompanies "Designing experiments and analyzing data: A model comparison perspective" (3rd ed.) by Maxwell, Delaney, & Kelley (2018; Routledge). Contains all of the data sets in the book's chapters and end-of-chapter exercises. Information about the book is available at <http://www.DesigningExperiments.com>.
Maintained by Ken Kelley. Last updated 5 years ago.
3.91 score 162 scripts
m-muecke
isocountry:ISO 3166-1 Country Codes
ISO 3166-1 country codes and ISO 4217 currency codes provided by the International Organization for Standardization.
Maintained by Maximilian Mücke. Last updated 11 days ago.
country-codes, currency-codes, data, iso-3166-1, iso-4217
1 star 3.90 score 2 scripts
bioc
MsDataHub:Mass Spectrometry Data on ExperimentHub
The MsDataHub package uses the ExperimentHub infrastructure to distribute raw mass spectrometry data files, peptide spectrum matches or quantitative data from proteomics and metabolomics experiments.
Maintained by Laurent Gatto. Last updated 1 month ago.
experimenthub, software, massspectrometry, proteomics, metabolomics, bioconductor, data, mass-spectrometry
1 star 3.88 score 2 scripts
bradlindblad
cheatsheet:Download R Cheat Sheets Locally
A simple package to grab cheat sheets and save them to your local computer.
Maintained by Brad Lindblad. Last updated 2 years ago.
11 stars 3.74 score 5 scripts
flrd
standardlastprofile:Data Package for BDEW Standard Load Profiles in Electricity
Data on standard load profiles from the German Association of Energy and Water Industries (BDEW Bundesverband der Energie- und Wasserwirtschaft e.V.) in a tidy format. The data and methodology are described in VDEW (1999), "Repräsentative VDEW-Lastprofile", <https://www.bdew.de/media/documents/1999_Repraesentative-VDEW-Lastprofile.pdf>. The package also offers an interface for generating a standard load profile over a user-defined period. For the algorithm, see VDEW (2000), "Anwendung der Repräsentativen VDEW-Lastprofile step-by-step", <https://www.bdew.de/media/documents/2000131_Anwendung-repraesentativen_Lastprofile-Step-by-step.pdf>.
Maintained by Markus Döring. Last updated 8 months ago.
1 star 3.70 score 4 scripts
zcebeci
odetector:Outlier Detection Using Partitioning Clustering Algorithms
An object is called an "outlier" if it deviates remarkably from the other objects in a data set. Outlier detection is the process of finding outliers using methods based on distance measures, clustering and spatial methods (Ben-Gal, 2005 <ISBN 0-387-24435-2>). It is one of the intensively studied research topics for identification of novelties, frauds, anomalies, deviations or exceptions, in addition to its use for outlier removal in data processing. This package provides implementations of some novel approaches to detecting outliers based on typicality degrees that are obtained with soft partitioning clustering algorithms such as Fuzzy C-means and its variants.
Maintained by Zeynel Cebeci. Last updated 2 years ago.
anomaly-detection, cluster-analysis, clustering, clustering-methods, data, datapreparation, datapreprocessing, exception-handling, fcm, fraud-detection, fuzzy-clustering, novelty-detection, outlier-detection, outlier-removal, outliers, partitioning, pcm, surprise-exploration
3.70 score 4 scripts
mdlincoln
europop:Historical Populations of European Cities, 1500-1800
This dataset contains population estimates of all European cities with at least 10,000 inhabitants during the period 1500-1800. These data are adapted from Jan De Vries, "European Urbanization, 1500-1800" (1984).
Maintained by Matthew Lincoln. Last updated 8 years ago.
9 stars 3.69 score 11 scripts
economic
realtalk:Price index data for the US economy
Makes it easy to use US price index data like the CPI.
Maintained by Ben Zipperer. Last updated 17 days ago.
5 stars 3.51 score 10 scripts
mightymetrika
holi:Higher Order Likelihood Inference Web Applications
Higher order likelihood inference is a promising approach for analyzing small sample size data. The 'holi' package provides web applications for higher order likelihood inference. It currently supports linear, logistic, and Poisson generalized linear models through the rstar_glm() function, based on Pierce and Bellio (2017) <doi:10.1111/insr.12232> and 'likelihoodAsy'. The package offers two main features: LA_rstar(), which launches an interactive 'shiny' application allowing users to fit models with rstar_glm() through their web browser, and sim_rstar_glm_pgsql(), which streamlines the process of launching a web-based 'shiny' simulation application that saves results to a user-created 'PostgreSQL' database.
Maintained by Mackson Ncube. Last updated 7 months ago.
3.48 score 5 scripts
nononoexe
setariaviridis:Field-Collected Data of Green Foxtail
Setaria viridis (green foxtail) is a common weed. This package contains measurements from individual branches of a wild Setaria viridis plant collected near the author's home. The data is intended for use in data analysis practice.
Maintained by Keisuke Ando. Last updated 2 months ago.
2 stars 3.48 score 2 scripts
cleanzr
cora:Cora Data for Entity Resolution
Duplicated publication data (pre-processed and formatted) for entity resolution. This data set contains a total of 1879 records. The following variables are included in the data set: id, title, book title, authors, address, date, year, editor, journal, volume, pages, publisher, institution, type, tech, note. The data set has a respective gold data set that provides information on which records match based on id.
Maintained by Rebecca Steorts. Last updated 5 years ago.
3 stars 3.35 score 15 scripts
mightymetrika
mmirestriktor:Informative Hypothesis Testing Web Applications
Offering enhanced statistical power compared to traditional hypothesis testing methods, informative hypothesis testing allows researchers to explicitly model their expectations regarding the relationships among parameters. An important software tool for this framework is 'restriktor'. The 'mmirestriktor' package provides 'shiny' web applications to implement some of the basic functionality of 'restriktor'. The mmirestriktor() function launches a 'shiny' application for fitting and analyzing models with constraints. The FbarCards() function launches a card game application which can help build intuition about informative hypothesis testing. The iht_interpreter() helps interpret informative hypothesis testing results based on guidelines in Vanbrabant and Rosseel (2020) <doi:10.4324/9780429273872-14>.
Maintained by Mackson Ncube. Last updated 8 months ago.
data, hypothesis, infomative, power, restriktor, statistics, testing
3.30 score 5 scripts
hrlai
novelforestSG:Dataset from the Novel Forests of Singapore
The raw dataset and model used in Lai et al. (2021) Decoupled responses of native and exotic tree diversities to distance from old-growth forest and soil phosphorus in novel secondary forests. Applied Vegetation Science, 24, e12548.
Maintained by Hao Ran Lai. Last updated 1 year ago.
data, diversity, ecology, forest, singapore
2 stars 3.30 score 4 scripts
carlosyanez
auscensus:Access Australian Census Data (2006-2021)
R package to interact with Australian Census Data Packs, providing an interface to extract data across multiple censuses.
Maintained by Carlos Yáñez Santibáñez. Last updated 6 months ago.
1 star 3.18 score 9 scripts
petzi53
repairData:Open Repair Alliance Datasets 2021
The complete data set of open repair data, fully compliant with the Open Repair Data Standards (ORDS). It combines the datasets contributed by partner organizations of the Open Repair Alliance (ORA). Last updated: 2021-02-22. The package also contains datasets on batteries, printers, mobiles, and tablets that were enriched via quests.
Maintained by Peter Baumgartner. Last updated 3 years ago.
data, open-data, open-datasets, repair, repair-cafe, repairs
2.70 score 1 script
cleanzr
italy:The Italian Survey on Household and Wealth, 2008 and 2010
Provides two record linkage data sets on the Italian Survey on Household and Wealth, 2008 and 2010, a sample survey conducted by the Bank of Italy every two years. The 2010 survey covered 13,702 individuals, while the 2008 survey covered 13,734 individuals. The following categorical variables are included in this data set: year of birth, working status, employment status, branch of activity, town size, geographical area of birth, sex, whether or not Italian national, and highest educational level obtained. Unique identifiers are available to assess the accuracy of one’s method. Please see Steorts (2015) <DOI:10.1214/15-BA965SI> to find more details about the data set.
Maintained by Rebecca Steorts. Last updated 8 years ago.
2.70 score 10 scripts
ropengov
kelaopendata:Access open data from National social insurance institution of Finland
Designed to simplify and speed up access to open data from the National Social Insurance Institution of Finland (KELA) published at <https://www.avoindata.fi/data/fi/organization/kela>, the kelaopendata package offers researchers and analysts a set of tools to obtain data and metadata for a wide range of applications.
Maintained by Markus Kainu. Last updated 4 months ago.
data, open-data, social-security-data
1 star 2.40 score
poissonconsulting
klexdatr:Kootenay Lake Exploitation Study Data
Six relational 'tibbles' from the Kootenay Lake Large Trout Exploitation study. The study, which ran from 2008 to 2014, caught, tagged and released large Rainbow Trout and Bull Trout in Kootenay Lake by boat angling. The fish were tagged with internal acoustic tags and/or high reward external tags and subsequently detected by an acoustic receiver array as well as reported by anglers. The data are analysed by Thorley and Andrusak (2017) <doi:10.7717/peerj.2874> to estimate the natural and fishing mortality of both species.
Maintained by Joe Thorley. Last updated 2 months ago.
2.30 score 7 scripts
e-kotov
mapineqr:Access Mapineq Inequality Indicators via API
Access Mapineq inequality indicators via API.
Maintained by Egor Kotov. Last updated 1 month ago.
data, demogrpahy, socio-economic-indicators
1 star 2.30 score 3 scripts
cleanzr
restaurant:Restaurant Data for Entity Resolution
Duplicated restaurant data (pre-processed and formatted) for entity resolution. This package contains formatted data from a data set that contains information about different restaurants, with the Zagats portion containing 331 records and the Fodors portion containing 533 records. The following variables are included in the data set: id, name, address, city, phone, type. The data set has a respective gold data set that provides information on which records match based on id.
Maintained by Rebecca Steorts. Last updated 7 years ago.
1 star 2.00 score
politicaargentina
legislAr:Argentina Legislative Data and Tools
Package that facilitates access to and work with legislative data from Argentina. Based on the repository of the "La Década Votada" project (Andy Tow).
Maintained by Juan Pablo Ruiz Nicolini. Last updated 3 years ago.
argentina, data, legislative-data, political-science
2 stars 2.00 score
e-kotov
alboFr:Get French Data on Tiger Mosquito Colonisation
Get French data on Tiger Mosquito (Aedes albopictus) colonisation in France from the online map at <https://signalement-moustique.anses.fr/signalement_albopictus/colonisees>.
Maintained by Egor Kotov. Last updated 1 month ago.
aedes-albopictus, data, france, tiger-mosquito
1.70 score 3 scripts
gk-crop
sunscanimport:Imports data from sunscan device
Provides functions to import, convert and visualize LAI measurements from a Sunscan device. An interactive shiny app is included.
Maintained by Gunther Krauss. Last updated 2 months ago.
1.70 score
spatialworks
liberia:Datasets for Use in Designing Surveys in Liberia
Designing surveys requires relevant datasets to be used as basis for sample size calculations, sampling design, survey planning/logistics and survey implementation. These include datasets on population, lists of sampling clusters, map datasets for spatial sampling, and previous survey datasets that can be used for estimating indicator variance and design effects. This package contains relevant datasets for use in designing surveys in Liberia.
Maintained by Ernest Guevarra. Last updated 3 years ago.
1 star 1.70 score 1 script