Showing 200 of 1947 total results
tidyverse
tidyverse:Easily Install and Load the 'Tidyverse'
The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' at <https://www.tidyverse.org>.
Maintained by Hadley Wickham. Last updated 5 months ago.
1.7k stars 20.23 score 664k scripts 125 dependents
sfirke
janitor:Simple Tools for Examining and Cleaning Dirty Data
The main janitor functions can: perfectly format data.frame column names; provide quick counts of variable combinations (i.e., frequency tables and crosstabs); and explore duplicate records. Other janitor functions nicely format the tabulation results. These tabulate-and-report functions approximate popular features of SPSS and Microsoft Excel. This package follows the principles of the "tidyverse" and works well with the pipe function %>%. janitor was built with beginning-to-intermediate R users in mind and is optimized for user-friendliness.
Maintained by Sam Firke. Last updated 3 months ago.
data-analysis, data-cleaning, data-science, dirty-data, excel, pivot-tables, spss, tabulations, tidyverse
1.4k stars 19.40 score 35k scripts 231 dependents
topepo
caret:Classification and Regression Training
Misc functions for training and plotting classification and regression models.
Maintained by Max Kuhn. Last updated 4 months ago.
1.6k stars 19.24 score 61k scripts 303 dependents
tidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 3 hours ago.
586 stars 18.79 score 7.2k scripts 381 dependents
tidymodels
tidymodels:Easily Install and Load the 'Tidymodels' Packages
The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.
Maintained by Max Kuhn. Last updated 1 month ago.
783 stars 16.52 score 66k scripts 15 dependents
prophet:Automatic Forecasting Procedure
Implements a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
Maintained by Sean Taylor. Last updated 5 months ago.
19k stars 15.59 score 976 scripts 13 dependents
r-dbi
RPostgres:C++ Interface to PostgreSQL
Fully DBI-compliant C++-backed interface to PostgreSQL <https://www.postgresql.org/>, an open-source relational database.
Maintained by Kirill Müller. Last updated 1 month ago.
338 stars 14.78 score 1.6k scripts 31 dependents
dcomtois
summarytools:Tools to Quickly and Neatly Summarize Data
Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.
Maintained by Dominic Comtois. Last updated 1 hour ago.
descriptive-statistics, frequency-table, html-report, markdown, pander, pandoc, pandoc-markdown, rmarkdown, rstudio
528 stars 14.69 score 2.9k scripts 6 dependents
ropensci
osmdata:Import 'OpenStreetMap' Data as Simple Features or Spatial Objects
Download and import of 'OpenStreetMap' ('OSM') data as 'sf' or 'sp' objects. 'OSM' data are extracted from the 'Overpass' web server (<https://overpass-api.de/>) and processed with very fast 'C++' routines for return to 'R'.
Maintained by Mark Padgham. Last updated 2 months ago.
open0street0mapopenstreetmapoverpass0apiosmcpposm-dataoverpass-apipeer-reviewedcpp
322 stars 14.53 score 2.8k scripts 14 dependents
tidyverts
tsibble:Tidy Temporal Data Frames and Tools
Provides a 'tbl_ts' class (the 'tsibble') for temporal data in a data- and model-oriented format. The 'tsibble' provides tools to easily manipulate and analyse temporal data, such as filling in time gaps and aggregating over calendar periods.
Maintained by Earo Wang. Last updated 2 months ago.
536 stars 14.48 score 4.4k scripts 42 dependents
tidymodels
tune:Tidy Tuning Tools
The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps.
Maintained by Max Kuhn. Last updated 28 days ago.
293 stars 14.27 score 756 scripts 39 dependents
business-science
timetk:A Tool Kit for Working with Time Series
Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.
Maintained by Matt Dancho. Last updated 1 year ago.
coercion, coercion-functions, data-mining, dplyr, forecast, forecasting, forecasting-models, machine-learning, series-decomposition, series-signature, tibble, tidy, tidyquant, tidyverse, time, time-series, timeseries
626 stars 14.20 score 4.0k scripts 16 dependents
doi-usgs
dataRetrieval:Retrieval Functions for USGS and EPA Hydrology and Water Quality Data
Collection of functions to help retrieve U.S. Geological Survey and U.S. Environmental Protection Agency water quality and hydrology data from web services. Data are discovered from National Water Information System <https://waterservices.usgs.gov/> and <https://waterdata.usgs.gov/nwis>. Water quality data are obtained from the Water Quality Portal <https://www.waterqualitydata.us/>.
Maintained by Laura DeCicco. Last updated 5 days ago.
286 stars 14.16 score 1.7k scripts 15 dependents
pharmaverse
admiral:ADaM in R Asset Library
A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).
Maintained by Ben Straub. Last updated 7 days ago.
cdisc, clinical-trials, open-source
239 stars 13.97 score 486 scripts 4 dependents
tidymodels
workflows:Modeling Workflows
Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.
Maintained by Simon Couch. Last updated 1 month ago.
207 stars 13.97 score 876 scripts 43 dependents
jbkunst
highcharter:A Wrapper for the 'Highcharts' Library
A wrapper for the 'Highcharts' library including shortcut functions to plot R objects. 'Highcharts' <https://www.highcharts.com/> is a charting library offering numerous chart types with a simple configuration syntax.
Maintained by Joshua Kunst. Last updated 1 year ago.
highcharts, htmlwidgets, shiny, shiny-r, visualization, wrapper
725 stars 13.93 score 4.9k scripts 18 dependents
aphalo
ggpmisc:Miscellaneous Extensions to 'ggplot2'
Extensions to 'ggplot2' respecting the grammar of graphics paradigm. Statistics: locate and tag peaks and valleys; label plot with the equation of a fitted polynomial or other types of models; labels with P-value, R^2 or adjusted R^2 or information criteria for fitted models; label with ANOVA table for fitted models; label with summary for fitted models. Model fit classes for which suitable methods are provided by package 'broom' and 'broom.mixed' are supported. Scales and stats to build volcano and quadrant plots based on outcomes, fold changes, p-values and false discovery rates.
Maintained by Pedro J. Aphalo. Last updated 2 days ago.
data-analysis, dataviz, ggplot2-annotations, ggplot2-stats, statistics
107 stars 13.64 score 4.4k scripts 14 dependents
tidyverts
fable:Forecasting Models for Tidy Time Series
Provides a collection of commonly used univariate and multivariate time series forecasting models including automatically selected exponential smoothing (ETS) and autoregressive integrated moving average (ARIMA) models. These models work within the 'fable' framework provided by the 'fabletools' package, which provides the tools to evaluate, visualise, and combine models in a workflow consistent with the tidyverse.
Maintained by Mitchell O'Hara-Wild. Last updated 4 months ago.
569 stars 13.54 score 2.1k scripts 6 dependents
business-science
tidyquant:Tidy Quantitative Financial Analysis
Bringing business and financial analysis to the 'tidyverse'. The 'tidyquant' package provides a convenient wrapper to various 'xts', 'zoo', 'quantmod', 'TTR' and 'PerformanceAnalytics' package functions and returns the objects in the tidy 'tibble' format. The main advantage is being able to use quantitative functions with the 'tidyverse' functions including 'purrr', 'dplyr', 'tidyr', 'ggplot2', 'lubridate', etc. See the 'tidyquant' website for more information, documentation and examples.
Maintained by Matt Dancho. Last updated 2 months ago.
dplyr, financial-analysis, financial-data, financial-statements, multiple-stocks, performance-analysis, performanceanalytics, quantmod, stock, stock-exchanges, stock-indexes, stock-lists, stock-performance, stock-prices, stock-symbol, tidyverse, time-series, timeseries, xts
872 stars 13.34 score 5.2k scripts
oscarkjell
text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning
Link R with Transformers from Hugging Face to transform text variables to word embeddings, where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions, etc. For more information see <https://www.r-text.org>.
Maintained by Oscar Kjell. Last updated 10 days ago.
deep-learning, machine-learning, nlp, transformers, openjdk
145 stars 13.21 score 436 scripts 1 dependent
wadpac
GGIR:Raw Accelerometer Data Analysis
A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data are stored in csv format. The package also allows for external function embedding.
Maintained by Vincent T van Hees. Last updated 18 days ago.
accelerometer, activity-recognition, circadian-rhythm, movement-sensor, sleep
109 stars 13.20 score 342 scripts 3 dependents
openair-project
openair:Tools for the Analysis of Air Pollution Data
Tools to analyse, interpret and understand air pollution data. Data are typically regular time series, and air quality measurements, meteorological data and dispersion model output can be analysed. The package is described in Carslaw and Ropkins (2012, <doi:10.1016/j.envsoft.2011.09.008>) and subsequent papers.
Maintained by David Carslaw. Last updated 4 days ago.
air-quality, air-quality-data, meteorology, openair, cpp
316 stars 12.94 score 1.2k scripts 12 dependents
aphalo
ggpp:Grammar Extensions to 'ggplot2'
Extensions to 'ggplot2' respecting the grammar of graphics paradigm. Geometries: geom_table(), geom_plot() and geom_grob() add insets to plots using native data coordinates, while geom_table_npc(), geom_plot_npc() and geom_grob_npc() do the same using "npc" coordinates through new aesthetics "npcx" and "npcy". Statistics: select observations based on 2D density. Positions: radial nudging away from a center point and nudging away from a line or curve; combined stacking and nudging; combined dodging and nudging.
Maintained by Pedro J. Aphalo. Last updated 1 month ago.
data-labels, dataviz, ggplot2-enhancements, ggplot2-geoms, ggplot2-insets, ggplot2-positions
129 stars 12.53 score 582 scripts 26 dependents
tidyverts
feasts:Feature Extraction and Statistics for Time Series
Provides a collection of features, decomposition methods, statistical summaries and graphics functions for analysing tidy time series data. The package name 'feasts' is an acronym comprising its key features: Feature Extraction And Statistics for Time Series.
Maintained by Mitchell O'Hara-Wild. Last updated 5 months ago.
300 stars 12.38 score 1.4k scripts 7 dependents
eliocamp
metR:Tools for Easier Analysis of Meteorological Fields
Many useful functions and extensions for dealing with meteorological data in the tidy data framework. Extends 'ggplot2' for better plotting of scalar and vector fields and provides commonly used analysis methods in the atmospheric sciences.
Maintained by Elio Campitelli. Last updated 12 days ago.
atmospheric-science, ggplot2, visualization
146 stars 12.30 score 1000 scripts 22 dependents
r-dbi
RMariaDB:Database Interface and MariaDB Driver
Implements a DBI-compliant interface to MariaDB (<https://mariadb.org/>) and MySQL (<https://www.mysql.com/>) databases.
Maintained by Kirill Müller. Last updated 1 month ago.
133 stars 12.20 score 792 scripts 10 dependents
tidyverts
fabletools:Core Tools for Packages in the 'fable' Framework
Provides tools, helpers and data structures for developing models and time series functions for 'fable' and extension packages. These tools support a consistent and tidy interface for time series modelling and analysis.
Maintained by Mitchell O'Hara-Wild. Last updated 2 months ago.
91 stars 12.18 score 396 scripts 18 dependents
tidymodels
probably:Tools for Post-Processing Predicted Values
Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.
Maintained by Max Kuhn. Last updated 6 months ago.
115 stars 12.09 score 21k scripts 1 dependent
ropensci
RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management
Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.
Maintained by Mathew W. McLean. Last updated 4 months ago.
115 stars 12.06 score 2.3k scripts 16 dependents
tidymodels
workflowsets:Create a Collection of 'tidymodels' Workflows
A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as to train them and visualize the results.
Maintained by Simon Couch. Last updated 5 months ago.
94 stars 12.04 score 294 scripts 19 dependents
zachmayer
caretEnsemble:Ensembles of Caret Models
Functions for creating ensembles of caret models: caretList() and caretStack(). caretList() is a convenience function for fitting multiple caret::train() models to the same dataset. caretStack() will make linear or non-linear combinations of these models, using a caret::train() model as a meta-model.
Maintained by Zachary A. Deane-Mayer. Last updated 3 months ago.
226 stars 11.98 score 780 scripts 1 dependent
pecanproject
PEcAn.DB:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants
216 stars 11.90 score 127 scripts 27 dependents
epiforecasts
EpiNow2:Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.
Maintained by Sebastian Funk. Last updated 1 month ago.
backcalculation, covid-19, gaussian-processes, open-source, reproduction-number, stan, cpp
123 stars 11.86 score 210 scripts
hannameyer
CAST:'caret' Applications for Spatial-Temporal Models
Supporting functionality to run 'caret' with spatial or spatial-temporal data. 'caret' is a frequently used package for model training and prediction using machine learning. CAST includes functions to improve spatial or spatial-temporal modelling tasks using 'caret'. It includes the newly suggested 'Nearest neighbor distance matching' cross-validation to estimate the performance of spatial prediction models and allows for spatial variable selection to select suitable predictor variables in view of their contribution to the spatial model performance. CAST further includes functionality to estimate the (spatial) area of applicability of prediction models. Methods are described in Meyer et al. (2018) <doi:10.1016/j.envsoft.2017.12.001>; Meyer et al. (2019) <doi:10.1016/j.ecolmodel.2019.108815>; Meyer and Pebesma (2021) <doi:10.1111/2041-210X.13650>; Milà et al. (2022) <doi:10.1111/2041-210X.13851>; Meyer and Pebesma (2022) <doi:10.1038/s41467-022-29838-9>; Linnenbrink et al. (2023) <doi:10.5194/egusphere-2023-1308>; Schumacher et al. (2024) <doi:10.5194/egusphere-2024-2730>. The package is described in detail in Meyer et al. (2024) <doi:10.48550/arXiv.2404.06978>.
Maintained by Hanna Meyer. Last updated 2 months ago.
autocorrelation, caret, feature-selection, machine-learning, overfitting, predictive-modeling, spatial, spatio-temporal, variable-selection
114 stars 11.85 score 298 scripts 1 dependent
pecanproject
PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants
216 stars 11.62 score 64 scripts 14 dependents
edwinth
padr:Quickly Get Datetime Data Ready for Analysis
Transforms datetime data into a format ready for analysis. It offers two core functionalities: aggregating data to a higher level interval (thicken) and imputing records where observations were absent (pad).
Maintained by Edwin Thoen. Last updated 4 months ago.
132 stars 11.55 score 428 scripts 20 dependents
urbananalyst
dodgr:Distances on Directed Graphs
Distances on dual-weighted directed graphs using priority-queue shortest paths (Padgham (2019) <doi:10.32866/6945>). Weighted directed graphs have weights from A to B which may differ from those from B to A. Dual-weighted directed graphs have two sets of such weights. A canonical example is a street network to be used for routing in which routes are calculated by weighting distances according to the type of way and mode of transport, yet lengths of routes must be calculated from direct distances.
Maintained by Mark Padgham. Last updated 4 days ago.
distance, openstreetmap, router, shortest-paths, street-networks, cpp
129 stars 11.52 score 229 scripts 4 dependents
tidymodels
stacks:Tidy Model Stacking
Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.
Maintained by Simon Couch. Last updated 5 months ago.
298 stars 11.46 score 840 scripts
doi-usgs
nhdplusTools:NHDPlus Tools
Tools for traversing and working with National Hydrography Dataset Plus (NHDPlus) data. All methods implemented in 'nhdplusTools' are available in the NHDPlus documentation available from the US Environmental Protection Agency <https://www.epa.gov/waterdata/basic-information>.
Maintained by David Blodgett. Last updated 1 month ago.
87 stars 11.38 score 348 scripts 5 dependents
moderndive
moderndive:Tidyverse-Friendly Introductory Linear Regression
Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.
Maintained by Albert Y. Kim. Last updated 3 months ago.
88 stars 11.32 score 1.8k scripts
ropengov
eurostat:Tools for Eurostat Open Data
Tools to download data from the Eurostat database <https://ec.europa.eu/eurostat> together with search and manipulation utilities.
Maintained by Leo Lahti. Last updated 1 month ago.
242 stars 11.07 score 892 scripts 4 dependents
pecanproject
PEcAn.utils:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Rob Kooper. Last updated 5 hours ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants
216 stars 10.94 score 218 scripts 35 dependents
tidymodels
textrecipes:Extra 'Recipes' for Text Processing
Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.
Maintained by Emil Hvitfeldt. Last updated 13 days ago.
160 stars 10.86 score 964 scripts 1 dependent
pecanproject
PEcAn.benchmark:PEcAn Functions Used for Benchmarking
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.benchmark package provides utilities for comparing models and data, including a suite of statistical metrics and plots.
Maintained by Mike Dietze. Last updated 5 hours ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants
216 stars 10.72 score 416 scripts 11 dependents
jimmyday12
fitzRoy:Easily Scrape and Process AFL Data
An easy package for scraping and processing Australian Rules Football (AFL) data. 'fitzRoy' provides a range of functions for accessing publicly available data from 'AFL Tables' <https://afltables.com/afl/afl_index.html>, 'Footy Wire' <https://www.footywire.com> and 'The Squiggle' <https://squiggle.com.au>. Further functions allow for easy processing, cleaning and transformation of this data into formats that can be used for analysis.
Maintained by James Day. Last updated 11 days ago.
136 stars 10.72 score 324 scripts
doi-usgs
EGRET:Exploration and Graphics for RivEr Trends
Statistics and graphics for streamflow history, water quality trends, and the statistical modeling algorithm: Weighted Regressions on Time, Discharge, and Season (WRTDS).
Maintained by Laura DeCicco. Last updated 4 months ago.
usgs, water-quality, water-quality-data
90 stars 10.67 score 362 scripts 1 dependent
business-science
modeltime:The Tidymodels Extension for Time Series Modeling
The time series forecasting framework for use with the 'tidymodels' ecosystem. Models include ARIMA, Exponential Smoothing, and additional time series models from the 'forecast' and 'prophet' packages. Refer to "Forecasting Principles & Practice, Second edition" (<https://otexts.com/fpp2/>). Refer to "Prophet: forecasting at scale" (<https://research.facebook.com/blog/2017/02/prophet-forecasting-at-scale/>).
Maintained by Matt Dancho. Last updated 5 months ago.
arima, data-science, deep-learning, ets, forecasting, machine-learning, machine-learning-algorithms, modeltime, prophet, tbats, tidymodeling, tidymodels, time, time-series, time-series-analysis, timeseries, timeseries-forecasting
551 stars 10.61 score 1.1k scripts 7 dependents
jmsigner
amt:Animal Movement Tools
Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges, track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulation of space-use from fitted step-selection functions.
Maintained by Johannes Signer. Last updated 5 months ago.
41 stars 10.54 score 418 scripts
business-science
tibbletime:Time Aware Tibbles
Built on top of the 'tibble' package, 'tibbletime' is an extension that allows for the creation of time aware tibbles. Some immediate advantages of this include: the ability to perform time-based subsetting on tibbles, quickly summarising and aggregating results by time periods, and creating columns that can be used as 'dplyr' time-based groups.
Maintained by Davis Vaughan. Last updated 4 months ago.
periodicitytibbletimetime-seriestimeseriescpp
177 stars 10.51 score 644 scripts 2 dependents
vubiostat
redcapAPI:Interface to 'REDCap'
Access data stored in 'REDCap' databases using the Application Programming Interface (API). 'REDCap' (Research Electronic Data CAPture; <https://projectredcap.org>, Harris, et al. (2009) <doi:10.1016/j.jbi.2008.08.010>, Harris, et al. (2019) <doi:10.1016/j.jbi.2019.103208>) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The API allows users to access data and project meta data (such as the data dictionary) from the web programmatically. The 'redcapAPI' package facilitates the process of accessing data with options to prepare an analysis-ready data set consistent with the definitions in a database's data dictionary.
Maintained by Shawn Garbett. Last updated 25 days ago.
22 stars 10.47 score 134 scripts 2 dependents
ropensci
rerddap:General Purpose Client for 'ERDDAP™' Servers
General purpose R client for 'ERDDAP™' servers. Includes functions to search for 'datasets', get summary information on 'datasets', and fetch 'datasets', in either 'csv' or 'netCDF' format. 'ERDDAP™' information: <https://upwell.pfeg.noaa.gov/erddap/information.html>.
Maintained by Roy Mendelssohn. Last updated 12 days ago.
earthscience, climate, precipitation, temperature, storm, buoy, noaa, api-client, erddap, noaa-data
41 stars 10.43 score 376 scripts 5 dependents
tidymodels
themis:Extra Recipes Steps for Dealing with Unbalanced Data
A dataset with an uneven number of cases in each class is said to be unbalanced. Many models produce a subpar performance on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <doi:10.48550/arXiv.1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91> and ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>. Or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.
Maintained by Emil Hvitfeldt. Last updated 2 months ago.
143 stars 10.37 score 1.3k scripts 2 dependents
gforge
Gmisc:Descriptive Statistics, Transition Plots, and More
Tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bézier lines with arrows complementing the ones in the 'grid' package, and more.
Maintained by Max Gordon. Last updated 2 years ago.
51 stars 10.34 score 233 scripts 2 dependents
ludvigolsen
cvms:Cross-Validation for Model Selection
Cross-validate one or multiple regression and classification models and get relevant evaluation metrics in a tidy format. Validate the best model on a test set and compare it to a baseline evaluation. Alternatively, evaluate predictions from an external model. Currently supports regression and classification (binary and multiclass). Described in chp. 5 of Jeyaraman, B. P., Olsen, L. R., & Wambugu M. (2019, ISBN: 9781838550134).
Maintained by Ludvig Renbo Olsen. Last updated 25 days ago.
39 stars 10.31 score 492 scripts 5 dependents
bioc
pRoloc:A unifying bioinformatics framework for spatial proteomics
The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.
Maintained by Lisa Breckels. Last updated 5 days ago.
immunooncology, proteomics, massspectrometry, classification, clustering, qualitycontrol, bioconductor, proteomics-data, spatial-proteomics, visualisation, openblas, cpp
15 stars 10.31 score 101 scripts 2 dependents
facebookexperimental
Robyn:Semi-Automated Marketing Mix Modeling (MMM) from Meta Marketing Science
Semi-Automated Marketing Mix Modeling (MMM) that aims to reduce human bias by means of ridge regression and evolutionary algorithms, enables actionable decision making by providing budget allocation and diminishing returns curves, and allows ground-truth calibration to account for causation.
Maintained by Gufeng Zhou. Last updated 13 days ago.
adstockingbudget-allocationcost-response-curveeconometricsevolutionary-algorithmgradient-based-optimisationhyperparameter-optimizationmarketing-mix-modelingmarketing-mix-modellingmarketing-sciencemmmridge-regression
1.3k stars 10.27 score 95 scripts
ropensci
qualtRics:Download 'Qualtrics' Survey Data
Provides functions to access survey results directly into R using the 'Qualtrics' API. 'Qualtrics' <https://www.qualtrics.com/about/> is an online survey and data collection software platform. See <https://api.qualtrics.com/> for more information about the 'Qualtrics' API. This package is community-maintained and is not officially supported by 'Qualtrics'.
Maintained by Julia Silge. Last updated 7 months ago.
apiqualtricsqualtrics-apisurveysurvey-data
221 stars 10.23 score 272 scripts
idigbio
ridigbio:Interface to the iDigBio Data API
An interface to iDigBio's search API that allows downloading specimen records. Searches are returned as a data.frame. Other functions such as the metadata end points return lists of information. iDigBio is a US project focused on digitizing and serving museum specimen collections on the web. See <https://www.idigbio.org> for information on iDigBio.
Maintained by Jesse Bennett. Last updated 20 days ago.
16 stars 10.23 score 63 scripts 7 dependents
cboettig
knitcitations:Citations for 'Knitr' Markdown Files
Provides the ability to create dynamic citations in which the bibliographic information is pulled from the web rather than having to be entered into a local database such as 'bibtex' ahead of time. The package is primarily aimed at authoring in the R 'markdown' format, and can provide outputs for web-based authoring such as linked text for inline citations. Cite using a 'DOI', URL, or 'bibtex' file key. See the package URL for details.
Maintained by Carl Boettiger. Last updated 4 years ago.
220 stars 10.14 score 836 scripts 2 dependents
dslc-io
tidytuesdayR:Access the Weekly 'TidyTuesday' Project Dataset
'TidyTuesday' is a project by the 'Data Science Learning Community' in which they post a weekly dataset in a public data repository (<https://github.com/rfordatascience/tidytuesday>) for people to analyze and visualize. This package provides the tools to easily download this data and the description of the source.
Maintained by Jon Harmon. Last updated 6 days ago.
77 stars 10.13 score 3.0k scripts
nflverse
nflfastR:Functions to Efficiently Access NFL Play by Play Data
A set of functions to access National Football League play-by-play data from <https://www.nfl.com/>.
Maintained by Ben Baldwin. Last updated 15 hours ago.
american-footballfootball-datanflnflstatsnflversesports-analytics
443 stars 10.11 score 596 scripts 3 dependents
bleutner
RStoolbox:Remote Sensing Data Analysis
Toolbox for remote sensing image processing and analysis such as calculating spectral indexes, principal component transformation, unsupervised and supervised classification or fractional cover analyses.
Maintained by Konstantin Mueller. Last updated 2 months ago.
ggplot2land-cover-mappingremote-sensingspectral-unmixingsupervised-classificationunsupervised-classificationopenblascpp
275 stars 10.10 score 1.1k scripts
ropensci
spocc:Interface to Species Occurrence Data Sources
A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility ('GBIF'), 'iNaturalist', 'eBird', Integrated Digitized 'Biocollections' ('iDigBio'), 'VertNet', Ocean 'Biogeographic' Information System ('OBIS'), and Atlas of Living Australia ('ALA'). Includes functionality for retrieving species occurrence data, and combining those data.
Maintained by Hannah Owens. Last updated 2 months ago.
specimensapiweb-servicesoccurrencesspeciestaxonomygbifinatvertnetebirdidigbioobisalaantwebbisondataecoengineinaturalistoccurrencespecies-occurrencespocc
118 stars 10.09 score 552 scripts 5 dependents
gshs-ornl
wbstats:Programmatic Access to Data and Statistics from the World Bank API
Search and download data from the World Bank Data API.
Maintained by Jesse Piburn. Last updated 4 years ago.
open-dataworld-bankworld-bank-apiworldbank
126 stars 10.07 score 1.1k scripts 3 dependents
pecanproject
PEcAn.settings:PEcAn Settings package
Contains functions to read PEcAn settings files.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 10.02 score 54 scripts 17 dependents
ropensci
nasapower:NASA POWER API Client
An API client for the NASA POWER global meteorology, surface solar energy and climatology data API. POWER (Prediction Of Worldwide Energy Resources) data are freely available for download with varying spatial resolutions dependent on the original data and with several temporal resolutions depending on the POWER parameter and community. This work is funded through the NASA Earth Science Directorate Applied Science Program. For more on the data themselves, the methodologies used in creating them, a web-based data viewer and web access, please see <https://power.larc.nasa.gov/>.
Maintained by Adam H. Sparks. Last updated 26 days ago.
nasameteorological-dataweatherglobalweather-datameteorologynasa-poweragroclimatologyearth-sciencedata-accessclimate-dataagroclimatology-dataweather-variables
101 stars 9.98 score 137 scripts 3 dependents
pecanproject
PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Istem Fer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.96 score 20 scripts 2 dependents
pecanproject
PEcAn.priors:PEcAn Functions Used to Estimate Priors from Data
Functions to estimate priors from data.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.95 score 13 scripts 6 dependents
tlverse
sl3:Pipelines for Machine Learning and Super Learning
A modern implementation of the Super Learner prediction algorithm, coupled with a general purpose framework for composing arbitrary pipelines for machine learning tasks.
Maintained by Jeremy Coyle. Last updated 5 months ago.
data-scienceensemble-learningensemble-modelmachine-learningmodel-selectionregressionstackingstatistics
100 stars 9.94 score 748 scripts 7 dependents
darwin-eu
CodelistGenerator:Identify Relevant Clinical Codes and Evaluate Their Use
Generate a candidate code list for the Observational Medical Outcomes Partnership (OMOP) common data model based on string matching. For a given search strategy, a candidate code list will be returned.
Maintained by Edward Burn. Last updated 5 days ago.
14 stars 9.94 score 165 scripts 4 dependents
laresbernardo
lares:Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 1 month ago.
analyticsapiautomationautomldata-sciencedescriptive-statisticsh2omachine-learningmarketingmmmpredictive-modelingpuzzlerlanguagerobynvisualization
233 stars 9.92 score 185 scripts 1 dependent
pecanproject
PEcAn.MA:PEcAn Functions Used for Meta-Analysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.MA package contains the functions used in the Bayesian meta-analysis of trait data.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.91 score 7 scripts 7 dependents
bioc
OmnipathR:OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Maintained by Denes Turei. Last updated 1 month ago.
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteinsquarto
130 stars 9.90 score 226 scripts 2 dependents
jaseziv
worldfootballR:Extract and Clean World Football (Soccer) Data
Allows users to obtain clean and tidy football (soccer) game, team and player data. Data is collected from a number of popular sites, including 'FBref', transfer and valuations data from 'Transfermarkt' <https://www.transfermarkt.com/> and shooting location and other match stats data from 'Understat' <https://understat.com/>. It gives users the ability to access data more efficiently, rather than having to export data tables to files before being able to complete their analysis.
Maintained by Jason Zivkovic. Last updated 1 day ago.
fbreffootballfootball-datasoccer-datasports-datatransfermarktunderstat
509 stars 9.88 score 516 scripts 2 dependents
rstudio
distill:'R Markdown' Format for Scientific and Technical Writing
Scientific and technical article format for the web. 'Distill' articles feature attractive, reader-friendly typography, flexible layout options for visualizations, and full support for footnotes and citations.
Maintained by Christophe Dervieux. Last updated 1 year ago.
423 stars 9.85 score 402 scripts 6 dependents
cardiomoon
moonBook:Functions and Datasets for the Book by Keon-Woong Moon
Several analysis-related functions for the book entitled "R statistics and graph for medical articles" (written in Korean), version 1, by Keon-Woong Moon, together with Korean demographic data and several plot functions.
Maintained by Keon-Woong Moon. Last updated 1 year ago.
38 stars 9.79 score 278 scripts 6 dependents
insightsengineering
teal.modules.general:General Modules for 'teal' Applications
Prebuilt 'shiny' modules containing tools for viewing data, visualizing data, understanding missing and outlier values within your data and performing simple data analysis. This extends the 'teal' framework that supports reproducible research and analysis.
Maintained by Dawid Kaledkowski. Last updated 1 month ago.
general-purposemodulesnestshiny
13 stars 9.74 score 71 scripts
epiforecasts
socialmixr:Social Mixing Matrices for Infectious Disease Modelling
Provides methods for sampling contact matrices from diary data for use in infectious disease modelling, as discussed in Mossong et al. (2008) <doi:10.1371/journal.pmed.0050074>.
Maintained by Sebastian Funk. Last updated 6 months ago.
38 stars 9.74 score 227 scripts 1 dependent
prestodb
RPresto:DBI Connector to Presto
Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: <https://prestodb.io/>.
Maintained by Jarod G.R. Meng. Last updated 2 months ago.
132 stars 9.73 score 25 scripts 4 dependents
pecanproject
PEcAnRTM:PEcAn Functions Used for Radiative Transfer Modeling
Functions for performing forward runs and inversions of radiative transfer models (RTMs). Inversions can be performed using maximum likelihood, or more complex hierarchical Bayesian methods. Underlying numerical analyses are optimized for speed using Fortran code.
Maintained by Alexey Shiklomanov. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsfortranjagscpp
216 stars 9.70 score 132 scripts
appsilon
shiny.telemetry:'Shiny' App Usage Telemetry
Enables instrumentation of 'Shiny' apps for tracking user session events such as input changes, browser type, and session duration. These events can be sent to any of the available storage backends and analyzed using the included 'Shiny' app to gain insights about app usage and adoption.
Maintained by André Veríssimo. Last updated 4 months ago.
67 stars 9.69 score 29 scripts
bblonder
hypervolume:High Dimensional Geometry, Set Operations, Projection, and Inference Using Kernel Density Estimation, Support Vector Machines, and Convex Hulls
Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses a stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.
Maintained by Benjamin Blonder. Last updated 2 months ago.
23 stars 9.69 score 211 scripts 7 dependents
fmmattioni
downloadthis:Implement Download Buttons in 'rmarkdown'
Implement download buttons in HTML output from 'rmarkdown' without the need for 'runtime:shiny'.
Maintained by Felipe Mattioni Maturana. Last updated 6 months ago.
146 stars 9.63 score 856 scripts 1 dependent
ropensci
tidyhydat:Extract and Tidy Canadian 'Hydrometric' Data
Provides functions to access historical and real-time national 'hydrometric' data from Water Survey of Canada data sources (<https://dd.weather.gc.ca/hydrometric/csv/> and <https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>) and then applies tidy data principles.
Maintained by Sam Albers. Last updated 20 days ago.
citzgovernment-datahydrologyhydrometricstidy-datawater-resources
71 stars 9.59 score 202 scripts 3 dependents
business-science
anomalize:Tidy Anomaly Detection
The 'anomalize' package enables a "tidy" workflow for detecting anomalies in data. The main functions are time_decompose(), anomalize(), and time_recompose(). When combined, it's quite simple to decompose time series, detect anomalies, and create bands separating the "normal" data from the anomalous data at scale (i.e. for multiple time series). Time series decomposition is used to remove trend and seasonal components via the time_decompose() function and methods include seasonal decomposition of time series by Loess ("stl") and seasonal decomposition by piecewise medians ("twitter"). The anomalize() function implements two methods for anomaly detection of residuals: the interquartile range ("iqr") and generalized extreme studentized deviation ("gesd"). These methods are based on those used in the 'forecast' package and the Twitter 'AnomalyDetection' package. Refer to the associated functions for specific references for these methods.
Maintained by Matt Dancho. Last updated 1 year ago.
anomalyanomaly-detectiondecompositiondetect-anomaliesiqrtime-series
339 stars 9.56 score 332 scripts
ndphillips
FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees
Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.
Maintained by Hansjoerg Neth. Last updated 6 months ago.
136 stars 9.53 score 144 scripts
e-sensing
sits:Satellite Image Time Series Analysis for Earth Observation Data Cubes
An end-to-end toolkit for land use and land cover classification using big Earth observation data, based on machine learning methods applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>. Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, Brazil Data Cube, Copernicus Data Space Environment (CDSE), Digital Earth Africa, Digital Earth Australia, NASA HLS using the Spatio-temporal Asset Catalog (STAC) protocol (<https://stacspec.org/>) and the 'gdalcubes' R package developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>. Supports visualization methods for images and time series and smoothing filters for dealing with noisy time series. Includes functions for quality assessment of training samples using self-organized maps as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. Includes methods to reduce training samples imbalance proposed by Chawla et al (2002) <doi:10.1613/jair.953>. Provides machine learning methods including support vector machines, random forests, extreme gradient boosting, multi-layer perceptrons, temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, and temporal attention encoders by Garnot and Landrieu (2020) <doi:10.48550/arXiv.2007.00586>. Supports GPU processing of deep learning models using torch <https://torch.mlverse.org/>. Performs efficient classification of big Earth observation data cubes and includes functions for post-classification smoothing based on Bayesian inference as described by Camara et al (2024) <doi:10.3390/rs16234572>, and methods for active learning and uncertainty assessment. Supports region-based time series analysis using package supercells <https://jakubnowosad.com/supercells/>. Enables best practices for estimating area and assessing accuracy of land change as recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>. 
Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.
Maintained by Gilberto Camara. Last updated 2 months ago.
big-earth-datacbersearth-observationeo-datacubesgeospatialimage-time-seriesland-cover-classificationlandsatplanetary-computerr-spatialremote-sensingrspatialsatellite-image-time-seriessatellite-imagerysentinel-2stac-apistac-catalogcpp
494 stars 9.50 score 384 scripts
datastorm-open
suncalc:Compute Sun Position, Sunlight Phases, Moon Position and Lunar Phase
Get sun position, sunlight phases (times for sunrise, sunset, dusk, etc.), moon position and lunar phase for the given location and time. Most calculations are based on the formulas given in Astronomy Answers articles about the position of the sun and the planets: <https://www.aa.quae.nl/en/reken/zonpositie.html>.
Maintained by Benoit Thieurmel. Last updated 1 year ago.
44 stars 9.43 score 372 scripts 16 dependents
sizespectrum
mizer:Dynamic Multi-Species Size Spectrum Modelling
A set of classes and methods to set up and run multi-species, trait based and community size spectrum ecological models, focused on the marine environment.
Maintained by Gustav Delius. Last updated 2 months ago.
ecosystem-modelfish-population-dynamicsfisheriesfisheries-managementmarine-ecosystempopulation-dynamicssimulationsize-structurespecies-interactionstransport-equationcpp
39 stars 9.41 score 207 scripts
ropensci
rnoaa:'NOAA' Weather Data from R
Client for many 'NOAA' data sources including the 'NCDC' climate 'API' at <https://www.ncdc.noaa.gov/cdo-web/webservices/v2>, with functions for each of the 'API' 'endpoints': data, data categories, data sets, data types, locations, location categories, and stations. In addition, we have an interface for 'NOAA' sea ice data, the 'NOAA' severe weather inventory, 'NOAA' Historical Observing 'Metadata' Repository ('HOMR') data, 'NOAA' storm data via 'IBTrACS', tornado data via the 'NOAA' storm prediction center, and more.
Maintained by Daniel Hocking. Last updated 2 months ago.
earthscienceclimateprecipitationtemperaturestormbuoyncdcnoaatornadoesea iceisdnoaa-data
334 stars 9.39 score 788 scripts 4 dependents
usepa
tcpl:ToxCast Data Analysis Pipeline
The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.
Maintained by Jason Brown. Last updated 12 days ago.
36 stars 9.39 score 90 scripts
pecanproject
PEcAn.data.land:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Mike Dietze. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.34 score 19 scripts 10 dependents
rte-antares-rpackage
antaresRead:Import, Manipulate and Explore the Results of an 'Antares' Simulation
Import, manipulate and explore results generated by 'Antares', a powerful open source software developed by RTE (Réseau de Transport d'Électricité) to simulate and study electric power systems (more information about 'Antares' here: <https://antares-simulator.org/>).
Maintained by Tatiana Vargas. Last updated 6 days ago.
infrastructuredataimportadequacybilanelectricityenergyhdf5linear-algebramonte-carlo-simulationoptimisationprevisionnelrhdf5rtesimulationtyndp
13 stars 9.32 score 148 scripts 3 dependents
microsoft
finnts:Microsoft Finance Time Series Forecasting Framework
Automated time series forecasting developed by Microsoft Finance. The Microsoft Finance Time Series Forecasting Framework, aka Finn, can be used to forecast any component of the income statement, balance sheet, or any other area of interest to finance. Finn can be used to forecast any numerical quantity over time. While it can be applied outside of the finance domain, Finn was built to meet the needs of financial analysts to better forecast their businesses within a company, and has a lot of built-in features that are specific to the needs of financial forecasters. Happy forecasting!
Maintained by Mike Tokic. Last updated 1 month ago.
businessdata-sciencefeature-selectionfinancefinntsforecastingmachine-learningmicrosofttime-series
194 stars 9.30 score 39 scripts
sbegueria
SPEI:Calculation of the Standardized Precipitation-Evapotranspiration Index
A set of functions for computing potential evapotranspiration and several widely used drought indices including the Standardized Precipitation-Evapotranspiration Index (SPEI).
Maintained by Santiago Beguería. Last updated 2 years ago.
82 stars 9.27 score 314 scripts 20 dependents
stevenmmortimer
salesforcer:An Implementation of 'Salesforce' APIs Using Tidy Principles
Functions connecting to the 'Salesforce' Platform APIs (REST, SOAP, Bulk 1.0, Bulk 2.0, Metadata, Reports and Dashboards) <https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_overview>. "API" is an acronym for "application programming interface". Almost all calls from these APIs are supported as they use CSV, XML or JSON data that can be parsed into R data structures. For more information, documentation, and examples, please see the 'Salesforce' API documentation and this package's website <https://stevenmmortimer.github.io/salesforcer/>.
Maintained by Steven M. Mortimer. Last updated 5 months ago.
api-wrappersr-languager-programmingsalesforcesalesforce-apis
82 stars 9.27 score 191 scripts
business-science
sweep:Tidy Tools for Forecasting
Tidies up the forecasting modeling and prediction workflow, extends the 'broom' package with 'sw_tidy', 'sw_glance', 'sw_augment', and 'sw_tidy_decomp' functions for various forecasting models, and enables converting 'forecast' objects to "tidy" data frames with 'sw_sweep'.
Maintained by Matt Dancho. Last updated 1 year ago.
broomforecastforecasting-modelspredictiontidytidyversetimetime-seriestimeseries
155 stars 9.23 score 399 scripts 1 dependent
bioc
rWikiPathways:rWikiPathways - R client library for the WikiPathways API
Use this package to interface with the WikiPathways API. It provides programmatic access to WikiPathways content in multiple data and image formats, including official monthly release files and convenient GMT read/write functions.
Maintained by Egon Willighagen. Last updated 5 months ago.
visualizationgraphandnetworkthirdpartyclientnetworkmetabolomicsbioinformaticsdata-accesspathways
15 stars 9.23 score 131 scripts 3 dependents
aphalo
photobiology:Photobiological Calculations
Definitions of classes, methods, operators and functions for use in photobiology and radiation meteorology and climatology. Calculation of effective (weighted) and not-weighted irradiances/doses, fluence rates, transmittance, reflectance, absorptance, absorbance and diverse ratios and other derived quantities from spectral data. Local maxima and minima: peaks, valleys and spikes. Conversion between energy- and photon-based units. Wavelength interpolation. Astronomical calculations related to solar angles and day length. Colours and vision. This package is part of the 'r4photobiology' suite, Aphalo, P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.
Maintained by Pedro J. Aphalo. Last updated 2 days ago.
lightphotobiologyquantificationr4photobiology-suiteradiationspectrasun-position
4 stars 9.20 score 604 scripts 12 dependents
whipson
maestro:Orchestration of Data Pipelines
Framework for creating and orchestrating data pipelines. Organize, orchestrate, and monitor multiple pipelines in a single project. Use tags to decorate functions with scheduling parameters and configuration.
Maintained by Will Hipson. Last updated 7 days ago.
119 stars 9.20 score 150 scripts
tidymodels
embed:Extra Recipes for Encoding Predictors
Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.
Maintained by Emil Hvitfeldt. Last updated 2 months ago.
142 stars 9.18 score 1.1k scripts
bodkan
slendr:A Simulation Framework for Spatiotemporal Population Genetics
A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can be therefore constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.
Maintained by Martin Petr. Last updated 3 days ago.
popgenpopulation-geneticssimulationsspatial-statistics
56 stars 9.13 score 88 scripts
pecanproject
PEcAn.allometry:PEcAn Allometry Functions
Synthesize allometric equations or fit allometries to data.
Maintained by Mike Dietze. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 9.12 score 34 scripts
malaria-atlas-project
malariaAtlas:An R Interface to Open-Access Malaria Data, Hosted by the 'Malaria Atlas Project'
A suite of tools to allow you to download all publicly available parasite rate survey points, mosquito occurrence points and raster surfaces from the 'Malaria Atlas Project' <https://malariaatlas.org/> servers as well as utility functions for plotting the downloaded data.
Maintained by Mauricio van den Berg. Last updated 9 months ago.
44 stars 9.10 score 118 scripts 3 dependents
huizezhang-sherry
cubble:A Vector Spatio-Temporal Data Structure for Data Analysis
A spatiotemporal data object in a relational data structure to separate the recording of time variant/invariant variables. See the Journal of Statistical Software reference: <doi:10.18637/jss.v110.i07>.
Maintained by H. Sherry Zhang. Last updated 6 months ago.
57 stars 9.07 score 83 scripts
pecanproject
PEcAn.qaqc:QAQC
PEcAn integration and model skill testing
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 9.06 score 5 scripts
bioc
BatchQC:Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Maintained by Jessica Anderson. Last updated 13 days ago.
batcheffectgraphandnetworkmicroarraynormalizationprincipalcomponentsequencingsoftwarevisualizationqualitycontrolrnaseqpreprocessingdifferentialexpressionimmunooncology
7 stars 9.06 score 54 scripts
bupaverse
bupaR:Business Process Analysis in R
Comprehensive Business Process Analysis toolkit. Creates S3-class for event log objects, and related handler functions. Imports related packages for filtering event data, computation of descriptive statistics, handling of 'Petri Net' objects and visualization of process maps. See also packages 'edeaR', 'processmapR', 'eventdataR' and 'processmonitR'.
Maintained by Gert Janssenswillen. Last updated 2 years ago.
57 stars 9.06 score 389 scripts 11 dependentsmlverse
tabnet:Fit 'TabNet' Models for Classification and Regression
Implements the 'TabNet' model by Sercan O. Arik et al. (2019) <doi:10.48550/arXiv.1908.07442> with 'Coherent Hierarchical Multi-label Classification Networks' by Giunchiglia et al. <doi:10.48550/arXiv.2010.10151> and provides a consistent interface for fitting and creating predictions. It's also fully compatible with the 'tidymodels' ecosystem.
Maintained by Christophe Regouby. Last updated 1 day ago.
109 stars 9.05 score 65 scriptsropensci
ijtiff:Comprehensive TIFF I/O with Full Support for 'ImageJ' TIFF Files
General purpose TIFF file I/O for R users. Currently the only such package with read and write support for TIFF files with floating point (real-numbered) pixels, and the only package that can correctly import TIFF files that were saved from 'ImageJ' and write TIFF files that can be correctly read by 'ImageJ' <https://imagej.net/ij/>. Also supports text image I/O.
Maintained by Rory Nolan. Last updated 9 days ago.
image-manipulationimagejpeer-reviewedtiff-filestiff-imagestiff
18 stars 9.03 score 36 scripts 7 dependentspecanproject
PEcAn.all:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.01 score 266 scriptsbupaverse
edeaR:Exploratory and Descriptive Event-Based Data Analysis
Exploratory and descriptive analysis of event based data. Provides methods for describing and selecting process data, and for preparing event log data for process mining. Builds on the S3-class for event logs implemented in the package 'bupaR'.
Maintained by Gert Janssenswillen. Last updated 4 months ago.
12 stars 9.01 score 149 scripts 8 dependentsirinagain
iglu:Interpreting Glucose Data from Continuous Glucose Monitors
Implements a wide range of metrics for measuring glucose control and glucose variability based on continuous glucose monitoring data. The list of implemented metrics is summarized in Rodbard (2009) <doi:10.1089/dia.2009.0015>. Additional visualization tools include time-series plots, lasagna plots and ambulatory glucose profile report.
Maintained by Irina Gaynanova. Last updated 26 days ago.
26 stars 9.00 score 39 scriptsramikrispin
TSstudio:Functions for Time Series Analysis and Forecasting
Provides a set of tools for descriptive and predictive analysis of time series data. This includes functions for interactive visualization of time series objects as well as utility functions for automating time series forecasting.
Maintained by Rami Krispin. Last updated 2 years ago.
forecastingtime-seriestimeseriestsstudiovisualization
424 stars 9.00 score 656 scriptsbillpetti
baseballr:Acquiring and Analyzing Baseball Data
Provides numerous utilities for acquiring and analyzing baseball data from online sources such as 'Baseball Reference' <https://www.baseball-reference.com/>, 'FanGraphs' <https://www.fangraphs.com/>, and the 'MLB Stats' API <https://www.mlb.com/>.
Maintained by Saiem Gilani. Last updated 5 months ago.
baseballpitchfxsabermetricsstatcast
380 stars 8.98 score 582 scriptspecanproject
PEcAn.MAAT:PEcAn Package for Integration of the MAAT Model
This module provides functions to wrap the MAAT model into the PEcAn workflows.
Maintained by Shawn Serbin. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 8.96 score 12 scriptspecanproject
PEcAn.BIOCRO:PEcAn Package for Integration of the BioCro Model
This module provides functions to link BioCro to PEcAn.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.95 score 23 scriptscjbarrie
academictwitteR:Access the Twitter Academic Research Product Track V2 API Endpoint
Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side.
Maintained by Christopher Barrie. Last updated 2 years ago.
275 stars 8.94 score 177 scriptspecanproject
PEcAn.uncertainty:PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.94 score 15 scripts 5 dependentspik-piam
remind2:The REMIND R package (2nd generation)
Contains the REMIND-specific routines for data and model output manipulation.
Maintained by Renato Rodrigues. Last updated 15 hours ago.
8.89 score 161 scripts 5 dependentspedrohcgs
DRDID:Doubly Robust Difference-in-Differences Estimators
Implements the locally efficient doubly robust difference-in-differences (DiD) estimators for the average treatment effect proposed by Sant'Anna and Zhao (2020) <doi:10.1016/j.jeconom.2020.06.003>. The estimator combines inverse probability weighting and outcome regression estimators (also implemented in the package) to form estimators with more attractive statistical properties. Two different estimation methods can be used to estimate the nuisance functions.
Maintained by Pedro H. C. Sant'Anna. Last updated 6 months ago.
92 stars 8.88 score 133 scripts 5 dependentsusdaforestservice
FIESTA:Forest Inventory Estimation and Analysis
A research estimation tool for analysts that work with sample-based inventory data from the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program.
Maintained by Grayson White. Last updated 5 days ago.
30 stars 8.84 score 62 scriptspecanproject
PEcAn.workflow:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides workhorse functions that can be used to run the major steps of a PEcAn analysis.
Maintained by David LeBauer. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.84 score 15 scripts 4 dependentsevolecolgroup
tidysdm:Species Distribution Models with Tidymodels
Fit species distribution models (SDMs) using the 'tidymodels' framework, which provides a standardised interface to define models and process their outputs. 'tidysdm' expands 'tidymodels' by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2023) <doi:10.1101/2023.07.24.550358>.
Maintained by Andrea Manica. Last updated 25 days ago.
species-distribution-modellingtidymodels
31 stars 8.82 score 51 scriptspbiecek
archivist:Tools for Storing, Restoring and Searching for R Objects
Data exploration and modelling is a process in which a lot of data artifacts are produced, such as subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work with, the more artifacts are produced and the harder it is to manage them. Archivist helps to store and manage artifacts created in R. Archivist allows you to store selected artifacts as binary files together with their metadata and relations. Archivist allows you to share artifacts with others, either through a shared folder or GitHub. Archivist allows you to look for already created artifacts by using their class, name, date of creation or other properties, and makes it easy to restore such artifacts. Archivist also allows you to check whether a new artifact is an exact copy of one that was produced some time ago, which might be useful for either testing or caching.
Maintained by Przemyslaw Biecek. Last updated 8 months ago.
74 stars 8.81 score 105 scripts 2 dependentsrjournal
rjtools:Preparing, Checking, and Submitting Articles to the 'R Journal'
Create an 'R Journal' 'Rmarkdown' template article, that will generate html and pdf versions of your paper. Check that the paper folder has all the required components needed for submission. Examples of 'R Journal' publications can be found at <https://journal.r-project.org>.
Maintained by Di Cook. Last updated 2 months ago.
33 stars 8.81 score 37 scripts 1 dependentsijlyttle
bsplus:Adds Functionality to the R Markdown + Shiny Bootstrap Framework
The Bootstrap framework lets you add some JavaScript functionality to your web site by adding attributes to your HTML tags - Bootstrap takes care of the JavaScript <https://getbootstrap.com/docs/3.3/javascript/>. If you are using R Markdown or Shiny, you can use these functions to create collapsible sections, accordion panels, modals, tooltips, popovers, and an accordion sidebar framework (not described at Bootstrap site). Please note this package was designed for Bootstrap 3.3.
Maintained by Ian Lyttle. Last updated 2 years ago.
147 stars 8.80 score 295 scripts 15 dependentspecanproject
PEcAn.data.remote:PEcAn Functions Used for Extracting Remote Sensing Data
PEcAn module for processing remote data. Python module requirements: requests, json, re, ast, pandas, sys. If any of these modules are missing, install them using pip install <module name>.
Maintained by Bailey Morrison. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 8.76 score 6 scripts 5 dependentspecanproject
PEcAn.ED2:PEcAn Package for Integration of ED2 Model
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides functions to link the Ecosystem Demography Model, version 2, to PEcAn.
Maintained by Mike Dietze. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.74 score 145 scriptsadokter
bioRad:Biological Analysis and Visualization of Weather Radar Data
Extract, visualize and summarize aerial movements of birds and insects from weather radar data. See Dokter, A. M. et al. (2018) "bioRad: biological analysis and visualization of weather radar data" <doi:10.1111/ecog.04028> for a software paper describing package and methodologies.
Maintained by Adriaan M. Dokter. Last updated 4 hours ago.
aeroecologyenrameumetnet-operalifewatchmovement-ecologynexradoscibioradarweather-radarwsr-88d
29 stars 8.73 score 56 scriptsbioc
CellBench:Construct Benchmarks for Single Cell Analysis Methods
This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.
Maintained by Shian Su. Last updated 5 months ago.
softwareinfrastructuresinglecellbenchmarkbioinformatics
31 stars 8.73 score 98 scriptsropensci
comtradr:Interface with the United Nations Comtrade API
Interface with and extract data from the United Nations 'Comtrade' API <https://comtradeplus.un.org/>. 'Comtrade' provides country-level shipping data for a variety of commodities. These functions allow for easy API queries and return data as a tidy data frame.
Maintained by Paul Bochtler. Last updated 5 months ago.
apicomtradepeer-reviewedsupply-chain
66 stars 8.67 score 70 scriptspharmaverse
admiralonco:Oncology Extension Package for ADaM in 'R' Asset Library
Programming oncology specific Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in 'R'. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team (2021), <https://www.cdisc.org/standards/foundational/adam>). The package is an extension package of the 'admiral' package.
Maintained by Stefan Bundfuss. Last updated 2 months ago.
32 stars 8.66 score 30 scriptsrte-antares-rpackage
antaresEditObject:Edit an 'Antares' Simulation
Edit an 'Antares' simulation before running it: create new areas, links, thermal clusters or binding constraints, or edit existing ones. Update 'Antares' general and optimization settings. 'Antares' is an open source power system generator; more information is available at <https://antares-simulator.org/>.
Maintained by Tatiana Vargas. Last updated 1 month ago.
antares-simulationclusterenergymonte-carlo-simulationrte
8 stars 8.66 score 101 scriptsopen-eo
openeo:Client Interface for 'openEO' Servers
Access data and processing functionalities of 'openEO' compliant back-ends in R.
Maintained by Florian Lahn. Last updated 2 months ago.
65 stars 8.65 score 128 scriptsjniedballa
camtrapR:Camera Trap Data Management and Preparation of Occupancy and Spatial Capture-Recapture Analyses
Management of and data extraction from camera trap data in wildlife studies. The package provides a workflow for storing and sorting camera trap photos (and videos), tabulates records of species and individuals, and creates detection/non-detection matrices for occupancy and spatial capture-recapture analyses with great flexibility. In addition, it can visualise species activity data and provides simple mapping functions with GIS export.
Maintained by Juergen Niedballa. Last updated 4 months ago.
occupancy-modelingspatial-capture-recapturewildlife
35 stars 8.65 score 178 scriptsaloy
HLMdiag:Diagnostic Tools for Hierarchical (Multilevel) Linear Models
A suite of diagnostic tools for hierarchical (multilevel) linear models. The tools include not only leverage and traditional deletion diagnostics (Cook's distance, covratio, covtrace, and MDFFITS) but also convenience functions and graphics for residual analysis. Models can be fit using either lmer in the 'lme4' package or lme in the 'nlme' package.
Maintained by Adam Loy. Last updated 4 years ago.
17 stars 8.63 score 170 scripts 7 dependentsprojectmosaic
mosaicCalc:R-Language Based Calculus Operations for Teaching
Software to support the introductory *MOSAIC Calculus* textbook (<https://www.mosaic-web.org/MOSAIC-Calculus/>), one of many data- and modeling-oriented educational resources developed by Project MOSAIC (<https://www.mosaic-web.org/>). Provides symbolic and numerical differentiation and integration, as well as support for applied linear algebra (for data science), and differential equations/dynamics. Includes grammar-of-graphics-based functions for drawing vector fields, trajectories, etc. The software is suitable for general use, but intended mainly for teaching calculus.
Maintained by Daniel Kaplan. Last updated 1 month ago.
13 stars 8.63 score 546 scriptscharlie86
spotifyr:R Wrapper for the 'Spotify' Web API
An R wrapper for pulling data from the 'Spotify' Web API <https://developer.spotify.com/documentation/web-api/> in bulk, or post items on a 'Spotify' user's playlist.
Maintained by Daniel Antal. Last updated 5 months ago.
music-information-retrievalspotify
375 stars 8.61 score 936 scriptsinsightsengineering
random.cdisc.data:Create Random ADaM Datasets
A set of functions to create random Analysis Data Model (ADaM) datasets and cached datasets. ADaM dataset specifications are described by the Clinical Data Interchange Standards Consortium (CDISC) Analysis Data Model Team.
Maintained by Joe Zhu. Last updated 6 months ago.
33 stars 8.60 score 52 scriptsrobjhyndman
fpp3:Data for "Forecasting: Principles and Practice" (3rd Edition)
All data sets required for the examples and exercises in the book "Forecasting: principles and practice" by Rob J Hyndman and George Athanasopoulos <https://OTexts.com/fpp3/>. All packages required to run the examples are also loaded. Additional data sets not used in the book are also included.
Maintained by Rob Hyndman. Last updated 7 months ago.
142 stars 8.54 score 2.5k scriptsjpquast
protti:Bottom-Up Proteomics and LiP-MS Quality Control and Data Analysis Tools
Useful functions and workflows for proteomics quality control and data analysis of both limited proteolysis-coupled mass spectrometry (LiP-MS) (Feng et al. (2014) <doi:10.1038/nbt.2999>) and regular bottom-up proteomics experiments. Data generated with search tools such as 'Spectronaut', 'MaxQuant' and 'Proteome Discoverer' can be easily used thanks to the flexibility of the functions.
Maintained by Jan-Philipp Quast. Last updated 5 months ago.
data-analysislip-msmass-spectrometryomicsproteinproteomicssystems-biology
63 stars 8.51 score 83 scriptsropensci
weatherOz:An API Client for Australian Weather and Climate Data Resources
Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). It also retrieves precis and coastal forecasts from the Bureau of Meteorology ('BOM') of the Australian government, and downloads and imports radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on Australian Bureau of Meteorology ('BOM') data and accessed through 'SILO' (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.
Maintained by Rodrigo Pires. Last updated 1 month ago.
dpirdbommeteorological-dataweather-forecastaustraliaweatherweather-datameteorologywestern-australiaaustralia-bureau-of-meteorologywestern-australia-agricultureaustralia-agricultureaustralia-climateaustralia-weatherapi-clientclimatedatarainfallweather-api
31 stars 8.47 score 40 scriptssamuel-marsh
scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing
Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to 1) provide customized visualizations that are easier to use and produce more aesthetic and functional visuals, and 2) improve the speed and reproducibility of common tasks and pieces of code in scRNA-seq analysis with a single function or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.
Maintained by Samuel Marsh. Last updated 3 months ago.
customizationggplot2scrna-seqseuratsingle-cellsingle-cell-genomicssingle-cell-rna-seqvisualization
246 stars 8.45 score 1.1k scriptsropensci
weathercan:Download Weather Data from Environment and Climate Change Canada
Provides means for downloading historical weather data from the Environment and Climate Change Canada website (<https://climate.weather.gc.ca/historical_data/search_historic_data_e.html>). Data can be downloaded from multiple stations and over large date ranges and automatically processed into a single dataset. Tools are also provided to identify stations either by name or proximity to a location.
Maintained by Steffi LaZerte. Last updated 7 days ago.
environment-canadapeer-reviewedweather-dataweather-downloader
106 stars 8.45 score 189 scriptstidymodels
tidyposterior:Bayesian Analysis to Compare Models using Resampling Statistics
Bayesian analysis is used here to answer the question: "when looking at resampling results, are the differences between models 'real'?" To answer this, a model can be created where the outcome is the resampled performance statistic (e.g. accuracy or RMSE). These values are explained by the model types. In doing this, we can get parameter estimates for each model's effect on performance and make statistical (and practical) comparisons between models. The methods included here are similar to Benavoli et al (2017) <https://jmlr.org/papers/v18/16-305.html>.
Maintained by Max Kuhn. Last updated 6 months ago.
102 stars 8.44 score 273 scriptstidyverts
tsibbledata:Diverse Datasets for 'tsibble'
Provides diverse datasets in the 'tsibble' data structure. These datasets are useful for learning and demonstrating how tidy temporal data can be tidied, visualised, and forecasted.
Maintained by Mitchell O'Hara-Wild. Last updated 5 months ago.
25 stars 8.44 score 740 scripts 2 dependentsdavidhodge931
ggblanket:Simplify 'ggplot2' Visualisation
Simplify 'ggplot2' visualisation with 'ggblanket' wrapper functions.
Maintained by David Hodge. Last updated 1 day ago.
data-visualisationdata-visualizationggplotggplot-extensionggplot2ggplot2-enhancementsvisualisationvisualization
173 stars 8.42 score 45 scriptsdavisvaughan
almanac:Tools for Working with Recurrence Rules
Provides tools for defining recurrence rules and recurrence sets. Recurrence rules are a programmatic way to define a recurring event, like the first Monday of December. Multiple recurrence rules can be combined into larger recurrence sets. A full holiday and calendar interface is also provided that can generate holidays within a particular year, can detect if a date is a holiday, can respect holiday observance rules, and allows for custom holidays.
Maintained by Davis Vaughan. Last updated 2 years ago.
calendarsholidaysrecurrence-rules
73 stars 8.40 score 65 scripts 1 dependentsatfutures
calendar:Create, Read, Write, and Work with 'iCalendar' Files, Calendars and Scheduling Data
Provides functions to create, read, write, and work with 'iCalendar' files (which typically have '.ics' or '.ical' extensions), and the scheduling data, calendars and timelines of people, organisations and other entities that they represent. 'iCalendar' is an open standard for exchanging calendar and scheduling information between users and computers, described at <https://icalendar.org/>.
Maintained by Robin Lovelace. Last updated 7 months ago.
42 stars 8.39 score 113 scripts 1 dependentspecanproject
PEcAn.SIPNET:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Mike Dietze. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.37 score 61 scriptstidymodels
finetune:Additional Functions for Model Tuning
The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <arXiv:1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.
Maintained by Max Kuhn. Last updated 8 months ago.
62 stars 8.36 score 704 scripts 1 dependentswallaceecomod
wallace:A Modular Platform for Reproducible Modeling of Species Niches and Distributions
The 'shiny' application Wallace is a modular platform for reproducible modeling of species niches and distributions. Wallace guides users through a complete analysis, from the acquisition of species occurrence and environmental data to visualizing model predictions on an interactive map, thus bundling complex workflows into a single, streamlined interface. An extensive vignette, which guides users through most package functionality, can be found on the package's GitHub Pages website: <https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html>.
Maintained by Mary E. Blair. Last updated 25 days ago.
133 stars 8.36 score 96 scriptspecanproject
PEcAn.LINKAGES:PEcAn Package for Integration of the LINKAGES Model
This module provides functions to link the LINKAGES model to PEcAn.
Maintained by Ann Raiho. Last updated 5 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.35 score 59 scriptssmouksassi
ggquickeda:Quickly Explore Your Data Using 'ggplot2' and 'table1' Summary Tables
Quickly and easily perform exploratory data analysis by uploading your data as a 'csv' file. Start generating insights using 'ggplot2' plots and 'table1' tables with descriptive stats, all using an easy-to-use point and click 'Shiny' interface.
Maintained by Samer Mouksassi. Last updated 16 days ago.
73 stars 8.34 score 27 scriptscefet-rj-dal
harbinger:A Unified Time Series Event Detection Framework
By analyzing time series, it is possible to observe significant changes in the behavior of observations that frequently characterize events. Events present themselves as anomalies, change points, or motifs. In the literature, there are several methods for detecting events. However, searching for a suitable time series method is a complex task, especially considering that the nature of events is often unknown. This work presents Harbinger, a framework for integrating and analyzing event detection methods. Harbinger contains several state-of-the-art methods described in Salles et al. (2020) <doi:10.5753/sbbd.2020.13626>.
Maintained by Eduardo Ogasawara. Last updated 4 months ago.
18 stars 8.32 score 216 scriptsbusiness-science
modeltime.ensemble:Ensemble Algorithms for Time Series Forecasting with Modeltime
A 'modeltime' extension that implements time series ensemble forecasting methods including model averaging, weighted averaging, and stacking. These techniques are popular methods to improve forecast accuracy and stability.
Maintained by Matt Dancho. Last updated 9 months ago.
ensembleensemble-learningforecastforecastingmodeltimestackingstacking-ensembletidymodelstimetime-seriestimeseries
77 stars 8.30 score 143 scriptsinsightsengineering
chevron:Standard TLGs for Clinical Trials Reporting
Provide standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. In addition, it also provides comprehensive data checks and script generation functionality.
Maintained by Joe Zhu. Last updated 1 month ago.
clinical-trialsgraphslistingsnestreportingtables
14 stars 8.30 score 12 scriptssurveydown-dev
surveydown:Markdown-Based Surveys Using 'Quarto' and 'shiny'
Generate surveys using markdown and R code chunks. Surveys are composed of two files: a survey.qmd 'Quarto' file defining the survey content (pages, questions, etc), and an app.R file defining a 'shiny' app with global settings (libraries, database configuration, etc.) and server configuration options (e.g., conditional skipping / display, etc.). Survey data collected from respondents is stored in a 'PostgreSQL' database. Features include controls for conditional skip logic (skip to a page based on an answer to a question), conditional display logic (display a question based on an answer to a question), a customizable progress bar, and a wide variety of question types, including multiple choice (single choice and multiple choices), select, text, numeric, multiple choice buttons, text area, and dates. Because the surveys render into a 'shiny' app, designers can also leverage the reactive capabilities of 'shiny' to create dynamic and interactive surveys.
Maintained by John Paul Helveston. Last updated 4 days ago.
markdownpostgrespostgresqlquartoshinyshiny-appsshiny-rsupabasesurveysurveys
97 stars 8.29 score 133 scriptspik-piam
quitte:Bits and pieces of code to use with quitte-style data frames
A collection of functions for easily dealing with quitte-style data frames, doing multi-model comparisons and plots.
Maintained by Falk Benke. Last updated 7 days ago.
8.26 score 184 scripts 35 dependentsradiant-rstats
radiant.data:Data Menu for Radiant: Business Analytics using R and Shiny
The Radiant Data menu includes interfaces for loading, saving, viewing, visualizing, summarizing, transforming, and combining data. It also contains functionality to generate reproducible reports of the analyses conducted in the application.
Maintained by Vincent Nijs. Last updated 5 months ago.
53 stars 8.25 score 146 scripts 6 dependentsopenvolley
datavolley:Reading and Analyzing DataVolley Scout Files
Provides functions for parsing and working with volleyball match files in DataVolley format.
Maintained by Ben Raymond. Last updated 2 months ago.
openvolleysports-analyticsvolleyball
31 stars 8.24 score 94 scripts 11 dependentsr-dbi
DBItest:Testing DBI Backends
A helper that tests DBI back ends for conformity to the interface.
Maintained by Kirill Müller. Last updated 15 days ago.
24 stars 8.21 score 11 scriptsrobjhyndman
demography:Forecasting Mortality, Fertility, Migration and Population Data
Functions for demographic analysis including lifetable calculations; Lee-Carter modelling; functional data analysis of mortality rates, fertility rates, net migration numbers; and stochastic population forecasting.
Maintained by Rob Hyndman. Last updated 4 months ago.
actuarial demography forecasting
74 stars 8.21 score 241 scripts 6 dependents
sticsrpacks
SticsRFiles:Read and Modify 'STICS' Input/Output Files
Manipulating input and output files of the 'STICS' crop model. Files are either 'JavaSTICS' XML files or text files used by the model 'fortran' executable. Basic functionality covers reading and writing parameter names and values in both XML and text input files, and getting data from output files. Advanced functionality includes generating XML files from XML templates and/or spreadsheets, and generating text files from XML files by using 'xslt' transformation.
Maintained by Patrice Lecharpentier. Last updated 1 month ago.
4 stars 8.21 score 124 scripts
darwin-eu
DrugUtilisation:Summarise Patient-Level Drug Utilisation in Data Mapped to the OMOP Common Data Model
Summarise patient-level drug utilisation cohorts using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. New users and prevalent users cohorts can be generated and their characteristics, indication and drug use summarised.
Maintained by Martí Català. Last updated 2 months ago.
8.20 score 156 scripts 2 dependents
davidsjoberg
hablar:Non-Astonishing Results in R
Simple tools for converting columns to new data types. Intuitive functions for columns with missing values.
Maintained by David Sjoberg. Last updated 2 years ago.
59 stars 8.20 score 468 scripts
nixtla
nixtlar:A Software Development Kit for 'Nixtla''s 'TimeGPT'
A Software Development Kit for working with 'Nixtla''s 'TimeGPT', a foundation model for time series forecasting. 'API' is an acronym for 'application programming interface'; this package allows users to interact with 'TimeGPT' via the 'API'. You can set and validate 'API' keys and generate forecasts via 'API' calls. It is compatible with 'tsibble' and base R. For more details visit <https://docs.nixtla.io/>.
Maintained by Mariana Menchero. Last updated 1 month ago.
32 stars 8.19 score 38 scripts
bioc
POMA:Tools for Omics Data Analysis
The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny. Paper: Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.
Maintained by Pol Castellano-Escuder. Last updated 4 months ago.
batcheffect classification clustering decisiontree dimensionreduction multidimensionalscaling normalization preprocessing principalcomponent regression rnaseq software statisticalmethod visualization bioconductor bioinformatics data-visualization dimension-reduction exploratory-data-analysis machine-learning omics-data-integration pipeline pre-processing statistical-analysis user-friendly workflow
11 stars 8.16 score 20 scripts 1 dependents
aphalo
ggspectra:Extensions to 'ggplot2' for Radiation Spectra
Additional annotations, stats, geoms and scales for plotting "light" spectra with 'ggplot2', together with specializations of ggplot() and autoplot() methods for spectral data and waveband definitions stored in objects of classes defined in package 'photobiology'. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.
Maintained by Pedro J. Aphalo. Last updated 2 days ago.
dataviz ggplot2-autoplot ggplot2-enhancementes ggplot2-geoms ggplot2-scales ggplot2-stats light r4photobiology-suite radiation spectra
5 stars 8.15 score 390 scripts 1 dependents
robjhyndman
cricketdata:International Cricket Data
Data on international and other major cricket matches from ESPNCricinfo <https://www.espncricinfo.com> and Cricsheet <https://cricsheet.org>. This package provides some functions to download the data into tibbles ready for analysis.
Maintained by Rob Hyndman. Last updated 7 days ago.
cricket cricket-data ozunconf17 unconf
88 stars 8.14 score 87 scripts
nceas
metajam:Easily Download Data and Metadata from 'DataONE'
A set of tools to foster the development of reproducible analytical workflow by simplifying the download of data and metadata from 'DataONE' (<https://www.dataone.org>) and easily importing this information into R.
Maintained by Julien Brun. Last updated 7 months ago.
data data-analysis metadata repositories
16 stars 8.13 score 75 scripts
pecanproject
PEcAnAssimSequential:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Mike Dietze. Last updated 5 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp
216 stars 8.13 score 35 scripts
evolecolgroup
pastclim:Manipulate Time Series of Climate Reconstructions
Methods to easily extract and manipulate climate reconstructions for ecological and anthropological analyses, as described in Leonardi et al. (2023) <doi:10.1111/ecog.06481>. The package includes datasets of palaeoclimate reconstructions, present observations, and future projections from multiple climate models.
Maintained by Andrea Manica. Last updated 19 days ago.
climate-data paleoclimate species-distribution-modelling
38 stars 8.12 score 49 scripts
pik-piam
mip:Comparison of multi-model runs
Package contains generic functions to produce comparison plots of multi-model runs.
Maintained by David Klein. Last updated 15 hours ago.
1 star 8.11 score 70 scripts 21 dependents
ramiromagno
gwasrapidd:'REST' 'API' Client for the 'NHGRI'-'EBI' 'GWAS' Catalog
'GWAS' R 'API' Data Download. This package provides easy access to the 'NHGRI'-'EBI' 'GWAS' Catalog data by accessing the 'REST' 'API' <https://www.ebi.ac.uk/gwas/rest/docs/api/>.
Maintained by Ramiro Magno. Last updated 1 year ago.
thirdpartyclient biomedicalinformatics genomewideassociation snp association-studies gwas-catalog human rest-client trait trait-ontology
95 stars 8.10 score 49 scripts 1 dependents
tychobra
polished:Authentication and Hosting for 'shiny' Apps
Authentication, user administration, hosting, and additional infrastructure for 'shiny' apps. See <https://polished.tech> for additional documentation and examples.
Maintained by Andy Merlino. Last updated 12 days ago.
233 stars 8.09 score 75 scripts
chop-cgtinformatics
REDCapTidieR:Extract 'REDCap' Databases into Tidy 'Tibble's
Convert 'REDCap' exports into tidy tables for easy handling of 'REDCap' repeat instruments and event arms.
Maintained by Richard Hanna. Last updated 11 days ago.
35 stars 8.08 score 36 scripts
luomus
finbif:Interface for the 'Finnish Biodiversity Information Facility' API
A programmatic interface to the 'Finnish Biodiversity Information Facility' ('FinBIF') API (<https://api.laji.fi>). 'FinBIF' aggregates Finnish biodiversity data from multiple sources in a single open access portal for researchers, citizen scientists, industry and government. 'FinBIF' allows users of biodiversity information to find, access, combine and visualise data on Finnish plants, animals and microorganisms. The 'finbif' package makes the publicly available data in 'FinBIF' easily accessible to programmers. Biodiversity information is available on taxonomy and taxon occurrence. Occurrence data can be filtered by taxon, time, location and other variables. The data accessed are conveniently preformatted for subsequent analyses.
Maintained by William K. Morris. Last updated 12 days ago.
api biodiversity biodiversity-informatics biodiversity-information finbif finbif-access occurrences r-programming species specimens taxon taxonomy web-services
5 stars 8.07 score 42 scripts 3 dependents
bayes-rules
bayesrules:Datasets and Supplemental Functions from Bayes Rules! Book
Provides datasets and functions used for analysis and visualizations in the Bayes Rules! book (<https://www.bayesrulesbook.com>). The package contains a set of functions that summarize and plot Bayesian models from some conjugate families and another set of functions for evaluation of some Bayesian models.
Maintained by Mine Dogucu. Last updated 3 years ago.
72 stars 8.06 score 466 scripts
cran
epiR:Tools for the Analysis of Epidemiological Data
Tools for the analysis of epidemiological and surveillance data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, computation of confidence intervals around incidence risk and incidence rate estimates and sample size calculations for cross-sectional, case-control and cohort studies. Surveillance tools include functions to calculate an appropriate sample size for 1- and 2-stage representative freedom surveys, functions to estimate surveillance system sensitivity and functions to support scenario tree modelling analyses.
Maintained by Mark Stevenson. Last updated 2 months ago.
10 stars 8.06 score 10 dependents
radiant-rstats
radiant:Business Analytics using R and Shiny
A platform-independent browser-based interface for business analytics in R, based on the shiny package. The application combines the functionality of 'radiant.data', 'radiant.design', 'radiant.basics', 'radiant.model', and 'radiant.multivariate'.
Maintained by Vincent Nijs. Last updated 11 months ago.
460 stars 8.02 score 228 scripts
mazamascience
MazamaSpatialUtils:Spatial Data Download and Utility Functions
A suite of conversion functions to create internally standardized spatial polygons data frames. Utility functions use these data sets to return values such as country, state, time zone, watershed, etc. associated with a set of longitude/latitude pairs. (They also make cool maps.)
Maintained by Jonathan Callahan. Last updated 5 months ago.
5 stars 8.01 score 282 scripts 2 dependents
scasanova
f1dataR:Access Formula 1 Data
Obtain Formula 1 data via the 'Jolpica API' <https://jolpi.ca> and the unofficial API <https://www.formula1.com/en/timing/f1-live> via the 'fastf1' 'Python' library <https://docs.fastf1.dev/>.
Maintained by Santiago Casanova. Last updated 5 days ago.
60 stars 7.97 score 26 scripts
ropensci
osmplotr:Bespoke Images of 'OpenStreetMap' Data
Bespoke images of 'OpenStreetMap' ('OSM') data and data visualisation using 'OSM' objects.
Maintained by Mark Padgham. Last updated 1 month ago.
data-visualisation highlighting-clusters openstreetmap osm overpass overpass-api peer-reviewed
139 stars 7.97 score 80 scripts
brian-j-smith
MachineShop:Machine Learning Models and Tools
Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
Maintained by Brian J Smith. Last updated 7 months ago.
classification-models machine-learning predictive-modeling regression-models survival-models
62 stars 7.95 score 121 scripts
pharmaverse
admiralophtha:ADaM in R Asset Library - Ophthalmology
Aids the programming of Clinical Data Standards Interchange Consortium (CDISC) compliant Ophthalmology Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam/adamig-v1-3-release-package>).
Maintained by Edoardo Mancini. Last updated 3 months ago.
15 stars 7.94 score 10 scripts
bcallaway11
BMisc:Miscellaneous Functions for Panel Data, Quantiles, and Printing Results
These are miscellaneous functions for working with panel data, quantiles, and printing results. For panel data, the package includes functions for balancing a panel (that is, dropping individuals that have missing observations in any time period), converting id numbers to row numbers, and treating repeated cross sections as panel data under the assumption of rank invariance. For quantiles, there are functions to make distribution functions from a set of data points (this is particularly useful when a distribution function is created in several steps), to combine distribution functions based on some external weights, and to invert distribution functions. Finally, there are several other miscellaneous functions for obtaining weighted means, weighted distribution functions, and weighted quantiles; for generating summary statistics and their differences for two groups; and for adding or dropping covariates from formulas.
Maintained by Brantly Callaway. Last updated 2 months ago.
7 stars 7.92 score 110 scripts 8 dependents
ropenspain
spanishoddata:Get Spanish Origin-Destination Data
Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the 'duckdb' database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.
Maintained by Egor Kotov. Last updated 11 days ago.
cdr data data-package mobile-telephone-data mobility origin-destination
35 stars 7.92 score 14 scripts
tlverse
tmle3:The Extensible TMLE Framework
A general framework supporting the implementation of targeted maximum likelihood estimators (TMLEs) of a diverse range of statistical target parameters through a unified interface. The goal is that the exposed framework be as general as the mathematical framework upon which it draws.
Maintained by Jeremy Coyle. Last updated 5 months ago.
causal-inference machine-learning targeted-learning variable-importance
38 stars 7.91 score 286 scripts 5 dependents
ohdsi
CohortGenerator:Cohort Generation for the OMOP Common Data Model
Generate cohorts and subsets using an Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) Database. Cohorts are defined using 'CIRCE' (<https://github.com/ohdsi/circe-be>) or SQL compatible with 'SqlRender' (<https://github.com/OHDSI/SqlRender>).
Maintained by Anthony Sena. Last updated 6 months ago.
13 stars 7.91 score 165 scripts
pik-piam
magpie4:MAgPIE outputs R package for MAgPIE version 4.x
Common output routines for extracting results from the MAgPIE framework (versions 4.x).
Maintained by Benjamin Leon Bodirsky. Last updated 20 hours ago.
2 stars 7.90 score 254 scripts 9 dependents
myles-lewis
nestedcv:Nested Cross-Validation with 'glmnet' and 'caret'
Implements nested k*l-fold cross-validation for lasso and elastic-net regularised linear models via the 'glmnet' package and other machine learning models via the 'caret' package <doi:10.1093/bioadv/vbad048>. Cross-validation of 'glmnet' alpha mixing parameter and embedded fast filter functions for feature selection are provided. Described as double cross-validation by Stone (1977) <doi:10.1111/j.2517-6161.1977.tb01603.x>. Also implemented is a method using outer CV to measure unbiased model performance metrics when fitting Bayesian linear and logistic regression shrinkage models using the horseshoe prior over parameters to encourage a sparse model as described by Piironen & Vehtari (2017) <doi:10.1214/17-EJS1337SI>.
Maintained by Myles Lewis. Last updated 12 days ago.
12 stars 7.90 score 46 scripts