R-universe search: needs:lubridate

tidyverse

tidyverse:Easily Install and Load the 'Tidyverse'

The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' at <https://www.tidyverse.org>.

Maintained by Hadley Wickham. Last updated 5 months ago.

data-science tidyverse

1.7k stars 20.23 score 664k scripts 125 dependents

sfirke

janitor:Simple Tools for Examining and Cleaning Dirty Data

The main janitor functions can: perfectly format data.frame column names; provide quick counts of variable combinations (i.e., frequency tables and crosstabs); and explore duplicate records. Other janitor functions nicely format the tabulation results. These tabulate-and-report functions approximate popular features of SPSS and Microsoft Excel. This package follows the principles of the "tidyverse" and works well with the pipe function %>%. janitor was built with beginning-to-intermediate R users in mind and is optimized for user-friendliness.

Maintained by Sam Firke. Last updated 3 months ago.

data-analysis data-cleaning data-science dirty-data excel pivot-tables spss tabulations tidyverse

1.4k stars 19.40 score 35k scripts 231 dependents

topepo

caret:Classification and Regression Training

Misc functions for training and plotting classification and regression models.

Maintained by Max Kuhn. Last updated 4 months ago.

1.6k stars 19.24 score 61k scripts 303 dependents

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 3 hours ago.

586 stars 18.79 score 7.2k scripts 381 dependents

tidymodels

tidymodels:Easily Install and Load the 'Tidymodels' Packages

The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.

Maintained by Max Kuhn. Last updated 1 month ago.

783 stars 16.52 score 66k scripts 15 dependents

facebook

prophet:Automatic Forecasting Procedure

Implements a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.

Maintained by Sean Taylor. Last updated 5 months ago.

forecasting python cpp

19k stars 15.59 score 976 scripts 13 dependents

r-dbi

RPostgres:C++ Interface to PostgreSQL

Fully DBI-compliant C++-backed interface to PostgreSQL <https://www.postgresql.org/>, an open-source relational database.

Maintained by Kirill Müller. Last updated 1 month ago.

database postgres postgresql cpp

338 stars 14.78 score 1.6k scripts 31 dependents

dcomtois

summarytools:Tools to Quickly and Neatly Summarize Data

Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.

Maintained by Dominic Comtois. Last updated 1 hour ago.

descriptive-statistics frequency-table html-report markdown pander pandoc pandoc-markdown rmarkdown rstudio

528 stars 14.69 score 2.9k scripts 6 dependents

ropensci

osmdata:Import 'OpenStreetMap' Data as Simple Features or Spatial Objects

Download and import of 'OpenStreetMap' ('OSM') data as 'sf' or 'sp' objects. 'OSM' data are extracted from the 'Overpass' web server (<https://overpass-api.de/>) and processed with very fast 'C++' routines for return to 'R'.

Maintained by Mark Padgham. Last updated 2 months ago.

open-street-map openstreetmap overpass-api osm cpp osm-data peer-reviewed

322 stars 14.53 score 2.8k scripts 14 dependents

tidyverts

tsibble:Tidy Temporal Data Frames and Tools

Provides a 'tbl_ts' class (the 'tsibble') for temporal data in a data- and model-oriented format. The 'tsibble' provides tools to easily manipulate and analyse temporal data, such as filling in time gaps and aggregating over calendar periods.

Maintained by Earo Wang. Last updated 2 months ago.

536 stars 14.48 score 4.4k scripts 42 dependents

tidymodels

tune:Tidy Tuning Tools

The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps.

Maintained by Max Kuhn. Last updated 28 days ago.

293 stars 14.27 score 756 scripts 39 dependents

business-science

timetk:A Tool Kit for Working with Time Series

Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.

Maintained by Matt Dancho. Last updated 1 year ago.

coercion coercion-functions data-mining dplyr forecast forecasting forecasting-models machine-learning series-decomposition series-signature tibble tidy tidyquant tidyverse time time-series timeseries

626 stars 14.20 score 4.0k scripts 16 dependents

doi-usgs

dataRetrieval:Retrieval Functions for USGS and EPA Hydrology and Water Quality Data

Collection of functions to help retrieve U.S. Geological Survey and U.S. Environmental Protection Agency water quality and hydrology data from web services. Data are discovered from National Water Information System <https://waterservices.usgs.gov/> and <https://waterdata.usgs.gov/nwis>. Water quality data are obtained from the Water Quality Portal <https://www.waterqualitydata.us/>.

Maintained by Laura DeCicco. Last updated 5 days ago.

usgs

286 stars 14.16 score 1.7k scripts 15 dependents

pharmaverse

admiral:ADaM in R Asset Library

A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).

Maintained by Ben Straub. Last updated 7 days ago.

cdisc clinical-trials open-source

239 stars 13.97 score 486 scripts 4 dependents

tidymodels

workflows:Modeling Workflows

Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.

Maintained by Simon Couch. Last updated 1 month ago.

207 stars 13.97 score 876 scripts 43 dependents

jbkunst

highcharter:A Wrapper for the 'Highcharts' Library

A wrapper for the 'Highcharts' library including shortcut functions to plot R objects. 'Highcharts' <https://www.highcharts.com/> is a charting library offering numerous chart types with a simple configuration syntax.

Maintained by Joshua Kunst. Last updated 1 year ago.

highcharts htmlwidgets shiny shiny-r visualization wrapper

725 stars 13.93 score 4.9k scripts 18 dependents

aphalo

ggpmisc:Miscellaneous Extensions to 'ggplot2'

Extensions to 'ggplot2' respecting the grammar of graphics paradigm. Statistics: locate and tag peaks and valleys; label plot with the equation of a fitted polynomial or other types of models; labels with P-value, R^2 or adjusted R^2 or information criteria for fitted models; label with ANOVA table for fitted models; label with summary for fitted models. Model fit classes for which suitable methods are provided by packages 'broom' and 'broom.mixed' are supported. Scales and stats to build volcano and quadrant plots based on outcomes, fold changes, p-values and false discovery rates.

Maintained by Pedro J. Aphalo. Last updated 2 days ago.

data-analysis dataviz ggplot2-annotations ggplot2-stats statistics

107 stars 13.64 score 4.4k scripts 14 dependents

tidyverts

fable:Forecasting Models for Tidy Time Series

Provides a collection of commonly used univariate and multivariate time series forecasting models including automatically selected exponential smoothing (ETS) and autoregressive integrated moving average (ARIMA) models. These models work within the 'fable' framework provided by the 'fabletools' package, which provides the tools to evaluate, visualise, and combine models in a workflow consistent with the tidyverse.

Maintained by Mitchell O'Hara-Wild. Last updated 4 months ago.

forecasting cpp

569 stars 13.54 score 2.1k scripts 6 dependents

business-science

tidyquant:Tidy Quantitative Financial Analysis

Bringing business and financial analysis to the 'tidyverse'. The 'tidyquant' package provides a convenient wrapper to various 'xts', 'zoo', 'quantmod', 'TTR' and 'PerformanceAnalytics' package functions and returns the objects in the tidy 'tibble' format. The main advantage is being able to use quantitative functions with the 'tidyverse' functions including 'purrr', 'dplyr', 'tidyr', 'ggplot2', 'lubridate', etc. See the 'tidyquant' website for more information, documentation and examples.

Maintained by Matt Dancho. Last updated 2 months ago.

dplyr financial-analysis financial-data financial-statements multiple-stocks performance-analysis performanceanalytics quantmod stock stock-exchanges stock-indexes stock-lists stock-performance stock-prices stock-symbol tidyverse time-series timeseries xts

872 stars 13.34 score 5.2k scripts

oscarkjell

text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning

Link R with Transformers from Hugging Face to transform text variables to word embeddings, which are then used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions. For more information see <https://www.r-text.org>.

Maintained by Oscar Kjell. Last updated 10 days ago.

deep-learning machine-learning nlp transformers openjdk

145 stars 13.21 score 436 scripts 1 dependents

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data are stored in csv format. The package also allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 18 days ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

109 stars 13.20 score 342 scripts 3 dependents

openair-project

openair:Tools for the Analysis of Air Pollution Data

Tools to analyse, interpret and understand air pollution data. Data are typically regular time series, and air quality measurements, meteorological data and dispersion model output can be analysed. The package is described in Carslaw and Ropkins (2012, <doi:10.1016/j.envsoft.2011.09.008>) and subsequent papers.

Maintained by David Carslaw. Last updated 4 days ago.

air-quality air-quality-data meteorology openair cpp

316 stars 12.94 score 1.2k scripts 12 dependents

aphalo

ggpp:Grammar Extensions to 'ggplot2'

Extensions to 'ggplot2' respecting the grammar of graphics paradigm. Geometries: geom_table(), geom_plot() and geom_grob() add insets to plots using native data coordinates, while geom_table_npc(), geom_plot_npc() and geom_grob_npc() do the same using "npc" coordinates through new aesthetics "npcx" and "npcy". Statistics: select observations based on 2D density. Positions: radial nudging away from a center point and nudging away from a line or curve; combined stacking and nudging; combined dodging and nudging.

Maintained by Pedro J. Aphalo. Last updated 1 month ago.

data-labels dataviz ggplot2-enhancements ggplot2-geoms ggplot2-insets ggplot2-positions

129 stars 12.53 score 582 scripts 26 dependents

tidyverts

feasts:Feature Extraction and Statistics for Time Series

Provides a collection of features, decomposition methods, statistical summaries and graphics functions for analysing tidy time series data. The package name 'feasts' is an acronym comprising its key features: Feature Extraction And Statistics for Time Series.

Maintained by Mitchell O'Hara-Wild. Last updated 5 months ago.

300 stars 12.38 score 1.4k scripts 7 dependents

eliocamp

metR:Tools for Easier Analysis of Meteorological Fields

Many useful functions and extensions for dealing with meteorological data in the tidy data framework. Extends 'ggplot2' for better plotting of scalar and vector fields and provides commonly used analysis methods in the atmospheric sciences.

Maintained by Elio Campitelli. Last updated 12 days ago.

atmospheric-science ggplot2 visualization

146 stars 12.30 score 1000 scripts 22 dependents

r-dbi

RMariaDB:Database Interface and MariaDB Driver

Implements a DBI-compliant interface to MariaDB (<https://mariadb.org/>) and MySQL (<https://www.mysql.com/>) databases.

Maintained by Kirill Müller. Last updated 1 month ago.

database mariadb mysql cpp

133 stars 12.20 score 792 scripts 10 dependents

tidyverts

fabletools:Core Tools for Packages in the 'fable' Framework

Provides tools, helpers and data structures for developing models and time series functions for 'fable' and extension packages. These tools support a consistent and tidy interface for time series modelling and analysis.

Maintained by Mitchell O'Hara-Wild. Last updated 2 months ago.

91 stars 12.18 score 396 scripts 18 dependents

bcallaway11

did:Treatment Effects with Multiple Periods and Groups

The standard Difference-in-Differences (DID) setup involves two periods and two groups -- a treated group and an untreated group. Many applications of DID methods involve more than two periods and have individuals that are treated at different points in time. This package contains tools for computing average treatment effect parameters in Difference-in-Differences setups with more than two periods and with variation in treatment timing using the methods developed in Callaway and Sant'Anna (2021) <doi:10.1016/j.jeconom.2020.12.001>. The main parameters are group-time average treatment effects, which are the average treatment effect for a particular group at a particular time. These can be aggregated into a smaller number of treatment effect parameters, and the package deals with the cases where there is selective treatment timing, dynamic treatment effects, calendar time effects, or combinations of these. There are also functions for testing the Difference-in-Differences assumption, and plotting group-time average treatment effects.

Maintained by Brantly Callaway. Last updated 5 days ago.

329 stars 12.09 score 696 scripts 3 dependents

tidymodels

probably:Tools for Post-Processing Predicted Values

Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.

Maintained by Max Kuhn. Last updated 6 months ago.

115 stars 12.09 score 21k scripts 1 dependents

ropensci

RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management

Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.

Maintained by Mathew W. McLean. Last updated 4 months ago.

peer-reviewed

115 stars 12.06 score 2.3k scripts 16 dependents

tidymodels

workflowsets:Create a Collection of 'tidymodels' Workflows

A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as to train them and visualize the results.

Maintained by Simon Couch. Last updated 5 months ago.

94 stars 12.04 score 294 scripts 19 dependents

zachmayer

caretEnsemble:Ensembles of Caret Models

Functions for creating ensembles of caret models: caretList() and caretStack(). caretList() is a convenience function for fitting multiple caret::train() models to the same dataset. caretStack() will make linear or non-linear combinations of these models, using a caret::train() model as a meta-model.

Maintained by Zachary A. Deane-Mayer. Last updated 3 months ago.

226 stars 11.98 score 780 scripts 1 dependents

pecanproject

PEcAn.DB:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 11.90 score 127 scripts 27 dependents

epiforecasts

EpiNow2:Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters

Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.

Maintained by Sebastian Funk. Last updated 1 month ago.

backcalculation covid-19 gaussian-processes open-source reproduction-number stan cpp

123 stars 11.86 score 210 scripts

hannameyer

CAST:'caret' Applications for Spatial-Temporal Models

Supporting functionality to run 'caret' with spatial or spatial-temporal data. 'caret' is a frequently used package for model training and prediction using machine learning. CAST includes functions to improve spatial or spatial-temporal modelling tasks using 'caret'. It includes the newly suggested 'Nearest neighbor distance matching' cross-validation to estimate the performance of spatial prediction models and allows for spatial variable selection to select suitable predictor variables in view of their contribution to the spatial model performance. CAST further includes functionality to estimate the (spatial) area of applicability of prediction models. Methods are described in Meyer et al. (2018) <doi:10.1016/j.envsoft.2017.12.001>; Meyer et al. (2019) <doi:10.1016/j.ecolmodel.2019.108815>; Meyer and Pebesma (2021) <doi:10.1111/2041-210X.13650>; Milà et al. (2022) <doi:10.1111/2041-210X.13851>; Meyer and Pebesma (2022) <doi:10.1038/s41467-022-29838-9>; Linnenbrink et al. (2023) <doi:10.5194/egusphere-2023-1308>; Schumacher et al. (2024) <doi:10.5194/egusphere-2024-2730>. The package is described in detail in Meyer et al. (2024) <doi:10.48550/arXiv.2404.06978>.

Maintained by Hanna Meyer. Last updated 2 months ago.

autocorrelation caret feature-selection machine-learning overfitting predictive-modeling spatial spatio-temporal variable-selection

114 stars 11.85 score 298 scripts 1 dependents

pecanproject

PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 11.62 score 64 scripts 14 dependents

edwinth

padr:Quickly Get Datetime Data Ready for Analysis

Transforms datetime data into a format ready for analysis. It offers two core functionalities: aggregating data to a higher level interval (thicken) and imputing records where observations were absent (pad).

Maintained by Edwin Thoen. Last updated 4 months ago.

cpp

132 stars 11.55 score 428 scripts 20 dependents
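
The two operations named in the padr description can be illustrated with a minimal, language-agnostic sketch in Python. This is not padr's API: the function names `thicken_to_day` and `pad_days`, the fixed daily interval, and the fill value of 0 are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def thicken_to_day(timestamps):
    """Aggregate timestamps to a higher-level interval (here: calendar days)
    and count the observations that fall into each day."""
    return Counter(ts.date() for ts in timestamps)


def pad_days(daily_counts, fill=0):
    """Insert a record for every day between the first and last observed day
    that has no observations. The fill value is an assumption of this sketch."""
    if not daily_counts:
        return {}
    day, end = min(daily_counts), max(daily_counts)
    padded = {}
    while day <= end:
        padded[day] = daily_counts.get(day, fill)
        day += timedelta(days=1)
    return padded


# Hypothetical event log: two events on 1 January, one on 4 January.
events = [
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 1, 17, 5),
    datetime(2024, 1, 4, 8, 0),
]
daily = thicken_to_day(events)   # two distinct days with counts 2 and 1
padded = pad_days(daily)         # four consecutive days, gaps filled with 0
```

Thickening first and padding second keeps the gap-filling step independent of the raw timestamp resolution.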

urbananalyst

dodgr:Distances on Directed Graphs

Distances on dual-weighted directed graphs using priority-queue shortest paths (Padgham (2019) <doi:10.32866/6945>). Weighted directed graphs have weights from A to B which may differ from those from B to A. Dual-weighted directed graphs have two sets of such weights. A canonical example is a street network to be used for routing, in which routes are calculated by weighting distances according to the type of way and mode of transport, yet lengths of routes must be calculated from direct distances.

Maintained by Mark Padgham. Last updated 4 days ago.

distance openstreetmap router shortest-paths street-networks cpp

129 stars 11.52 score 229 scripts 4 dependents
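
The dual-weighted idea in the dodgr description can be sketched as a priority-queue (Dijkstra-style) search that chooses the route by one set of weights and reports its length from the other. This is an illustrative Python sketch, not dodgr's implementation: the function name, the graph layout, and the example numbers are assumptions.

```python
import heapq


def dual_weighted_distance(graph, source, target):
    """graph maps node -> list of (neighbour, weight, distance) directed edges.
    The route is chosen by minimising summed weight; the value returned is the
    summed distance along that route, or None if target is unreachable."""
    best_weight = {source: 0.0}
    queue = [(0.0, 0.0, source)]  # (weight so far, distance so far, node)
    while queue:
        weight, dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if weight > best_weight.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for nbr, edge_weight, edge_dist in graph.get(node, []):
            new_weight = weight + edge_weight
            if new_weight < best_weight.get(nbr, float("inf")):
                best_weight[nbr] = new_weight
                heapq.heappush(queue, (new_weight, dist + edge_dist, nbr))
    return None


# Hypothetical street network (distances in metres). The direct edge A -> B is
# short but heavily penalised for the chosen mode of transport; B -> A is not.
graph = {
    "A": [("B", 3000, 1000), ("C", 1000, 800)],
    "B": [("A", 1000, 1000)],
    "C": [("B", 1000, 700)],
}
a_to_b = dual_weighted_distance(graph, "A", "B")  # routed via C
b_to_a = dual_weighted_distance(graph, "B", "A")  # direct edge
```

In this example the A to B route detours through C because its summed weight is lower, so the reported length is the sum of the two direct distances on the detour rather than the length of the penalised direct edge.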

tidymodels

stacks:Tidy Model Stacking

Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.

Maintained by Simon Couch. Last updated 5 months ago.

298 stars 11.46 score 840 scripts

doi-usgs

nhdplusTools:NHDPlus Tools

Tools for traversing and working with National Hydrography Dataset Plus (NHDPlus) data. All methods implemented in 'nhdplusTools' are described in the NHDPlus documentation available from the US Environmental Protection Agency <https://www.epa.gov/waterdata/basic-information>.

Maintained by David Blodgett. Last updated 1 month ago.

87 stars 11.38 score 348 scripts 5 dependents

moderndive

moderndive:Tidyverse-Friendly Introductory Linear Regression

Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.

Maintained by Albert Y. Kim. Last updated 3 months ago.

88 stars 11.32 score 1.8k scripts

jamiemkass

ENMeval:Automated Tuning and Evaluations of Ecological Niche Models

Runs ecological niche models over all combinations of user-defined settings (i.e., tuning), performs cross validation to evaluate models, and returns data tables to aid in selection of optimal model settings that balance goodness-of-fit and model complexity. Also has functions to partition data spatially (or not) for cross validation, to plot multiple visualizations of results, to run null models to estimate significance and effect sizes of performance metrics, and to calculate range overlap between model predictions, among others. The package was originally built for Maxent models (Phillips et al. 2006, Phillips et al. 2017), but the current version allows possible extensions for any modeling algorithm. The extensive vignette, which guides users through most package functionality but unfortunately has a file size too big for CRAN, can be found here on the package's GitHub Pages website: <https://jamiemkass.github.io/ENMeval/articles/ENMeval-2.0-vignette.html>.

Maintained by Jamie M. Kass. Last updated 3 days ago.

49 stars 11.16 score 332 scripts 2 dependents

ropengov

eurostat:Tools for Eurostat Open Data

Tools to download data from the Eurostat database <https://ec.europa.eu/eurostat> together with search and manipulation utilities.

Maintained by Leo Lahti. Last updated 1 month ago.

ropengov eurostat eurostat-data

242 stars 11.07 score 892 scripts 4 dependents

earthyscience

REddyProc:Post Processing of (Half-)Hourly Eddy-Covariance Measurements

Standard and extensible Eddy-Covariance data post-processing (Wutzler et al. (2018) <doi:10.5194/bg-15-5015-2018>) includes uStar-filtering, gap-filling, and flux-partitioning. The Eddy-Covariance (EC) micrometeorological technique quantifies continuous exchange fluxes of gases, energy, and momentum between an ecosystem and the atmosphere. It is important for understanding ecosystem dynamics and upscaling exchange fluxes (Aubinet et al. (2012) <doi:10.1007/978-94-007-2351-1>). This package inputs pre-processed (half-)hourly data and supports further processing. First, a quality check and filtering are performed based on the relationship between measured flux and friction velocity (uStar) to discard biased data (Papale et al. (2006) <doi:10.5194/bg-3-571-2006>). Second, gaps in the data are filled based on information from environmental conditions (Reichstein et al. (2005) <doi:10.1111/j.1365-2486.2005.001002.x>). Third, the net flux of carbon dioxide is partitioned into its gross fluxes in and out of the ecosystem by night-time based and day-time based approaches (Lasslop et al. (2010) <doi:10.1111/j.1365-2486.2009.02041.x>).

Maintained by Thomas Wutzler. Last updated 4 months ago.

cpp

63 stars 11.04 score 163 scripts 16 dependents

pecanproject

PEcAn.utils:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Rob Kooper. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 10.94 score 218 scripts 35 dependents

tidymodels

textrecipes:Extra 'Recipes' for Text Processing

Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.

Maintained by Emil Hvitfeldt. Last updated 13 days ago.

160 stars 10.86 score 964 scripts 1 dependents

pecanproject

PEcAn.benchmark:PEcAn Functions Used for Benchmarking

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.benchmark package provides utilities for comparing models and data, including a suite of statistical metrics and plots.

Maintained by Mike Dietze. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 10.72 score 416 scripts 11 dependents

jimmyday12

fitzRoy:Easily Scrape and Process AFL Data

An easy package for scraping and processing Australian Rules Football (AFL) data. 'fitzRoy' provides a range of functions for accessing publicly available data from 'AFL Tables' <https://afltables.com/afl/afl_index.html>, 'Footy Wire' <https://www.footywire.com> and 'The Squiggle' <https://squiggle.com.au>. Further functions allow for easy processing, cleaning and transformation of this data into formats that can be used for analysis.

Maintained by James Day. Last updated 11 days ago.

136 stars 10.72 score 324 scripts

doi-usgs

EGRET:Exploration and Graphics for RivEr Trends

Statistics and graphics for streamflow history, water quality trends, and the statistical modeling algorithm: Weighted Regressions on Time, Discharge, and Season (WRTDS).

Maintained by Laura DeCicco. Last updated 4 months ago.

usgs water-quality water-quality-data

90 stars 10.67 score 362 scripts 1 dependents

business-science

modeltime:The Tidymodels Extension for Time Series Modeling

The time series forecasting framework for use with the 'tidymodels' ecosystem. Models include ARIMA, Exponential Smoothing, and additional time series models from the 'forecast' and 'prophet' packages. Refer to "Forecasting Principles & Practice, Second edition" (<https://otexts.com/fpp2/>). Refer to "Prophet: forecasting at scale" (<https://research.facebook.com/blog/2017/02/prophet-forecasting-at-scale/>).

Maintained by Matt Dancho. Last updated 5 months ago.

arima data-science deep-learning ets forecasting machine-learning machine-learning-algorithms modeltime prophet tbats tidymodeling tidymodels time time-series time-series-analysis timeseries timeseries-forecasting

551 stars 10.61 score 1.1k scripts 7 dependents

jmsigner

amt:Animal Movement Tools

Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges and track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulate space use from fitted step-selection functions.

Maintained by Johannes Signer. Last updated 5 months ago.

41 stars 10.54 score 418 scripts

business-science

tibbletime:Time Aware Tibbles

Built on top of the 'tibble' package, 'tibbletime' is an extension that allows for the creation of time aware tibbles. Some immediate advantages of this include: the ability to perform time-based subsetting on tibbles, quickly summarising and aggregating results by time periods, and creating columns that can be used as 'dplyr' time-based groups.

Maintained by Davis Vaughan. Last updated 4 months ago.

periodicity tibble time time-series timeseries cpp

177 stars 10.51 score 644 scripts 2 dependents

vubiostat

redcapAPI:Interface to 'REDCap'

Access data stored in 'REDCap' databases using the Application Programming Interface (API). 'REDCap' (Research Electronic Data CAPture; <https://projectredcap.org>, Harris, et al. (2009) <doi:10.1016/j.jbi.2008.08.010>, Harris, et al. (2019) <doi:10.1016/j.jbi.2019.103208>) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The API allows users to access data and project metadata (such as the data dictionary) from the web programmatically. The 'redcapAPI' package facilitates the process of accessing data with options to prepare an analysis-ready data set consistent with the definitions in a database's data dictionary.

Maintained by Shawn Garbett. Last updated 25 days ago.

22 stars 10.47 score 134 scripts 2 dependents

petersonr

bestNormalize:Normalizing Transformation Functions

Estimate a suite of normalizing transformations, including a new adaptation of a technique based on ranks which can guarantee normally distributed transformed data if there are no ties: ordered quantile normalization (ORQ). ORQ normalization combines a rank-mapping approach with a shifted logit approximation that allows the transformation to work on data outside the original domain. It is also able to handle new data within the original domain via linear interpolation. The package is built to estimate the best normalizing transformation for a vector consistently and accurately. It implements the Box-Cox transformation, the Yeo-Johnson transformation, three types of Lambert WxF transformations, and the ordered quantile normalization transformation. It estimates the normalization efficacy of other commonly used transformations, and it allows users to specify custom transformations or normalization statistics. Finally, functionality can be integrated into a machine learning workflow via recipes.

Maintained by Ryan Andrew Peterson. Last updated 1 year ago.

39 stars 10.45 score 510 scripts 5 dependents

ropensci

rerddap:General Purpose Client for 'ERDDAP™' Servers

General purpose R client for 'ERDDAP™' servers. Includes functions to search for 'datasets', get summary information on 'datasets', and fetch 'datasets', in either 'csv' or 'netCDF' format. 'ERDDAP™' information: <https://upwell.pfeg.noaa.gov/erddap/information.html>.

Maintained by Roy Mendelssohn. Last updated 12 days ago.

earth science climate precipitation temperature storm buoy noaa api-client erddap noaa-data

41 stars 10.43 score 376 scripts 5 dependents

tidymodels

themis:Extra Recipes Steps for Dealing with Unbalanced Data

A dataset with an uneven number of cases in each class is said to be unbalanced. Many models produce a subpar performance on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <doi:10.48550/arXiv.1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91> and ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>. Or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.

Maintained by Emil Hvitfeldt. Last updated 2 months ago.

143 stars 10.37 score 1.3k scripts 2 dependents

gforge

Gmisc:Descriptive Statistics, Transition Plots, and More

Tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bézier lines with arrows complementing the ones in the 'grid' package, and more.

Maintained by Max Gordon. Last updated 2 years ago.

cpp

51 stars 10.34 score 233 scripts 2 dependents

ludvigolsen

cvms:Cross-Validation for Model Selection

Cross-validate one or multiple regression and classification models and get relevant evaluation metrics in a tidy format. Validate the best model on a test set and compare it to a baseline evaluation. Alternatively, evaluate predictions from an external model. Currently supports regression and classification (binary and multiclass). Described in chp. 5 of Jeyaraman, B. P., Olsen, L. R., & Wambugu M. (2019, ISBN: 9781838550134).

Maintained by Ludvig Renbo Olsen. Last updated 25 days ago.

39 stars 10.31 score 492 scripts 5 dependents

bioc

pRoloc:A unifying bioinformatics framework for spatial proteomics

The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.

Maintained by Lisa Breckels. Last updated 5 days ago.

immunooncology proteomics massspectrometry classification clustering qualitycontrol bioconductor proteomics-data spatial-proteomics visualisation openblas cpp

15 stars 10.31 score 101 scripts 2 dependents

facebookexperimental

Robyn:Semi-Automated Marketing Mix Modeling (MMM) from Meta Marketing Science

Semi-Automated Marketing Mix Modeling (MMM) that aims to reduce human bias by means of ridge regression and evolutionary algorithms, enables actionable decision making by providing budget allocation and diminishing returns curves, and allows ground-truth calibration to account for causation.

Maintained by Gufeng Zhou. Last updated 13 days ago.

adstocking budget-allocation cost-response-curve econometrics evolutionary-algorithm gradient-based-optimisation hyperparameter-optimization marketing-mix-modeling marketing-mix-modelling marketing-science mmm ridge-regression

1.3k stars 10.27 score 95 scripts

ropensci

qualtRics:Download 'Qualtrics' Survey Data

Provides functions to access survey results directly into R using the 'Qualtrics' API. 'Qualtrics' <https://www.qualtrics.com/about/> is an online survey and data collection software platform. See <https://api.qualtrics.com/> for more information about the 'Qualtrics' API. This package is community-maintained and is not officially supported by 'Qualtrics'.

Maintained by Julia Silge. Last updated 7 months ago.

api qualtrics qualtrics-api survey survey-data

221 stars 10.23 score 272 scripts

idigbio

ridigbio:Interface to the iDigBio Data API

An interface to iDigBio's search API that allows downloading specimen records. Searches are returned as a data.frame. Other functions such as the metadata end points return lists of information. iDigBio is a US project focused on digitizing and serving museum specimen collections on the web. See <https://www.idigbio.org> for information on iDigBio.

Maintained by Jesse Bennett. Last updated 20 days ago.

16 stars 10.23 score 63 scripts 7 dependents

cboettig

knitcitations:Citations for 'Knitr' Markdown Files

Provides the ability to create dynamic citations in which the bibliographic information is pulled from the web rather than having to be entered into a local database such as 'bibtex' ahead of time. The package is primarily aimed at authoring in the R 'markdown' format, and can provide outputs for web-based authoring such as linked text for inline citations. Cite using a 'DOI', URL, or 'bibtex' file key. See the package URL for details.

Maintained by Carl Boettiger. Last updated 4 years ago.

220 stars 10.14 score 836 scripts 2 dependents

dslc-io

tidytuesdayR:Access the Weekly 'TidyTuesday' Project Dataset

'TidyTuesday' is a project by the 'Data Science Learning Community' in which they post a weekly dataset in a public data repository (<https://github.com/rfordatascience/tidytuesday>) for people to analyze and visualize. This package provides the tools to easily download this data and the description of the source.

Maintained by Jon Harmon. Last updated 6 days ago.

77 stars 10.13 score 3.0k scripts

nflverse

nflfastR:Functions to Efficiently Access NFL Play by Play Data

A set of functions to access National Football League play-by-play data from <https://www.nfl.com/>.

Maintained by Ben Baldwin. Last updated 15 hours ago.

american-football football-data nfl nflstats nflverse sports-analytics

443 stars 10.11 score 596 scripts 3 dependents

bleutner

RStoolbox:Remote Sensing Data Analysis

Toolbox for remote sensing image processing and analysis such as calculating spectral indexes, principal component transformation, unsupervised and supervised classification or fractional cover analyses.

Maintained by Konstantin Mueller. Last updated 2 months ago.

ggplot2 land-cover-mapping remote-sensing spectral-unmixing supervised-classification unsupervised-classification openblas cpp

275 stars 10.10 score 1.1k scripts

ropensci

spocc:Interface to Species Occurrence Data Sources

A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility ('GBIF'), 'iNaturalist', 'eBird', Integrated Digitized 'Biocollections' ('iDigBio'), 'VertNet', Ocean 'Biogeographic' Information System ('OBIS'), and Atlas of Living Australia ('ALA'). Includes functionality for retrieving species occurrence data, and combining those data.

Maintained by Hannah Owens. Last updated 2 months ago.

specimens api web-services occurrences species taxonomy gbif inat vertnet ebird idigbio obis ala antweb bison data ecoengine inaturalist occurrence species-occurrence spocc

118 stars 10.09 score 552 scripts 5 dependents

gshs-ornl

wbstats:Programmatic Access to Data and Statistics from the World Bank API

Search and download data from the World Bank Data API.

Maintained by Jesse Piburn. Last updated 4 years ago.

open-data world-bank world-bank-api worldbank

126 stars 10.07 score 1.1k scripts 3 dependents

pecanproject

PEcAn.settings:PEcAn Settings package

Contains functions to read PEcAn settings files.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 10.02 score 54 scripts 17 dependents

ropensci

nasapower:NASA POWER API Client

An API client for the NASA POWER global meteorology, surface solar energy and climatology data API. POWER (Prediction Of Worldwide Energy Resources) data are freely available for download with varying spatial resolutions dependent on the original data and with several temporal resolutions depending on the POWER parameter and community. This work is funded through the NASA Earth Science Directorate Applied Science Program. For more on the data themselves, the methodologies used in creating them, a web-based data viewer and web access, please see <https://power.larc.nasa.gov/>.

Maintained by Adam H. Sparks. Last updated 26 days ago.

nasa meteorological-data weather global weather-data meteorology nasa-power agroclimatology earth-science data-access climate-data agroclimatology-data weather-variables

101 stars 9.98 score 137 scripts 3 dependents

pecanproject

PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Istem Fer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.96 score 20 scripts 2 dependents

pecanproject

PEcAn.priors:PEcAn Functions Used to Estimate Priors from Data

Functions to estimate priors from data.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.95 score 13 scripts 6 dependents

tlverse

sl3:Pipelines for Machine Learning and Super Learning

A modern implementation of the Super Learner prediction algorithm, coupled with a general purpose framework for composing arbitrary pipelines for machine learning tasks.

Maintained by Jeremy Coyle. Last updated 5 months ago.

data-science ensemble-learning ensemble-model machine-learning model-selection regression stacking statistics

100 stars 9.94 score 748 scripts 7 dependents

darwin-eu

CodelistGenerator:Identify Relevant Clinical Codes and Evaluate Their Use

Generate a candidate code list for the Observational Medical Outcomes Partnership (OMOP) common data model based on string matching. For a given search strategy, a candidate code list will be returned.

Maintained by Edward Burn. Last updated 5 days ago.

14 stars 9.94 score 165 scripts 4 dependents

laresbernardo

lares:Analytics & Machine Learning Sidekick

Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of function families, such as Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scraper, it helps the analyst or data scientist get quick and robust results without the need for repetitive coding or advanced R programming skills.

Maintained by Bernardo Lares. Last updated 1 month ago.

analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization

233 stars 9.92 score 185 scripts 1 dependent

pecanproject

PEcAn.MA:PEcAn Functions Used for Meta-Analysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.MA package contains the functions used in the Bayesian meta-analysis of trait data.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.91 score 7 scripts 7 dependents

bioc

OmnipathR:OmniPath web service client and more

A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).

Maintained by Denes Turei. Last updated 1 month ago.

graphandnetwork network pathways software thirdpartyclient dataimport datarepresentation genesignaling generegulation systemsbiology transcriptomics singlecell annotation kegg complexes enzyme-ptm networks networks-biology omnipath proteins quarto

130 stars 9.90 score 226 scripts 2 dependents

jaseziv

worldfootballR:Extract and Clean World Football (Soccer) Data

Allows users to obtain clean and tidy football (soccer) game, team and player data. Data is collected from a number of popular sites, including 'FBref', transfer and valuations data from 'Transfermarkt' <https://www.transfermarkt.com/> and shooting location and other match stats data from 'Understat' <https://understat.com/>. It gives users the ability to access data more efficiently, rather than having to export data tables to files before being able to complete their analysis.

Maintained by Jason Zivkovic. Last updated 1 day ago.

fbref football football-data soccer-data sports-data transfermarkt understat

509 stars 9.88 score 516 scripts 2 dependents

rstudio

distill:'R Markdown' Format for Scientific and Technical Writing

Scientific and technical article format for the web. 'Distill' articles feature attractive, reader-friendly typography, flexible layout options for visualizations, and full support for footnotes and citations.

Maintained by Christophe Dervieux. Last updated 1 year ago.

423 stars 9.85 score 402 scripts 6 dependents

cardiomoon

moonBook:Functions and Datasets for the Book by Keon-Woong Moon

Several analysis-related functions for the book entitled "R statistics and graph for medical articles" (written in Korean), version 1, by Keon-Woong Moon, together with Korean demographic data and several plot functions.

Maintained by Keon-Woong Moon. Last updated 1 year ago.

38 stars 9.79 score 278 scripts 6 dependents

insightsengineering

teal.modules.general:General Modules for 'teal' Applications

Prebuilt 'shiny' modules containing tools for viewing data, visualizing data, understanding missing and outlier values within your data and performing simple data analysis. This extends the 'teal' framework, which supports reproducible research and analysis.

Maintained by Dawid Kaledkowski. Last updated 1 month ago.

general-purpose modules nest shiny

13 stars 9.74 score 71 scripts

epiforecasts

socialmixr:Social Mixing Matrices for Infectious Disease Modelling

Provides methods for sampling contact matrices from diary data for use in infectious disease modelling, as discussed in Mossong et al. (2008) <doi:10.1371/journal.pmed.0050074>.

Maintained by Sebastian Funk. Last updated 6 months ago.

38 stars 9.74 score 227 scripts 1 dependent

prestodb

RPresto:DBI Connector to Presto

Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: <https://prestodb.io/>.

Maintained by Jarod G.R. Meng. Last updated 2 months ago.

132 stars 9.73 score 25 scripts 4 dependents

pecanproject

PEcAnRTM:PEcAn Functions Used for Radiative Transfer Modeling

Functions for performing forward runs and inversions of radiative transfer models (RTMs). Inversions can be performed using maximum likelihood, or more complex hierarchical Bayesian methods. Underlying numerical analyses are optimized for speed using Fortran code.

Maintained by Alexey Shiklomanov. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants fortran jags cpp

216 stars 9.70 score 132 scripts

appsilon

shiny.telemetry:'Shiny' App Usage Telemetry

Enables instrumentation of 'Shiny' apps for tracking user session events such as input changes, browser type, and session duration. These events can be sent to any of the available storage backends and analyzed using the included 'Shiny' app to gain insights about app usage and adoption.

Maintained by André Veríssimo. Last updated 4 months ago.

analytics rhinoverse shiny

67 stars 9.69 score 29 scripts

bblonder

hypervolume:High Dimensional Geometry, Set Operations, Projection, and Inference Using Kernel Density Estimation, Support Vector Machines, and Convex Hulls

Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses a stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.

Maintained by Benjamin Blonder. Last updated 2 months ago.

openblas cpp

23 stars 9.69 score 211 scripts 7 dependents

fmmattioni

downloadthis:Implement Download Buttons in 'rmarkdown'

Implement download buttons in HTML output from 'rmarkdown' without the need for 'runtime:shiny'.

Maintained by Felipe Mattioni Maturana. Last updated 6 months ago.

146 stars 9.63 score 856 scripts 1 dependent

ropensci

tidyhydat:Extract and Tidy Canadian 'Hydrometric' Data

Provides functions to access historical and real-time national 'hydrometric' data from Water Survey of Canada data sources (<https://dd.weather.gc.ca/hydrometric/csv/> and <https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>) and then applies tidy data principles.

Maintained by Sam Albers. Last updated 20 days ago.

citz government-data hydrology hydrometrics tidy-data water-resources

71 stars 9.59 score 202 scripts 3 dependents

business-science

anomalize:Tidy Anomaly Detection

The 'anomalize' package enables a "tidy" workflow for detecting anomalies in data. The main functions are time_decompose(), anomalize(), and time_recompose(). When combined, it's quite simple to decompose time series, detect anomalies, and create bands separating the "normal" data from the anomalous data at scale (i.e. for multiple time series). Time series decomposition is used to remove trend and seasonal components via the time_decompose() function, and methods include seasonal decomposition of time series by Loess ("stl") and seasonal decomposition by piecewise medians ("twitter"). The anomalize() function implements two methods for anomaly detection of residuals: interquartile range ("iqr") and generalized extreme studentized deviation ("gesd"). These methods are based on those used in the 'forecast' package and the Twitter 'AnomalyDetection' package. Refer to the associated functions for specific references for these methods.

Maintained by Matt Dancho. Last updated 1 year ago.

anomaly anomaly-detection decomposition detect-anomalies iqr time-series

339 stars 9.56 score 332 scripts

ndphillips

FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees

Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.

Maintained by Hansjoerg Neth. Last updated 6 months ago.

136 stars 9.53 score 144 scripts

e-sensing

sits:Satellite Image Time Series Analysis for Earth Observation Data Cubes

An end-to-end toolkit for land use and land cover classification using big Earth observation data, based on machine learning methods applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>. Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, Brazil Data Cube, Copernicus Data Space Environment (CDSE), Digital Earth Africa, Digital Earth Australia, NASA HLS using the Spatio-temporal Asset Catalog (STAC) protocol (<https://stacspec.org/>) and the 'gdalcubes' R package developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>. Supports visualization methods for images and time series and smoothing filters for dealing with noisy time series. Includes functions for quality assessment of training samples using self-organized maps as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. Includes methods to reduce training samples imbalance proposed by Chawla et al (2002) <doi:10.1613/jair.953>. Provides machine learning methods including support vector machines, random forests, extreme gradient boosting, multi-layer perceptrons, temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, and temporal attention encoders by Garnot and Landrieu (2020) <doi:10.48550/arXiv.2007.00586>. Supports GPU processing of deep learning models using torch <https://torch.mlverse.org/>. Performs efficient classification of big Earth observation data cubes and includes functions for post-classification smoothing based on Bayesian inference as described by Camara et al (2024) <doi:10.3390/rs16234572>, and methods for active learning and uncertainty assessment. Supports region-based time series analysis using package supercells <https://jakubnowosad.com/supercells/>. Enables best practices for estimating area and assessing accuracy of land change as recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>. 
Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.

Maintained by Gilberto Camara. Last updated 2 months ago.

big-earth-data cbers earth-observation eo-datacubes geospatial image-time-series land-cover-classification landsat planetary-computer r-spatial remote-sensing rspatial satellite-image-time-series satellite-imagery sentinel-2 stac-api stac-catalog cpp

494 stars 9.50 score 384 scripts

datastorm-open

suncalc:Compute Sun Position, Sunlight Phases, Moon Position and Lunar Phase

Get sun position, sunlight phases (times for sunrise, sunset, dusk, etc.), moon position and lunar phase for the given location and time. Most calculations are based on the formulas given in Astronomy Answers articles about the position of the sun and the planets: <https://www.aa.quae.nl/en/reken/zonpositie.html>.

Maintained by Benoit Thieurmel. Last updated 1 year ago.

44 stars 9.43 score 372 scripts 16 dependents

sizespectrum

mizer:Dynamic Multi-Species Size Spectrum Modelling

A set of classes and methods to set up and run multi-species, trait based and community size spectrum ecological models, focused on the marine environment.

Maintained by Gustav Delius. Last updated 2 months ago.

ecosystem-model fish-population-dynamics fisheries fisheries-management marine-ecosystem population-dynamics simulation size-structure species-interactions transport-equation cpp

39 stars 9.41 score 207 scripts

ropensci

rnoaa:'NOAA' Weather Data from R

Client for many 'NOAA' data sources including the 'NCDC' climate 'API' at <https://www.ncdc.noaa.gov/cdo-web/webservices/v2>, with functions for each of the 'API' 'endpoints': data, data categories, data sets, data types, locations, location categories, and stations. In addition, we have an interface for 'NOAA' sea ice data, the 'NOAA' severe weather inventory, 'NOAA' Historical Observing 'Metadata' Repository ('HOMR') data, 'NOAA' storm data via 'IBTrACS', tornado data via the 'NOAA' storm prediction center, and more.

Maintained by Daniel Hocking. Last updated 2 months ago.

earth science climate precipitation temperature storm buoy ncdc noaa tornadoe sea ice isd noaa-data

334 stars 9.39 score 788 scripts 4 dependents

usepa

tcpl:ToxCast Data Analysis Pipeline

The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.

Maintained by Jason Brown. Last updated 12 days ago.

ccte comptox ord

36 stars 9.39 score 90 scripts

pecanproject

PEcAn.data.land:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Mike Dietze. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.34 score 19 scripts 10 dependents

rte-antares-rpackage

antaresRead:Import, Manipulate and Explore the Results of an 'Antares' Simulation

Import, manipulate and explore results generated by 'Antares', a powerful open source software developed by RTE (Réseau de Transport d’Électricité) to simulate and study electric power systems (more information about 'Antares' here: <https://antares-simulator.org/>).

Maintained by Tatiana Vargas. Last updated 6 days ago.

infrastructure dataimport adequacy bilan electricity energy hdf5 linear-algebra monte-carlo-simulation optimisation previsionnel rhdf5 rte simulation tyndp

13 stars 9.32 score 148 scripts 3 dependents

microsoft

finnts:Microsoft Finance Time Series Forecasting Framework

Automated time series forecasting developed by Microsoft Finance. The Microsoft Finance Time Series Forecasting Framework, aka Finn, can be used to forecast any component of the income statement, balance sheet, or any other area of interest to finance. Finn can be used to forecast any numerical quantity over time. While it can be applied outside of the finance domain, Finn was built to meet the needs of financial analysts to better forecast their businesses within a company, and has many built-in features that are specific to the needs of financial forecasters. Happy forecasting!

Maintained by Mike Tokic. Last updated 1 month ago.

business data-science feature-selection finance finnts forecasting machine-learning microsoft time-series

194 stars 9.30 score 39 scripts

sbegueria

SPEI:Calculation of the Standardized Precipitation-Evapotranspiration Index

A set of functions for computing potential evapotranspiration and several widely used drought indices including the Standardized Precipitation-Evapotranspiration Index (SPEI).

Maintained by Santiago Beguería. Last updated 2 years ago.

82 stars 9.27 score 314 scripts 20 dependents

stevenmmortimer

salesforcer:An Implementation of 'Salesforce' APIs Using Tidy Principles

Functions connecting to the 'Salesforce' Platform APIs (REST, SOAP, Bulk 1.0, Bulk 2.0, Metadata, Reports and Dashboards) <https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_overview>. "API" is an acronym for "application programming interface". Almost all calls from these APIs are supported, as they use CSV, XML or JSON data that can be parsed into R data structures. For more information, documentation, and examples, please see the 'Salesforce' API documentation and this package's website <https://stevenmmortimer.github.io/salesforcer/>.

Maintained by Steven M. Mortimer. Last updated 5 months ago.

api-wrappers r-language r-programming salesforce salesforce-apis

82 stars 9.27 score 191 scripts

business-science

sweep:Tidy Tools for Forecasting

Tidies up the forecast modeling and prediction workflow, extends the 'broom' package with 'sw_tidy', 'sw_glance', 'sw_augment', and 'sw_tidy_decomp' functions for various forecasting models, and enables converting 'forecast' objects to "tidy" data frames with 'sw_sweep'.

Maintained by Matt Dancho. Last updated 1 year ago.

broom forecast forecasting-models prediction tidy tidyverse time time-series timeseries

155 stars 9.23 score 399 scripts 1 dependent

bioc

rWikiPathways:rWikiPathways - R client library for the WikiPathways API

Use this package to interface with the WikiPathways API. It provides programmatic access to WikiPathways content in multiple data and image formats, including official monthly release files and convenient GMT read/write functions.

Maintained by Egon Willighagen. Last updated 5 months ago.

visualization graphandnetwork thirdpartyclient network metabolomics bioinformatics data-access pathways

15 stars 9.23 score 131 scripts 3 dependents

aphalo

photobiology:Photobiological Calculations

Definitions of classes, methods, operators and functions for use in photobiology and radiation meteorology and climatology. Calculation of effective (weighted) and not-weighted irradiances/doses, fluence rates, transmittance, reflectance, absorptance, absorbance and diverse ratios and other derived quantities from spectral data. Local maxima and minima: peaks, valleys and spikes. Conversion between energy- and photon-based units. Wavelength interpolation. Astronomical calculations related to solar angles and day length. Colours and vision. This package is part of the 'r4photobiology' suite, Aphalo, P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 2 days ago.

light photobiology quantification r4photobiology-suite radiation spectra sun-position

4 stars 9.20 score 604 scripts 12 dependents

whipson

maestro:Orchestration of Data Pipelines

Framework for creating and orchestrating data pipelines. Organize, orchestrate, and monitor multiple pipelines in a single project. Use tags to decorate functions with scheduling parameters and configuration.

Maintained by Will Hipson. Last updated 7 days ago.

119 stars 9.20 score 150 scripts

tidymodels

embed:Extra Recipes for Encoding Predictors

Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.

Maintained by Emil Hvitfeldt. Last updated 2 months ago.

142 stars 9.18 score 1.1k scripts

bodkan

slendr:A Simulation Framework for Spatiotemporal Population Genetics

A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can therefore be constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.

Maintained by Martin Petr. Last updated 3 days ago.

popgen population-genetics simulations spatial-statistics

56 stars 9.13 score 88 scripts

pecanproject

PEcAn.allometry:PEcAn Allometry Functions

Synthesize allometric equations or fit allometries to data.

Maintained by Mike Dietze. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 9.12 score 34 scripts

malaria-atlas-project

malariaAtlas:An R Interface to Open-Access Malaria Data, Hosted by the 'Malaria Atlas Project'

A suite of tools to allow you to download all publicly available parasite rate survey points, mosquito occurrence points and raster surfaces from the 'Malaria Atlas Project' <https://malariaatlas.org/> servers as well as utility functions for plotting the downloaded data.

Maintained by Mauricio van den Berg. Last updated 9 months ago.

database malaria opendata raster

44 stars 9.10 score 118 scripts 3 dependents

huizezhang-sherry

cubble:A Vector Spatio-Temporal Data Structure for Data Analysis

A spatiotemporal data object in a relational data structure to separate the recording of time-variant and time-invariant variables. See the Journal of Statistical Software reference: <doi:10.18637/jss.v110.i07>.

Maintained by H. Sherry Zhang. Last updated 6 months ago.

57 stars 9.07 score 83 scripts

pecanproject

PEcAn.qaqc:QAQC

PEcAn integration and model skill testing

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 9.06 score 5 scripts

bioc

BatchQC:Batch Effects Quality Control Software

Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.

Maintained by Jessica Anderson. Last updated 13 days ago.

batcheffect graphandnetwork microarray normalization principalcomponent sequencing software visualization qualitycontrol rnaseq preprocessing differentialexpression immunooncology

7 stars 9.06 score 54 scripts

bupaverse

bupaR:Business Process Analysis in R

Comprehensive Business Process Analysis toolkit. Creates an S3 class for event log objects, and related handler functions. Imports related packages for filtering event data, computation of descriptive statistics, handling of 'Petri Net' objects and visualization of process maps. See also packages 'edeaR', 'processmapR', 'eventdataR' and 'processmonitR'.

Maintained by Gert Janssenswillen. Last updated 2 years ago.

57 stars 9.06 score 389 scripts 11 dependents

mlverse

tabnet:Fit 'TabNet' Models for Classification and Regression

Implements the 'TabNet' model by Sercan O. Arik et al. (2019) <doi:10.48550/arXiv.1908.07442> with 'Coherent Hierarchical Multi-label Classification Networks' by Giunchiglia et al. <doi:10.48550/arXiv.2010.10151> and provides a consistent interface for fitting and creating predictions. It's also fully compatible with the 'tidymodels' ecosystem.

Maintained by Christophe Regouby. Last updated 1 day ago.

tabnet

109 stars 9.05 score 65 scripts

ropensci

ijtiff:Comprehensive TIFF I/O with Full Support for 'ImageJ' TIFF Files

General purpose TIFF file I/O for R users. Currently the only such package with read and write support for TIFF files with floating point (real-numbered) pixels, and the only package that can correctly import TIFF files that were saved from 'ImageJ' and write TIFF files that can be correctly read by 'ImageJ' <https://imagej.net/ij/>. Also supports text image I/O.

Maintained by Rory Nolan. Last updated 9 days ago.

image-manipulation imagej peer-reviewed tiff-files tiff-images tiff

18 stars 9.03 score 36 scripts 7 dependents

pecanproject

PEcAn.all:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 9.01 score 266 scripts

bupaverse

edeaR:Exploratory and Descriptive Event-Based Data Analysis

Exploratory and descriptive analysis of event-based data. Provides methods for describing and selecting process data, and for preparing event log data for process mining. Builds on the S3 class for event logs implemented in the package 'bupaR'.

Maintained by Gert Janssenswillen. Last updated 4 months ago.

12 stars 9.01 score 149 scripts 8 dependents

irinagain

iglu:Interpreting Glucose Data from Continuous Glucose Monitors

Implements a wide range of metrics for measuring glucose control and glucose variability based on continuous glucose monitoring data. The list of implemented metrics is summarized in Rodbard (2009) <doi:10.1089/dia.2009.0015>. Additional visualization tools include time-series plots, lasagna plots and ambulatory glucose profile report.

Maintained by Irina Gaynanova. Last updated 26 days ago.

26 stars 9.00 score 39 scripts

ramikrispin

TSstudio:Functions for Time Series Analysis and Forecasting

Provides a set of tools for descriptive and predictive analysis of time series data. These include functions for interactive visualization of time series objects, as well as utility functions for automating time series forecasting.

Maintained by Rami Krispin. Last updated 2 years ago.

forecasting time-series timeseries tsstudio visualization

424 stars 9.00 score 656 scripts

billpetti

baseballr:Acquiring and Analyzing Baseball Data

Provides numerous utilities for acquiring and analyzing baseball data from online sources such as 'Baseball Reference' <https://www.baseball-reference.com/>, 'FanGraphs' <https://www.fangraphs.com/>, and the 'MLB Stats' API <https://www.mlb.com/>.

Maintained by Saiem Gilani. Last updated 5 months ago.

baseball pitchfx sabermetrics statcast

380 stars 8.98 score 582 scripts

pecanproject

PEcAn.MAAT:PEcAn Package for Integration of the MAAT Model

This module provides functions to wrap the MAAT model into the PEcAn workflows.

Maintained by Shawn Serbin. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 8.96 score 12 scripts

pecanproject

PEcAn.BIOCRO:PEcAn Package for Integration of the BioCro Model

This module provides functions to link BioCro to PEcAn.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.95 score 23 scripts

cjbarrie

academictwitteR:Access the Twitter Academic Research Product Track V2 API Endpoint

Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side.

Maintained by Christopher Barrie. Last updated 2 years ago.

twitter twitter-api

275 stars 8.94 score 177 scripts

pecanproject

PEcAn.uncertainty:PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.94 score 15 scripts 5 dependents

pik-piam

remind2:The REMIND R package (2nd generation)

Contains the REMIND-specific routines for data and model output manipulation.

Maintained by Renato Rodrigues. Last updated 15 hours ago.

8.89 score 161 scripts 5 dependents

pedrohcgs

DRDID:Doubly Robust Difference-in-Differences Estimators

Implements the locally efficient doubly robust difference-in-differences (DiD) estimators for the average treatment effect proposed by Sant'Anna and Zhao (2020) <doi:10.1016/j.jeconom.2020.06.003>. The estimator combines inverse probability weighting and outcome regression estimators (also implemented in the package) to form estimators with more attractive statistical properties. Two different estimation methods can be used to estimate the nuisance functions.

Maintained by Pedro H. C. SantAnna. Last updated 6 months ago.

cpp

92 stars 8.88 score 133 scripts 5 dependents

usdaforestservice

FIESTA:Forest Inventory Estimation and Analysis

A research estimation tool for analysts that work with sample-based inventory data from the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program.

Maintained by Grayson White. Last updated 5 days ago.

30 stars 8.84 score 62 scripts

pecanproject

PEcAn.workflow:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides workhorse functions that can be used to run the major steps of a PEcAn analysis.

Maintained by David LeBauer. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.84 score 15 scripts 4 dependents

evolecolgroup

tidysdm:Species Distribution Models with Tidymodels

Fit species distribution models (SDMs) using the 'tidymodels' framework, which provides a standardised interface to define models and process their outputs. 'tidysdm' expands 'tidymodels' by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2023) <doi:10.1101/2023.07.24.550358>.

Maintained by Andrea Manica. Last updated 25 days ago.

species-distribution-modelling tidymodels

31 stars 8.82 score 51 scripts

pbiecek

archivist:Tools for Storing, Restoring and Searching for R Objects

Data exploration and modelling is a process in which a lot of data artifacts are produced, such as subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work with, the more artifacts are produced and the harder it is to manage them. Archivist helps to store and manage artifacts created in R. Archivist allows you to store selected artifacts as binary files together with their metadata and relations. Archivist allows you to share artifacts with others, either through a shared folder or GitHub. Archivist allows you to look for already created artifacts by using their class, name, date of creation or other properties, and makes it easy to restore such artifacts. Archivist allows you to check whether a new artifact is an exact copy of one that was produced some time ago. That might be useful either for testing or caching.

Maintained by Przemyslaw Biecek. Last updated 8 months ago.

74 stars 8.81 score 105 scripts 2 dependents

rjournal

rjtools:Preparing, Checking, and Submitting Articles to the 'R Journal'

Create an 'R Journal' 'Rmarkdown' template article that will generate html and pdf versions of your paper. Check that the paper folder has all the required components needed for submission. Examples of 'R Journal' publications can be found at <https://journal.r-project.org>.

Maintained by Di Cook. Last updated 2 months ago.

33 stars 8.81 score 37 scripts 1 dependent

ijlyttle

bsplus:Adds Functionality to the R Markdown + Shiny Bootstrap Framework

The Bootstrap framework lets you add some JavaScript functionality to your web site by adding attributes to your HTML tags - Bootstrap takes care of the JavaScript <https://getbootstrap.com/docs/3.3/javascript/>. If you are using R Markdown or Shiny, you can use these functions to create collapsible sections, accordion panels, modals, tooltips, popovers, and an accordion sidebar framework (not described at the Bootstrap site). Please note this package was designed for Bootstrap 3.3.

Maintained by Ian Lyttle. Last updated 2 years ago.

bootstrap3 rmarkdown shiny

147 stars 8.80 score 295 scripts 15 dependents

pecanproject

PEcAn.data.remote:PEcAn Functions Used for Extracting Remote Sensing Data

PEcAn module for processing remote data. Python module requirements: requests, json, re, ast, pandas, sys. If any of these modules are missing, install using pip install <module name>.

Maintained by Bailey Morrison. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 8.76 score 6 scripts 5 dependents

pecanproject

PEcAn.ED2:PEcAn Package for Integration of ED2 Model

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides functions to link the Ecosystem Demography Model, version 2, to PEcAn.

Maintained by Mike Dietze. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.74 score 145 scripts

adokter

bioRad:Biological Analysis and Visualization of Weather Radar Data

Extract, visualize and summarize aerial movements of birds and insects from weather radar data. See Dokter, A. M. et al. (2018) "bioRad: biological analysis and visualization of weather radar data" <doi:10.1111/ecog.04028> for a software paper describing the package and methodologies.

Maintained by Adriaan M. Dokter. Last updated 4 hours ago.

aeroecology enram eumetnet-opera lifewatch movement-ecology nexrad oscibio radar weather-radar wsr-88d

29 stars 8.73 score 56 scripts

njtierney

brolgar:Browse Over Longitudinal Data Graphically and Analytically in R

Provides a framework of tools to summarise, visualise, and explore longitudinal data. It builds upon the tidy time series data frames used in the 'tsibble' package, and is designed to integrate within the 'tidyverse', and 'tidyverts' (for time series) ecosystems. The methods implemented include calculating features for understanding longitudinal data, including calculating summary statistics such as quantiles, medians, and numeric ranges, sampling individual series, identifying individual series representative of a group, and extending the facet system in 'ggplot2' to facilitate exploration of samples of data. These methods are fully described in the paper "brolgar: An R package to Browse Over Longitudinal Data Graphically and Analytically in R", Nicholas Tierney, Dianne Cook, Tania Prvan (2020) <doi:10.32614/RJ-2022-023>.

Maintained by Nicholas Tierney. Last updated 3 months ago.

109 stars 8.73 score 141 scripts

bioc

CellBench:Construct Benchmarks for Single Cell Analysis Methods

This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.

Maintained by Shian Su. Last updated 5 months ago.

software infrastructure singlecell benchmark bioinformatics

31 stars 8.73 score 98 scripts

ropensci

comtradr:Interface with the United Nations Comtrade API

Interface with and extract data from the United Nations 'Comtrade' API <https://comtradeplus.un.org/>. 'Comtrade' provides country level shipping data for a variety of commodities; these functions allow for easy API queries and return data as a tidy data frame.

Maintained by Paul Bochtler. Last updated 5 months ago.

api comtrade peer-reviewed supply-chain

66 stars 8.67 score 70 scripts

pharmaverse

admiralonco:Oncology Extension Package for ADaM in 'R' Asset Library

Programming oncology specific Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in 'R'. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team (2021), <https://www.cdisc.org/standards/foundational/adam>). The package is an extension package of the 'admiral' package.

Maintained by Stefan Bundfuss. Last updated 2 months ago.

32 stars 8.66 score 30 scripts

rte-antares-rpackage

antaresEditObject:Edit an 'Antares' Simulation

Edit an 'Antares' simulation before running it: create new areas, links, thermal clusters or binding constraints or edit existing ones. Update 'Antares' general & optimization settings. 'Antares' is an open source power system generator; more information is available here: <https://antares-simulator.org/>.

Maintained by Tatiana Vargas. Last updated 1 month ago.

antares-simulation cluster energy monte-carlo-simulation rte

8 stars 8.66 score 101 scripts

open-eo

openeo:Client Interface for 'openEO' Servers

Access data and processing functionalities of 'openEO' compliant back-ends in R.

Maintained by Florian Lahn. Last updated 2 months ago.

openeo openeo-user

65 stars 8.65 score 128 scripts

jniedballa

camtrapR:Camera Trap Data Management and Preparation of Occupancy and Spatial Capture-Recapture Analyses

Management of and data extraction from camera trap data in wildlife studies. The package provides a workflow for storing and sorting camera trap photos (and videos), tabulates records of species and individuals, and creates detection/non-detection matrices for occupancy and spatial capture-recapture analyses with great flexibility. In addition, it can visualise species activity data and provides simple mapping functions with GIS export.

Maintained by Juergen Niedballa. Last updated 4 months ago.

occupancy-modeling spatial-capture-recapture wildlife

35 stars 8.65 score 178 scripts

aloy

HLMdiag:Diagnostic Tools for Hierarchical (Multilevel) Linear Models

A suite of diagnostic tools for hierarchical (multilevel) linear models. The tools include not only leverage and traditional deletion diagnostics (Cook's distance, covratio, covtrace, and MDFFITS) but also convenience functions and graphics for residual analysis. Models can be fit using either lmer in the 'lme4' package or lme in the 'nlme' package.

Maintained by Adam Loy. Last updated 4 years ago.

openblas cpp

17 stars 8.63 score 170 scripts 7 dependents

projectmosaic

mosaicCalc:R-Language Based Calculus Operations for Teaching

Software to support the introductory *MOSAIC Calculus* textbook <https://www.mosaic-web.org/MOSAIC-Calculus/>, one of many data- and modeling-oriented educational resources developed by Project MOSAIC (<https://www.mosaic-web.org/>). Provides symbolic and numerical differentiation and integration, as well as support for applied linear algebra (for data science), and differential equations/dynamics. Includes grammar-of-graphics-based functions for drawing vector fields, trajectories, etc. The software is suitable for general use, but intended mainly for teaching calculus.

Maintained by Daniel Kaplan. Last updated 1 month ago.

13 stars 8.63 score 546 scripts

charlie86

spotifyr:R Wrapper for the 'Spotify' Web API

An R wrapper for pulling data from the 'Spotify' Web API <https://developer.spotify.com/documentation/web-api/> in bulk, or posting items to a 'Spotify' user's playlist.

Maintained by Daniel Antal. Last updated 5 months ago.

music-information-retrieval spotify

375 stars 8.61 score 936 scripts

insightsengineering

random.cdisc.data:Create Random ADaM Datasets

A set of functions to create random Analysis Data Model (ADaM) datasets and cached dataset. ADaM dataset specifications are described by the Clinical Data Interchange Standards Consortium (CDISC) Analysis Data Model Team.

Maintained by Joe Zhu. Last updated 6 months ago.

cdisc dataset

33 stars 8.60 score 52 scripts

robjhyndman

fpp3:Data for "Forecasting: Principles and Practice" (3rd Edition)

All data sets required for the examples and exercises in the book "Forecasting: principles and practice" by Rob J Hyndman and George Athanasopoulos <https://OTexts.com/fpp3/>. All packages required to run the examples are also loaded. Additional data sets not used in the book are also included.

Maintained by Rob Hyndman. Last updated 7 months ago.

data forecasting

142 stars 8.54 score 2.5k scripts

jpquast

protti:Bottom-Up Proteomics and LiP-MS Quality Control and Data Analysis Tools

Useful functions and workflows for proteomics quality control and data analysis of both limited proteolysis-coupled mass spectrometry (LiP-MS) (Feng et al. (2014) <doi:10.1038/nbt.2999>) and regular bottom-up proteomics experiments. Data generated with search tools such as 'Spectronaut', 'MaxQuant' and 'Proteome Discoverer' can be easily used due to the flexibility of the functions.

Maintained by Jan-Philipp Quast. Last updated 5 months ago.

data-analysis lip-ms mass-spectrometry omics protein proteomics systems-biology

63 stars 8.51 score 83 scripts

ropensci

weatherOz:An API Client for Australian Weather and Climate Data Resources

Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). Also provides access to the precis and coastal forecasts of the Australian government's Bureau of Meteorology ('BOM'), and supports downloading and importing radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology ('BOM') and accessed through 'SILO' (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.

Maintained by Rodrigo Pires. Last updated 1 month ago.

dpird bom meteorological-data weather-forecast australia weather weather-data meteorology western-australia australia-bureau-of-meteorology western-australia-agriculture australia-agriculture australia-climate australia-weather api-client climate data rainfall weather-api

31 stars 8.47 score 40 scripts

bmcclintock

momentuHMM:Maximum Likelihood Analysis of Animal Movement Behavior Using Multivariate Hidden Markov Models

Extended tools for analyzing telemetry data using generalized hidden Markov models. Features of momentuHMM (pronounced "momentum") include data pre-processing and visualization, fitting HMMs to location and auxiliary biotelemetry or environmental data, biased and correlated random walk movement models, discrete- or continuous-time HMMs, continuous- or discrete-space movement models, approximate Langevin diffusion models, hierarchical HMMs, multiple imputation for incorporating location measurement error and missing data, user-specified design matrices and constraints for covariate modelling of parameters, random effects, decoding of the state process, visualization of fitted models, model checking and selection, and simulation. See McClintock and Michelot (2018) <doi:10.1111/2041-210X.12995>.

Maintained by Brett McClintock. Last updated 2 months ago.

openblas cpp

43 stars 8.47 score 162 scripts

samuel-marsh

scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing

Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to 1) provide customized visualizations that aid ease of use and produce more aesthetic and functional visuals, and 2) improve the speed/reproducibility of common tasks/pieces of code in scRNA-seq analysis with a single function or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.

Maintained by Samuel Marsh. Last updated 3 months ago.

customization ggplot2 scrna-seq seurat single-cell single-cell-genomics single-cell-rna-seq visualization

246 stars 8.45 score 1.1k scripts

ropensci

weathercan:Download Weather Data from Environment and Climate Change Canada

Provides means for downloading historical weather data from the Environment and Climate Change Canada website (<https://climate.weather.gc.ca/historical_data/search_historic_data_e.html>). Data can be downloaded from multiple stations and over large date ranges and automatically processed into a single dataset. Tools are also provided to identify stations either by name or proximity to a location.

Maintained by Steffi LaZerte. Last updated 7 days ago.

environment-canada peer-reviewed weather-data weather-downloader

106 stars 8.45 score 189 scripts

tidymodels

tidyposterior:Bayesian Analysis to Compare Models using Resampling Statistics

Bayesian analysis is used here to answer the question: "when looking at resampling results, are the differences between models 'real'?" To answer this, a model can be created where the performance statistic is the resampling statistic (e.g. accuracy or RMSE). These values are explained by the model types. In doing this, we can get parameter estimates for each model's effect on performance and make statistical (and practical) comparisons between models. The methods included here are similar to Benavoli et al. (2017) <https://jmlr.org/papers/v18/16-305.html>.

Maintained by Max Kuhn. Last updated 6 months ago.

102 stars 8.44 score 273 scripts

tidyverts

tsibbledata:Diverse Datasets for 'tsibble'

Provides diverse datasets in the 'tsibble' data structure. These datasets are useful for learning and demonstrating how tidy temporal data can be tidied, visualised, and forecasted.

Maintained by Mitchell OHara-Wild. Last updated 5 months ago.

dataset tsibble

25 stars 8.44 score 740 scripts 2 dependents

davidhodge931

ggblanket:Simplify 'ggplot2' Visualisation

Simplify 'ggplot2' visualisation with 'ggblanket' wrapper functions.

Maintained by David Hodge. Last updated 1 day ago.

data-visualisation data-visualization ggplot ggplot-extension ggplot2 ggplot2-enhancements visualisation visualization

173 stars 8.42 score 45 scripts

davisvaughan

almanac:Tools for Working with Recurrence Rules

Provides tools for defining recurrence rules and recurrence sets. Recurrence rules are a programmatic way to define a recurring event, like the first Monday of December. Multiple recurrence rules can be combined into larger recurrence sets. A full holiday and calendar interface is also provided that can generate holidays within a particular year, can detect if a date is a holiday, can respect holiday observance rules, and allows for custom holidays.

Maintained by Davis Vaughan. Last updated 2 years ago.

calendars holidays recurrence-rules

73 stars 8.40 score 65 scripts 1 dependent

atfutures

calendar:Create, Read, Write, and Work with 'iCalendar' Files, Calendars and Scheduling Data

Provides functions to create, read, write, and work with 'iCalendar' files (which typically have '.ics' or '.ical' extensions), and the scheduling data, calendars and timelines of people, organisations and other entities that they represent. 'iCalendar' is an open standard for exchanging calendar and scheduling information between users and computers, described at <https://icalendar.org/>.

Maintained by Robin Lovelace. Last updated 7 months ago.

calendar ical

42 stars 8.39 score 113 scripts 1 dependent

pecanproject

PEcAn.SIPNET:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Mike Dietze. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.37 score 61 scripts

tidymodels

finetune:Additional Functions for Model Tuning

The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <arXiv:1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.

Maintained by Max Kuhn. Last updated 8 months ago.

62 stars 8.36 score 704 scripts 1 dependent

wallaceecomod

wallace:A Modular Platform for Reproducible Modeling of Species Niches and Distributions

The 'shiny' application Wallace is a modular platform for reproducible modeling of species niches and distributions. Wallace guides users through a complete analysis, from the acquisition of species occurrence and environmental data to visualizing model predictions on an interactive map, thus bundling complex workflows into a single, streamlined interface. An extensive vignette, which guides users through most package functionality, can be found on the package's GitHub Pages website: <https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html>.

Maintained by Mary E. Blair. Last updated 25 days ago.

openjdk

133 stars 8.36 score 96 scripts

pecanproject

PEcAn.LINKAGES:PEcAn Package for Integration of the LINKAGES Model

This module provides functions to link the LINKAGES model to PEcAn.

Maintained by Ann Raiho. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.35 score 59 scripts

smouksassi

ggquickeda:Quickly Explore Your Data Using 'ggplot2' and 'table1' Summary Tables

Quickly and easily perform exploratory data analysis by uploading your data as a 'csv' file. Start generating insights using 'ggplot2' plots and 'table1' tables with descriptive stats, all using an easy-to-use point and click 'Shiny' interface.

Maintained by Samer Mouksassi. Last updated 16 days ago.

73 stars 8.34 score 27 scripts

cefet-rj-dal

harbinger:A Unified Time Series Event Detection Framework

By analyzing time series, it is possible to observe significant changes in the behavior of observations that frequently characterize events. Events present themselves as anomalies, change points, or motifs. In the literature, there are several methods for detecting events. However, searching for a suitable time series method is a complex task, especially considering that the nature of events is often unknown. This work presents Harbinger, a framework for integrating and analyzing event detection methods. Harbinger contains several state-of-the-art methods described in Salles et al. (2020) <doi:10.5753/sbbd.2020.13626>.

Maintained by Eduardo Ogasawara. Last updated 4 months ago.

18 stars 8.32 score 216 scripts

business-science

modeltime.ensemble:Ensemble Algorithms for Time Series Forecasting with Modeltime

A 'modeltime' extension that implements time series ensemble forecasting methods including model averaging, weighted averaging, and stacking. These techniques are popular methods to improve forecast accuracy and stability.

Maintained by Matt Dancho. Last updated 9 months ago.

ensemble ensemble-learning forecast forecasting modeltime stacking stacking-ensemble tidymodels time time-series timeseries

77 stars 8.30 score 143 scripts

insightsengineering

chevron:Standard TLGs for Clinical Trials Reporting

Provides standard tables, listings, and graphs (TLGs) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin' and create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. In addition, it provides comprehensive data checks and script generation functionality.

Maintained by Joe Zhu. Last updated 1 month ago.

clinical-trials graphs listings nest reporting tables

14 stars 8.30 score 12 scripts

surveydown-dev

surveydown:Markdown-Based Surveys Using 'Quarto' and 'shiny'

Generate surveys using markdown and R code chunks. Surveys are composed of two files: a survey.qmd 'Quarto' file defining the survey content (pages, questions, etc), and an app.R file defining a 'shiny' app with global settings (libraries, database configuration, etc.) and server configuration options (e.g., conditional skipping / display, etc.). Survey data collected from respondents is stored in a 'PostgreSQL' database. Features include controls for conditional skip logic (skip to a page based on an answer to a question), conditional display logic (display a question based on an answer to a question), a customizable progress bar, and a wide variety of question types, including multiple choice (single choice and multiple choices), select, text, numeric, multiple choice buttons, text area, and dates. Because the surveys render into a 'shiny' app, designers can also leverage the reactive capabilities of 'shiny' to create dynamic and interactive surveys.

Maintained by John Paul Helveston. Last updated 4 days ago.

markdown postgres postgresql quarto shiny shiny-apps shiny-r supabase survey surveys

97 stars 8.29 score 133 scripts

pik-piam

quitte:Bits and pieces of code to use with quitte-style data frames

A collection of functions for easily dealing with quitte-style data frames, doing multi-model comparisons and plots.

Maintained by Falk Benke. Last updated 7 days ago.

8.26 score 184 scripts 35 dependents

radiant-rstats

radiant.data:Data Menu for Radiant: Business Analytics using R and Shiny

The Radiant Data menu includes interfaces for loading, saving, viewing, visualizing, summarizing, transforming, and combining data. It also contains functionality to generate reproducible reports of the analyses conducted in the application.

Maintained by Vincent Nijs. Last updated 5 months ago.

53 stars 8.25 score 146 scripts 6 dependents

openvolley

datavolley:Reading and Analyzing DataVolley Scout Files

Provides functions for parsing and working with volleyball match files in DataVolley format.

Maintained by Ben Raymond. Last updated 2 months ago.

openvolley sports-analytics volleyball

31 stars 8.24 score 94 scripts 11 dependents

r-dbi

DBItest:Testing DBI Backends

A helper that tests DBI back ends for conformity to the interface.

Maintained by Kirill Müller. Last updated 15 days ago.

database testing

24 stars 8.21 score 11 scripts

robjhyndman

demography:Forecasting Mortality, Fertility, Migration and Population Data

Functions for demographic analysis including lifetable calculations; Lee-Carter modelling; functional data analysis of mortality rates, fertility rates, net migration numbers; and stochastic population forecasting.

Maintained by Rob Hyndman. Last updated 4 months ago.

actuarial demography forecasting

74 stars 8.21 score 241 scripts 6 dependents

sticsrpacks

SticsRFiles:Read and Modify 'STICS' Input/Output Files

Manipulating input and output files of the 'STICS' crop model. Files are either 'JavaSTICS' XML files or text files used by the model 'fortran' executable. Most basic functionalities are reading or writing parameter names and values in both XML or text input files, and getting data from output files. Advanced functionalities include XML files generation from XML templates and/or spreadsheets, or text files generation from XML files by using 'xslt' transformation.

Maintained by Patrice Lecharpentier. Last updated 1 month ago.

4 stars 8.21 score 124 scripts

ropensci

FedData:Download Geospatial Data Available from Several Federated Data Sources

Download geospatial data available from several federated data sources (mainly sources maintained by the US Federal government). Currently, the package enables extraction from nine datasets: The National Elevation Dataset digital elevation models (<https://www.usgs.gov/3d-elevation-program> 1 and 1/3 arc-second; USGS); The National Hydrography Dataset (<https://www.usgs.gov/national-hydrography/national-hydrography-dataset>; USGS); The Soil Survey Geographic (SSURGO) database from the National Cooperative Soil Survey (<https://websoilsurvey.sc.egov.usda.gov/>; NCSS), which is led by the Natural Resources Conservation Service (NRCS) under the USDA; the Global Historical Climatology Network (<https://www.ncei.noaa.gov/products/land-based-station/global-historical-climatology-network-daily>; GHCN), coordinated by the National Climatic Data Center at NOAA; the Daymet gridded estimates of daily weather parameters for North America, version 4, available from the Oak Ridge National Laboratory's Distributed Active Archive Center (<https://daymet.ornl.gov/>; DAAC); the International Tree Ring Data Bank; the National Land Cover Database (<https://www.mrlc.gov/>; NLCD); the Cropland Data Layer from the National Agricultural Statistics Service (<https://www.nass.usda.gov/Research_and_Science/Cropland/SARS1a.php>; NASS); and the PAD-US dataset of protected area boundaries (<https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-data-overview>; USGS).

Maintained by R. Kyle Bocinsky. Last updated 4 months ago.

peer-reviewed

100 stars 8.20 score 364 scripts

darwin-eu

DrugUtilisation:Summarise Patient-Level Drug Utilisation in Data Mapped to the OMOP Common Data Model

Summarise patient-level drug utilisation cohorts using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. New users and prevalent users cohorts can be generated and their characteristics, indication and drug use summarised.

Maintained by Martí Català. Last updated 2 months ago.

8.20 score 156 scripts 2 dependents

davidsjoberg

hablar:Non-Astonishing Results in R

Simple tools for converting columns to new data types. Intuitive functions for columns with missing values.

Maintained by David Sjoberg. Last updated 2 years ago.

59 stars 8.20 score 468 scripts

nixtla

nixtlar:A Software Development Kit for 'Nixtla''s 'TimeGPT'

A Software Development Kit for working with 'Nixtla''s 'TimeGPT', a foundation model for time series forecasting. 'API' is an acronym for 'application programming interface'; this package allows users to interact with 'TimeGPT' via the 'API'. You can set and validate 'API' keys and generate forecasts via 'API' calls. It is compatible with 'tsibble' and base R. For more details visit <https://docs.nixtla.io/>.

Maintained by Mariana Menchero. Last updated 1 month ago.

32 stars 8.19 score 38 scripts

bioc

POMA:Tools for Omics Data Analysis

The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. For more details, see https://github.com/pcastellanoescuder/POMAShiny and Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148>.

Maintained by Pol Castellano-Escuder. Last updated 4 months ago.

batcheffect classification clustering decisiontree dimensionreduction multidimensionalscaling normalization preprocessing principalcomponent regression rnaseq software statisticalmethod visualization bioconductor bioinformatics data-visualization dimension-reduction exploratory-data-analysis machine-learning omics-data-integration pipeline pre-processing statistical-analysis user-friendly workflow

11 stars 8.16 score 20 scripts 1 dependent

aphalo

ggspectra:Extensions to 'ggplot2' for Radiation Spectra

Additional annotations, stats, geoms and scales for plotting "light" spectra with 'ggplot2', together with specializations of ggplot() and autoplot() methods for spectral data and waveband definitions stored in objects of classes defined in package 'photobiology'. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 2 days ago.

dataviz ggplot2-autoplot ggplot2-enhancementes ggplot2-geoms ggplot2-scales ggplot2-stats light r4photobiology-suite radiation spectra

5 stars 8.15 score 390 scripts 1 dependent

robjhyndman

cricketdata:International Cricket Data

Data on international and other major cricket matches from ESPNCricinfo <https://www.espncricinfo.com> and Cricsheet <https://cricsheet.org>. This package provides some functions to download the data into tibbles ready for analysis.

Maintained by Rob Hyndman. Last updated 7 days ago.

cricket cricket-data ozunconf17 unconf

88 stars 8.14 score 87 scripts

nceas

metajam:Easily Download Data and Metadata from 'DataONE'

A set of tools to foster the development of reproducible analytical workflows by simplifying the download of data and metadata from 'DataONE' (<https://www.dataone.org>) and easily importing this information into R.

Maintained by Julien Brun. Last updated 7 months ago.

data data-analysis metadata repositories

16 stars 8.13 score 75 scripts

pecanproject

PEcAnAssimSequential:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Mike Dietze. Last updated 5 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

216 stars 8.13 score 35 scripts

evolecolgroup

pastclim:Manipulate Time Series of Climate Reconstructions

Methods to easily extract and manipulate climate reconstructions for ecological and anthropological analyses, as described in Leonardi et al. (2023) <doi:10.1111/ecog.06481>. The package includes datasets of palaeoclimate reconstructions, present observations, and future projections from multiple climate models.

Maintained by Andrea Manica. Last updated 19 days ago.

climate-data paleoclimate species-distribution-modelling

38 stars 8.12 score 49 scripts

pik-piam

mip:Comparison of multi-model runs

Package contains generic functions to produce comparison plots of multi-model runs.

Maintained by David Klein. Last updated 15 hours ago.

1 star 8.11 score 70 scripts 21 dependents

ramiromagno

gwasrapidd:'REST' 'API' Client for the 'NHGRI'-'EBI' 'GWAS' Catalog

'GWAS' R 'API' Data Download. This package provides easy access to the 'NHGRI'-'EBI' 'GWAS' Catalog data by accessing the 'REST' 'API' <https://www.ebi.ac.uk/gwas/rest/docs/api/>.

Maintained by Ramiro Magno. Last updated 1 year ago.

thirdpartyclient biomedicalinformatics genomewideassociation snp association-studies gwas-catalog human rest-client trait trait-ontology

95 stars 8.10 score 49 scripts 1 dependent

tychobra

polished:Authentication and Hosting for 'shiny' Apps

Authentication, user administration, hosting, and additional infrastructure for 'shiny' apps. See <https://polished.tech> for additional documentation and examples.

Maintained by Andy Merlino. Last updated 12 days ago.

233 stars 8.09 score 75 scripts

chop-cgtinformatics

REDCapTidieR:Extract 'REDCap' Databases into Tidy 'Tibble's

Convert 'REDCap' exports into tidy tables for easy handling of 'REDCap' repeat instruments and event arms.

Maintained by Richard Hanna. Last updated 11 days ago.

redcap redcap-api tidy-data

35 stars 8.08 score 36 scripts

luomus

finbif:Interface for the 'Finnish Biodiversity Information Facility' API

A programmatic interface to the 'Finnish Biodiversity Information Facility' ('FinBIF') API (<https://api.laji.fi>). 'FinBIF' aggregates Finnish biodiversity data from multiple sources in a single open access portal for researchers, citizen scientists, industry and government. 'FinBIF' allows users of biodiversity information to find, access, combine and visualise data on Finnish plants, animals and microorganisms. The 'finbif' package makes the publicly available data in 'FinBIF' easily accessible to programmers. Biodiversity information is available on taxonomy and taxon occurrence. Occurrence data can be filtered by taxon, time, location and other variables. The data accessed are conveniently preformatted for subsequent analyses.

Maintained by William K. Morris. Last updated 12 days ago.

api biodiversity biodiversity-informatics biodiversity-information finbif finbif-access occurrences r-programming species specimens taxon taxonomy web-services

5 stars 8.07 score 42 scripts 3 dependents

bayes-rules

bayesrules:Datasets and Supplemental Functions from Bayes Rules! Book

Provides datasets and functions used for analysis and visualizations in the Bayes Rules! book (<https://www.bayesrulesbook.com>). The package contains a set of functions that summarize and plot Bayesian models from some conjugate families and another set of functions for evaluation of some Bayesian models.

Maintained by Mine Dogucu. Last updated 3 years ago.

bayesian-statistics data

72 stars 8.06 score 466 scripts

cran

epiR:Tools for the Analysis of Epidemiological Data

Tools for the analysis of epidemiological and surveillance data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, computation of confidence intervals around incidence risk and incidence rate estimates and sample size calculations for cross-sectional, case-control and cohort studies. Surveillance tools include functions to calculate an appropriate sample size for 1- and 2-stage representative freedom surveys, functions to estimate surveillance system sensitivity and functions to support scenario tree modelling analyses.

Maintained by Mark Stevenson. Last updated 2 months ago.

10 stars 8.06 score 10 dependents

radiant-rstats

radiant:Business Analytics using R and Shiny

A platform-independent browser-based interface for business analytics in R, based on the shiny package. The application combines the functionality of 'radiant.data', 'radiant.design', 'radiant.basics', 'radiant.model', and 'radiant.multivariate'.

Maintained by Vincent Nijs. Last updated 11 months ago.

460 stars 8.02 score 228 scripts

mazamascience

MazamaSpatialUtils:Spatial Data Download and Utility Functions

A suite of conversion functions to create internally standardized spatial polygons data frames. Utility functions use these data sets to return values such as country, state, time zone, watershed, etc. associated with a set of longitude/latitude pairs. (They also make cool maps.)

Maintained by Jonathan Callahan. Last updated 5 months ago.

5 stars 8.01 score 282 scripts 2 dependents

scasanova

f1dataR:Access Formula 1 Data

Obtain Formula 1 data via the 'Jolpica API' <https://jolpi.ca> and the unofficial API <https://www.formula1.com/en/timing/f1-live> via the 'fastf1' 'Python' library <https://docs.fastf1.dev/>.

Maintained by Santiago Casanova. Last updated 5 days ago.

f1 formula1 sports-data

60 stars 7.97 score 26 scripts

ropensci

osmplotr:Bespoke Images of 'OpenStreetMap' Data

Bespoke images of 'OpenStreetMap' ('OSM') data and data visualisation using 'OSM' objects.

Maintained by Mark Padgham. Last updated 1 month ago.

data-visualisation highlighting-clusters openstreetmap osm overpass overpass-api peer-reviewed

139 stars 7.97 score 80 scripts

brian-j-smith

MachineShop:Machine Learning Models and Tools

Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.

Maintained by Brian J Smith. Last updated 7 months ago.

classification-models machine-learning predictive-modeling regression-models survival-models

62 stars 7.95 score 121 scripts

pharmaverse

admiralophtha:ADaM in R Asset Library - Ophthalmology

Aids the programming of Clinical Data Standards Interchange Consortium (CDISC) compliant Ophthalmology Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam/adamig-v1-3-release-package>).

Maintained by Edoardo Mancini. Last updated 3 months ago.

15 stars 7.94 score 10 scripts

bcallaway11

BMisc:Miscellaneous Functions for Panel Data, Quantiles, and Printing Results

These are miscellaneous functions for working with panel data, quantiles, and printing results. For panel data, the package includes functions for making panel data balanced (that is, dropping individuals that have missing observations in any time period), converting id numbers to row numbers, and treating repeated cross sections as panel data under the assumption of rank invariance. For quantiles, there are functions to make distribution functions from a set of data points (this is particularly useful when a distribution function is created in several steps), to combine distribution functions based on some external weights, and to invert distribution functions. Finally, there are several other miscellaneous functions for obtaining weighted means, weighted distribution functions, and weighted quantiles; to generate summary statistics and their differences for two groups; and to add or drop covariates from formulas.

Maintained by Brantly Callaway. Last updated 2 months ago.

cpp

7 stars 7.92 score 110 scripts 8 dependents

ropenspain

spanishoddata:Get Spanish Origin-Destination Data

Gain seamless access to origin-destination (OD) data from the Spanish Ministry of Transport, hosted at <https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad>. This package simplifies the management of these large datasets by providing tools to download zone boundaries, handle associated origin-destination data, and process it efficiently with the 'duckdb' database interface. Local caching minimizes repeated downloads, streamlining workflows for researchers and analysts. Extensive documentation is available at <https://ropenspain.github.io/spanishoddata/index.html>, offering guides on creating static and dynamic mobility flow visualizations and transforming large datasets into analysis-ready formats.

Maintained by Egor Kotov. Last updated 11 days ago.

cdr data data-package mobile-telephone-data mobility origin-destination

35 stars 7.92 score 14 scripts

tlverse

tmle3:The Extensible TMLE Framework

A general framework supporting the implementation of targeted maximum likelihood estimators (TMLEs) of a diverse range of statistical target parameters through a unified interface. The goal is that the exposed framework be as general as the mathematical framework upon which it draws.

Maintained by Jeremy Coyle. Last updated 5 months ago.

causal-inference machine-learning targeted-learning variable-importance

38 stars 7.91 score 286 scripts 5 dependents

ohdsi

CohortGenerator:Cohort Generation for the OMOP Common Data Model

Generate cohorts and subsets using an Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) Database. Cohorts are defined using 'CIRCE' (<https://github.com/ohdsi/circe-be>) or SQL compatible with 'SqlRender' (<https://github.com/OHDSI/SqlRender>).

Maintained by Anthony Sena. Last updated 6 months ago.

hades openjdk

13 stars 7.91 score 165 scripts

pik-piam

magpie4:MAgPIE outputs R package for MAgPIE version 4.x

Common output routines for extracting results from the MAgPIE framework (versions 4.x).

Maintained by Benjamin Leon Bodirsky. Last updated 20 hours ago.

2 stars 7.90 score 254 scripts 9 dependents

myles-lewis

nestedcv:Nested Cross-Validation with 'glmnet' and 'caret'

Implements nested k*l-fold cross-validation for lasso and elastic-net regularised linear models via the 'glmnet' package and other machine learning models via the 'caret' package <doi:10.1093/bioadv/vbad048>. Cross-validation of 'glmnet' alpha mixing parameter and embedded fast filter functions for feature selection are provided. Described as double cross-validation by Stone (1977) <doi:10.1111/j.2517-6161.1977.tb01603.x>. Also implemented is a method using outer CV to measure unbiased model performance metrics when fitting Bayesian linear and logistic regression shrinkage models using the horseshoe prior over parameters to encourage a sparse model as described by Piironen & Vehtari (2017) <doi:10.1214/17-EJS1337SI>.

Maintained by Myles Lewis. Last updated 12 days ago.

12 stars 7.90 score 46 scripts