R-universe search: edgar

gunratan

edgar:Tool for the U.S. SEC EDGAR Retrieval and Parsing of Corporate Filings

In the USA, companies file different forms with the U.S. Securities and Exchange Commission (SEC) through EDGAR (Electronic Data Gathering, Analysis, and Retrieval system). The EDGAR automated database system collects all the necessary filings and makes them publicly available. This package facilitates retrieving, storing, searching, and parsing all the available filings on the EDGAR server. It downloads filings from the SEC server in bulk with a single query. Additionally, it provides various useful functions: it extracts 8-K triggering events, extracts the "Business (Item 1)" and "Management's Discussion and Analysis (Item 7)" sections of annual statements, searches filings for desired keywords, provides sentiment measures, parses filing header information, and provides an HTML view of SEC filings.

Maintained by Gunratan Lonare. Last updated 9 days ago.

67.7 match 10 stars 2.79 score 61 scripts

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 10 days ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

9.0 match 959 stars 15.16 score 4.0k scripts 21 dependents

ecmerkle

blavaan:Bayesian Latent Variable Analysis

Fit a variety of Bayesian latent variable models, including confirmatory factor analysis, structural equation models, and latent growth curve models. References: Merkle & Rosseel (2018) <doi:10.18637/jss.v085.i04>; Merkle et al. (2021) <doi:10.18637/jss.v100.i06>.

Maintained by Edgar Merkle. Last updated 5 days ago.

bayesian-statistics factor-analysis growth-curve-models latent-variables missing-data multilevel-models multivariate-analysis path-analysis psychometrics statistical-modeling structural-equation-modeling cpp

9.0 match 92 stars 10.84 score 183 scripts 3 dependents

mlverse

chattr:Interact with Large Language Models in 'RStudio'

Enables user interactivity with large-language models ('LLM') inside the 'RStudio' integrated development environment (IDE). The user can interact with the model using the 'shiny' app included in this package, or directly in the 'R' console. It comes with back-ends for 'OpenAI', 'GitHub' 'Copilot', and 'LlamaGPT'.

Maintained by Edgar Ruiz. Last updated 2 months ago.

9.2 match 215 stars 10.55 score 71 scripts 1 dependent

qpsy

nonnest2:Tests of Non-Nested Models

Testing non-nested models via theory supplied by Vuong (1989) <DOI:10.2307/1912557>. Includes tests of model distinguishability and of model fit that can be applied to both nested and non-nested models. Also includes functionality to obtain confidence intervals associated with AIC and BIC. This material is partially based on work supported by the National Science Foundation under Grant Number SES-1061334.

Maintained by Edgar Merkle. Last updated 7 months ago.

9.1 match 6 stars 9.01 score 35 scripts 8 dependents

mwaldstein

edgarWebR:SEC Filings Access

A set of methods to access and parse live filing information from the U.S. Securities and Exchange Commission (SEC - <https://www.sec.gov/>) including company and fund filings along with all associated metadata.

Maintained by Micah J Waldstein. Last updated 4 years ago.

edgar fund-search xbrl

11.0 match 78 stars 6.53 score 43 scripts

mlverse

mall:Run Multiple Large Language Model Predictions Against a Table, or Vectors

Run multiple 'Large Language Model' predictions against a table. The predictions run row-wise over a specified column. It works using a one-shot prompt, along with the current row's content. The prompt that is used will depend on the type of analysis needed.

Maintained by Edgar Ruiz. Last updated 3 months ago.

data-science dplyr llm polars python

9.2 match 86 stars 6.61 score 94 scripts

rstudio

connections:Integrates with the 'RStudio' Connections Pane and 'pins'

Enables 'DBI' compliant packages to integrate with the 'RStudio' connections pane, and the 'pins' package. It automates the display of schemata, tables, views, as well as the preview of the table's top 1000 records.

Maintained by Edgar Ruiz. Last updated 1 year ago.

connection-pane database-connection pins rstudio

9.2 match 57 stars 6.50 score 124 scripts 1 dependent

r-spark

sparklyr.flint:Sparklyr Extension for 'Flint'

This sparklyr extension makes 'Flint' time series library functionalities (<https://github.com/twosigma/flint>) easily accessible through R.

Maintained by Edgar Ruiz. Last updated 3 years ago.

apache-spark data-analysis data-mining data-science distributed distributed-computing flint remote-clusters spark sparklyr statistical-analysis statistics stats summarization summary-statistics time-series time-series-analysis twosigma-flint

9.1 match 9 stars 6.46 score 54 scripts

edgararuiz

dbplot:Simplifies Plotting Data Inside Databases

Leverages 'dplyr' to process the calculations of a plot inside a database. This package provides helper functions that abstract the work at three levels: outputs a 'ggplot', outputs the calculations, outputs the formula needed to calculate bins.

Maintained by Edgar Ruiz. Last updated 5 years ago.

9.4 match 9 stars 5.77 score 186 scripts

indrajeetpatil

ggstatsplot:'ggplot2' Based Plots with Statistical Details

Extension of 'ggplot2', 'ggstatsplot' creates graphics with details from statistical tests included in the plots themselves. It provides an easier syntax to generate information-rich plots for statistical analysis of continuous (violin plots, scatterplots, histograms, dot plots, dot-and-whisker plots) or categorical (pie and bar charts) data. Currently, it supports the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian versions of t-test/ANOVA, correlation analyses, contingency table analysis, meta-analysis, and regression analyses. References: Patil (2021) <doi:10.21105/joss.03236>.

Maintained by Indrajeet Patil. Last updated 20 days ago.

bayes-factors datascience dataviz effect-size ggplot-extension hypothesis-testing non-parametric-statistics regression-models statistical-analysis

3.5 match 2.1k stars 14.49 score 3.0k scripts 1 dependent

edgarsantos-fernandez

SSNbayes:Bayesian Spatio-Temporal Analysis in Stream Networks

Fits Bayesian spatio-temporal models and makes predictions on stream networks using the approach by Santos-Fernandez, Edgar, et al. (2022). "Bayesian spatio-temporal models for stream networks" and Santos-Fernandez, Edgar, et al. (2023). "SSNbayes: An R Package for Bayesian Spatio-Temporal Modelling on Stream Networks". In these models, spatial dependence is captured using stream distance and flow connectivity, while temporal autocorrelation is modelled using vector autoregression methods.

Maintained by Edgar Santos-Fernandez. Last updated 2 months ago.

9.1 match 17 stars 5.41 score 6 scripts

mlverse

pysparklyr:Provides a 'PySpark' Back-End for the 'sparklyr' Package

It enables 'sparklyr' to integrate with 'Spark Connect', and 'Databricks Connect' by providing a wrapper over the 'PySpark' 'python' library.

Maintained by Edgar Ruiz. Last updated 4 days ago.

databricks pyspark spark spark-connect

9.2 match 15 stars 5.33 score 13 scripts

cienciadedatos

datos:Translates Various Practice Datasets into Spanish

Provides a Spanish-translated version of the following datasets: 'airlines', 'airports', 'AwardsManagers', 'babynames', 'Batting', 'credit_data', 'diamonds', 'faithful', 'fueleconomy', 'Fielding', 'flights', 'gapminder', 'gss_cat', 'iris', 'Managers', 'mpg', 'mtcars', 'atmos', 'palmerpenguins', 'People', 'Pitching', 'planes', 'presidential', 'table1', 'table2', 'table3', 'table4a', 'table4b', 'table5', 'vehicles', 'weather', 'who'.

Maintained by Riva Quiroga. Last updated 1 year ago.

4.9 match 48 stars 8.12 score 354 scripts

atmoschem

eixport:Export Emissions to Atmospheric Models

Emissions are the mass of pollutants released into the atmosphere. Air quality models need emissions data, with spatial and temporal distribution, to represent air pollutant concentrations. This package, eixport, creates inputs for the air quality models 'WRF-Chem' Grell et al (2005) <doi:10.1016/j.atmosenv.2005.04.027>, 'MUNICH' Kim et al (2018) <doi:10.5194/gmd-11-611-2018>, 'BRAMS-SPM' Freitas et al (2005) <doi:10.1016/j.atmosenv.2005.07.017> and 'RLINE' Snyder et al (2013) <doi:10.1016/j.atmosenv.2013.05.074>. See the 'eixport' website (<https://atmoschem.github.io/eixport/>) for more information, documentation and examples. More details in Ibarra-Espinosa et al (2018) <doi:10.21105/joss.00607>.

Maintained by Sergio Ibarra-Espinosa. Last updated 26 days ago.

atmospheric-models atmospheric-science emissions exporting-emissions wrf

6.3 match 28 stars 6.31 score 97 scripts

indrajeetpatil

statsExpressions:Tidy Dataframes and Expressions with Statistical Details

Utilities for producing dataframes with rich details for the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian t-test, one-way ANOVA, correlation analyses, contingency table analyses, and meta-analyses. The functions are pipe-friendly and provide a consistent syntax to work with tidy data. These dataframes additionally contain expressions with statistical details, and can be used in graphing packages. This package also forms the statistical processing backend for 'ggstatsplot'. References: Patil (2021) <doi:10.21105/joss.03236>.

Maintained by Indrajeet Patil. Last updated 21 days ago.

bayesian-inference bayesian-statistics contingency-table correlation effectsize meta-analysis parametric robust robust-statistics statistical-details statistical-tests tidy

3.5 match 312 stars 10.97 score 146 scripts 2 dependents

r-spark

sparkwarc:Load WARC Files into Apache Spark

Load WARC (Web ARChive) files into Apache Spark using 'sparklyr'. This allows reading files from the Common Crawl project <http://commoncrawl.org/>.

Maintained by Edgar Ruiz. Last updated 3 years ago.

zlib cpp

9.1 match 13 stars 3.89 score 12 scripts

gerardgimenezadsuar

tidyedgar:Tidy Fundamental Financial Data from 'SEC's 'EDGAR' 'API'

Streamline the process of accessing fundamental financial data from the United States Securities and Exchange Commission's ('SEC') Electronic Data Gathering, Analysis, and Retrieval system ('EDGAR') 'API' <https://www.sec.gov/edgar/sec-api-documentation>, transforming it into a tidy, analysis-ready format.

Maintained by Gerard Gimenez-Adsuar. Last updated 1 year ago.

8.9 match 13 stars 3.81 score 3 scripts

tidyverse

dbplyr:A 'dplyr' Back End for Databases

A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.

Maintained by Hadley Wickham. Last updated 3 months ago.

database

1.6 match 481 stars 19.72 score 5.2k scripts 736 dependents

sewardlee337

finreportr:Financial Data from U.S. Securities and Exchange Commission

Download and display company financial data from the U.S. Securities and Exchange Commission's EDGAR database. It contains a suite of functions with web scraping and XBRL parsing capabilities that allow users to extract data from EDGAR in an automated and scalable manner. See <https://www.sec.gov/edgar/searchedgar/companysearch.html> for more information.

Maintained by Seward Lee. Last updated 3 years ago.

balance-sheet cash-flow finance financial-data financial-statement financial-statements income-statement sec stock-ticker-symbol

4.7 match 131 stars 6.28 score 29 scripts

dataobservatory-eu

dataset:Create Data Frames that are Easier to Exchange and Reuse

The aim of the 'dataset' package is to make tidy datasets easier to release, exchange and reuse. It organizes and formats data frame 'R' objects into well-referenced, well-described, interoperable datasets in a release- and reuse-ready form.

Maintained by Daniel Antal. Last updated 21 days ago.

dataset metadata-management

3.8 match 15 stars 7.81 score 76 scripts 1 dependent

cienciadedatos

dados:Translate Datasets to Portuguese

This package translates the following datasets into Portuguese: 'airlines', 'airports', 'ames_raw', 'AwardsManagers', 'babynames', 'Batting', 'diamonds', 'faithful', 'fueleconomy', 'Fielding', 'flights', 'gapminder', 'gss_cat', 'iris', 'Managers', 'mpg', 'mtcars', 'atmos', 'penguins', 'People', 'Pitching', 'pixarfilms', 'planes', 'presidential', 'table1', 'table2', 'table3', 'table4a', 'table4b', 'table5', 'vehicles', 'weather', 'who'.

Maintained by Riva Quiroga. Last updated 7 months ago.

3.3 match 46 stars 7.13 score 266 scripts

lightbluetitan

MedDataSets:Comprehensive Medical, Disease, Treatment, and Drug Datasets

Provides an extensive collection of datasets related to medicine, diseases, treatments, drugs, and public health. This package covers topics such as drug effectiveness, vaccine trials, survival rates, infectious disease outbreaks, and medical treatments. The included datasets span various health conditions, including AIDS, cancer, bacterial infections, and COVID-19, along with information on pharmaceuticals and vaccines. These datasets are sourced from the R ecosystem and other R packages, remaining unaltered to ensure data integrity. This package serves as a valuable resource for researchers, analysts, and healthcare professionals interested in conducting medical and public health data analysis in R.

Maintained by Renzo Caceres Rossi. Last updated 5 months ago.

3.8 match 8 stars 5.68 score 60 scripts

simsem

semTools:Useful Tools for Structural Equation Modeling

Provides miscellaneous tools for structural equation modeling, many of which extend the 'lavaan' package. For example, latent interactions can be estimated using product indicators (Lin et al., 2010, <doi:10.1080/10705511.2010.488999>) and simple effects probed; analytical power analyses can be conducted (Jak et al., 2021, <doi:10.3758/s13428-020-01479-0>); and scale reliability can be estimated based on estimated factor-model parameters.

Maintained by Terrence D. Jorgensen. Last updated 4 days ago.

1.5 match 79 stars 13.74 score 1.1k scripts 31 dependents

talgalili

dendextend:Extending 'dendrogram' Functionality in R

Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc. of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.

Maintained by Tal Galili. Last updated 2 months ago.

1.2 match 154 stars 17.02 score 6.0k scripts 164 dependents

edgararuiz

sparkxgb:Interface for 'XGBoost' on 'Apache Spark'

A 'sparklyr' <https://spark.posit.co/> extension that provides an R interface for 'XGBoost' <https://github.com/dmlc/xgboost> on 'Apache Spark'. 'XGBoost' is an optimized distributed gradient boosting library.

Maintained by Edgar Ruiz. Last updated 11 months ago.

9.1 match 2.23 score 34 scripts

tidymodels

probably:Tools for Post-Processing Predicted Values

Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.

Maintained by Max Kuhn. Last updated 5 months ago.

1.6 match 115 stars 12.09 score 21k scripts 1 dependent

tidymodels

tidypredict:Run Predictions Inside the Database

It parses a fitted 'R' model object, and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several database back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(), cubist(), and ctree() models.

Maintained by Emil Hvitfeldt. Last updated 3 months ago.

dbplyr dplyr purrr rlang

1.7 match 261 stars 11.03 score 241 scripts 2 dependents

iangow

farr:Data and Code for Financial Accounting Research

Handy functions and data to support a course book for accounting research. Gow, Ian D. and Tongqing Ding (2024) 'Empirical Research in Accounting: Tools and Methods' <https://iangow.github.io/far_book/>.

Maintained by Ian Gow. Last updated 1 month ago.

accounting finance

3.4 match 17 stars 5.05 score 66 scripts

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

1.9 match 3 stars 8.20 score 7.8k scripts 11 dependents

dwarton

ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)

Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.

Maintained by David Warton. Last updated 1 year ago.

2.3 match 8 stars 6.58 score 53 scripts

tidymodels

modeldb:Fits Models Inside the Database

Uses 'dplyr' and 'tidyeval' to fit statistical models inside the database. It currently supports KMeans and linear regression models.

Maintained by Max Kuhn. Last updated 1 year ago.

database dbplyr dplyr ggplot2 modeling rlang sql tidyeval visualization

1.7 match 79 stars 7.59 score 62 scripts

pharmaverse

sdtmchecks:Data Quality Checks for Study Data Tabulation Model (SDTM) Datasets

A series of checks to identify common issues in Study Data Tabulation Model (SDTM) datasets. These checks are intended to be generalizable, actionable, and meaningful for analysis.

Maintained by Will Harris. Last updated 3 months ago.

1.6 match 21 stars 7.66 score 15 scripts

jaredrecords

Autoseed:Retrieve Disease-Related Genes from Public Sources

This program helps researchers quickly and comprehensively acquire disease-related genes, so as to understand the mechanisms of disease. The data are integrated from three public databases: 'eDGAR', 'DrugBank' and 'MalaCards'. 'eDGAR' is a comprehensive database containing data on the relationship between diseases and genes. 'DrugBank' contains information on 13443 drugs and 5157 targets. 'MalaCards' integrates human disease information, including disease-related genes.

Maintained by Jiawei Wu. Last updated 5 years ago.

8.0 match 1.48 score

gateslab

gimme:Group Iterative Multiple Model Estimation

Data-driven approach for arriving at person-specific time series models. The method first identifies which relations replicate across the majority of individuals to detect signal from noise. These group-level relations are then used as a foundation for starting the search for person-specific (or individual-level) relations. See Gates & Molenaar (2012) <doi:10.1016/j.neuroimage.2012.06.026>.

Maintained by Kathleen M Gates. Last updated 6 months ago.

1.5 match 26 stars 7.71 score 53 scripts

cran

optimStrat:Choosing the Sample Strategy

Intended to assist in the choice of the sampling strategy to implement in a survey.

Maintained by Edgar Bueno. Last updated 2 years ago.

10.4 match 1.00 score

gaborcsardi

rcorpora:A Collection of Small Text Corpora of Interesting Data

A collection of small text corpora of interesting data. It contains all data sets from 'dariusk/corpora'. Some examples: names of animals: birds, dinosaurs, dogs; foods: beer categories, pizza toppings; geography: English towns, rivers, oceans; humans: authors, US presidents, occupations; science: elements, planets; words: adjectives, verbs, proverbs, US president quotes.

Maintained by Gábor Csárdi. Last updated 7 years ago.

1.5 match 42 stars 6.83 score 36 scripts 2 dependents

lem-usp

evolqg:Evolutionary Quantitative Genetics

Provides functions for covariance matrix comparisons, estimation of repeatabilities in measurements and matrices, and general evolutionary quantitative genetics tools. Melo D, Garcia G, Hubbe A, Assis A P, Marroig G. (2016) <doi:10.12688/f1000research.7082.3>.

Maintained by Diogo Melo. Last updated 11 months ago.

openblas cpp

1.5 match 10 stars 6.26 score 114 scripts

rstudio

rscontract:Generic implementation of the 'RStudio' connections contract

Provides a generic implementation of the 'RStudio' connection contract to make it easier for database connections, and other types of connections, opened via R packages to integrate with the connections pane inside the 'RStudio' integrated development environment (IDE).

Maintained by Nathan Stephens. Last updated 4 years ago.

connections-pane rstudio

1.7 match 22 stars 5.12 score 4 scripts 2 dependents

bioc

muscle:Multiple Sequence Alignment with MUSCLE

MUSCLE performs multiple sequence alignments of nucleotide or amino acid sequences.

Maintained by Alex T. Kalinka. Last updated 5 months ago.

multiplesequencealignment alignment sequencing genetics sequencematching dataimport cpp

1.7 match 5.21 score 81 scripts

nctingwang

merDeriv:Case-Wise and Cluster-Wise Derivatives for Mixed Effects Models

Compute case-wise and cluster-wise derivatives for mixed effects models with respect to fixed effects parameters, random effect (co)variances, and residual variance. This material is partially based on work supported by the National Science Foundation under Grant Number 1460719.

Maintained by Ting Wang. Last updated 3 years ago.

1.6 match 3 stars 5.45 score 71 scripts 2 dependents

edgarsantos-fernandez

MPCI:Multivariate Process Capability Indices (MPCI)

It computes the following Multivariate Process Capability Indices: Shahriari et al. (1995) Multivariate Capability Vector, Taam et al. (1993) Multivariate Capability Index (MCpm), Pan and Lee (2010) proposal (NMCpm), and the following indices based on Principal Component Analysis (PCA): Wang and Chen (1998), Xekalaki and Perakis (2002) and Wang (2005). Two datasets are included.

Maintained by Edgar Santos-Fernandez. Last updated 9 years ago.

8.5 match 1.00 score 4 scripts

bfast2

strucchangeRcpp:Testing, Monitoring, and Dating Structural Changes: C++ Version

A fast implementation with additional experimental features for testing, monitoring and dating structural changes in (linear) regression models. 'strucchangeRcpp' features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g. cumulative/moving sum, recursive/moving estimates) and F statistics, respectively. These methods are described in Zeileis et al. (2002) <doi:10.18637/jss.v007.i02>. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals, and their magnitude as well as the model fit can be evaluated using a variety of statistical measures.

Maintained by Dainius Masiliunas. Last updated 5 months ago.

openblas cpp

1.5 match 5 stars 5.18 score 4 scripts 2 dependents

vochr

TapeR:Flexible Tree Taper Curves Based on Semiparametric Mixed Models

Implementation of functions for fitting taper curves (a semiparametric linear mixed effects taper model) to diameter measurements along stems. Further functions are provided to estimate the uncertainty around the predicted curves, to calculate timber volume (also by sections) and marginal (e.g., upper) diameters. For cases where tree heights are not measured, methods for estimating additional variance in volume predictions resulting from uncertainties in tree height models (tariffs) are provided. The example data include the taper curve parameters for Norway spruce used in the 3rd German NFI fitted to 380 trees and a subset of section-wise diameter measurements of these trees. The functions implemented here are detailed in Kublin, E., Breidenbach, J., Kaendler, G. (2013) <doi:10.1007/s10342-013-0715-0>.

Maintained by Christian Vonderach. Last updated 1 year ago.

1.7 match 4.38 score 16 scripts 1 dependent

vochr

TapeS:Tree Taper Curves and Sorting Based on 'TapeR'

Provides new Germany-wide 'TapeR' models and functions for their evaluation. Included are the most common tree species in Germany (Norway spruce, Scots pine, European larch, Douglas fir, Silver fir as well as European beech, Common/Sessile oak and Red oak). Many other species are mapped to them so that 36 tree species / groups can be processed. Single trees are defined by species code, one or multiple diameters at arbitrary measuring heights, and tree height. The functions then provide information on diameters along the stem, bark thickness, height of diameters, volume of the total or parts of the trunk and total and component above-ground biomass. It is also possible to calculate assortments from the taper curves. Uncertainty information is provided for diameter, volume and component biomass estimation.

Maintained by Christian Vonderach. Last updated 1 month ago.

openblas cpp

1.7 match 3.90 score 1 script

vochr

rBDAT:Implementation of BDAT Tree Taper Fortran Functions

Implementing the BDAT tree taper Fortran routines, which were developed for the German National Forest Inventory (NFI), to calculate diameters, volume, assortments, double bark thickness and biomass for different tree species based on tree characteristics and sorting information. See Kublin (2003) <doi:10.1046/j.1439-0337.2003.00183.x> for details.

Maintained by Christian Vonderach. Last updated 6 months ago.

fortran

1.6 match 4.00 score 8 scripts

happma

pseudorank:Pseudo-Ranks

Efficient calculation of pseudo-ranks and (pseudo)-rank based test statistics. In case of equal sample sizes, pseudo-ranks and mid-ranks are equal. When used for inference, mid-ranks may lead to paradoxical results. Pseudo-ranks are in general not affected by such a problem. See Happ et al. (2020, <doi:10.18637/jss.v095.c01>) for details.

Maintained by Martin Happ. Last updated 27 days ago.

cpp nonparametric nonparametric-statistics pseudo-rank pseudo-ranks rank rank-tests trend-test

1.6 match 3 stars 3.71 score 17 scripts

happma

WMWssp:Wilcoxon-Mann-Whitney Sample Size Planning

Calculates the minimal sample size for the Wilcoxon-Mann-Whitney test that is needed for a given power and two-sided type I error rate. The method works for metric data with and without ties, count data, ordered categorical data, and even dichotomous data. However, data for the reference group are needed to generate synthetic data for the treatment group based on a relevant effect. See Happ et al. (2019, <doi:10.1002/sim.7983>) for details.

Maintained by Martin Happ. Last updated 27 days ago.

nonparametric-statistic optimal-design sample-size-calculation wilcoxon-mann-whitney-test

1.6 match 2 stars 3.60 score 8 scripts

cran

nparLD:Nonparametric Analysis of Longitudinal Data in Factorial Experiments

Performs nonparametric analysis of longitudinal data in factorial experiments. Longitudinal data are those which are collected from the same subjects over time, and they frequently arise in biological sciences. Nonparametric methods do not require distributional assumptions, and are applicable to a variety of data types (continuous, discrete, purely ordinal, and dichotomous). Such methods are also robust with respect to outliers and for small sample sizes.

Maintained by Frank Konietschke. Last updated 3 years ago.

1.6 match 4 stars 3.31 score 51 scripts

winvector

rquery:Relational Query Generator for Data Manipulation at Scale

A piped query generator based on Edgar F. Codd's relational algebra, and on production experience using 'SQL' and 'dplyr' at big data scale. The design represents an attempt to make 'SQL' more teachable by denoting composition by a sequential pipeline notation instead of nested queries or functions. The implementation delivers reliable high performance data processing on large data systems such as 'Spark', databases, and 'data.table'. Package features include: data processing trees or pipelines as observable objects (able to report both columns produced and columns used), optimized 'SQL' generation as an explicit user visible table modeling step, plus explicit query reasoning and checking.

Maintained by John Mount. Last updated 2 years ago.

0.5 match 110 stars 9.53 score 126 scripts 3 dependents

cran

rankFD:Rank-Based Tests for General Factorial Designs

The rankFD() function calculates the Wald-type statistic (WTS) and the ANOVA-type statistic (ATS) for nonparametric factorial designs, e.g., for count, ordinal or score data in a crossed design with an arbitrary number of factors. Brunner, E., Bathke, A. and Konietschke, F. (2018) <doi:10.1007/978-3-030-02914-2>.

Maintained by Frank Konietschke. Last updated 3 years ago.

1.7 match 1.30 score