Showing 35 of 35 results
data-cleaning
validate:Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 30 days ago.
419 stars 12.39 score 448 scripts 8 dependents
r-cas
Ryacas0:Legacy 'Ryacas' (Interface to 'Yacas' Computer Algebra System)
A legacy version of 'Ryacas', an interface to the 'yacas' computer algebra system (<http://www.yacas.org/>).
Maintained by Mikkel Meyer Andersen. Last updated 2 years ago.
2 stars 7.11 score 36 scripts 6 dependents
moore-institute-4-plastic-pollution-res
One4All:Validate, Share, and Download Data
Designed to enhance data validation and management processes by employing a set of functions that read a set of rules from a 'CSV' or 'Excel' file and apply them to a dataset. Funded by the National Renewable Energy Laboratory and Possibility Lab, maintained by the Moore Institute for Plastic Pollution Research.
Maintained by Hannah Sherrod. Last updated 9 months ago.
3 stars 6.33 score 15 scripts
graphdr
formatdown:Formatting Numbers in 'rmarkdown' Documents
Provides a small set of tools for formatting numbers in R-markdown documents. Convert a numerical vector to character strings in power-of-ten form, decimal form, or measurement-units form; all are math-delimited for rendering as inline equations. Can also convert text into math-delimited text to match the font face and size of math-delimited numbers. Useful for rendering single numbers in inline R code chunks and for rendering columns in tables.
Maintained by Richard Layton. Last updated 10 months ago.
8 stars 6.29 score 27 scripts
data-cleaning
dcmodify:Modify Data Using Externally Defined Modification Rules
Data cleaning scripts typically contain a lot of 'if this change that' type of statements. Such statements are typically condensed expert knowledge. With this package, such 'data modifying rules' are taken out of the code and instead become parameters to the workflow. This allows one to maintain, document, and reason about data modification rules as separate entities.
Maintained by Mark van der Loo. Last updated 10 months ago.
10 stars 6.24 score 58 scripts
ropensci
dwctaxon:Edit and Validate Darwin Core Taxon Data
Edit and validate taxonomic data in compliance with Darwin Core standards (Darwin Core 'Taxon' class <https://dwc.tdwg.org/terms/#taxon>).
Maintained by Joel H. Nitta. Last updated 9 months ago.
6 stars 6.13 score 28 scripts
data-cleaning
errorlocate:Locate Errors with Validation Rules
Errors in data can be located and removed using validation rules from package 'validate'. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, chapter 7.
Maintained by Edwin de Jonge. Last updated 10 months ago.
data-cleaning errors invalidation
22 stars 6.11 score 59 scripts
malaga-fca-group
fcaR:Formal Concept Analysis
Provides tools to perform fuzzy formal concept analysis, presented in Wille (1982) <doi:10.1007/978-3-642-01815-2_23> and in Ganter and Obiedkov (2016) <doi:10.1007/978-3-662-49291-8>. It provides functions to load and save a formal context, extract its concept lattice and implications. In addition, one can use the implications to compute semantic closures of fuzzy sets and, thus, build recommendation systems.
Maintained by Domingo Lopez Rodriguez. Last updated 2 years ago.
6 stars 6.02 score 70 scripts
bioc
REMP:Repetitive Element Methylation Prediction
Machine learning-based tools to predict DNA methylation of locus-specific repetitive elements (RE) by learning surrounding genetic and epigenetic information. These tools provide genomewide and single-base resolution of DNA methylation prediction on RE that are difficult to measure using array-based or sequencing-based platforms, which enables epigenome-wide association study (EWAS) and differentially methylated region (DMR) analysis on RE.
Maintained by Yinan Zheng. Last updated 5 months ago.
dnamethylation microarray methylationarray sequencing genomewideassociation epigenetics preprocessing multichannel twochannel differentialmethylation qualitycontrol dataimport
2 stars 5.94 score 18 scripts
cost-fp1304-profound
ProfoundData:Downloading and Exploring Data from the PROFOUND Database
Provides an R interface for the PROFOUND database <doi:10.5880/PIK.2019.008>. The PROFOUND database contains a wide range of data to evaluate vegetation models and simulate climate impacts at the forest stand scale. It includes 9 forest sites across Europe, and provides for them a site description as well as soil, climate, CO2, Nitrogen deposition, tree-level, forest stand-level and remote sensing data. Moreover, for a subset of 5 sites, also time series of carbon fluxes, energy balances and soil water are available.
Maintained by Florian Hartig. Last updated 5 years ago.
9 stars 5.58 score 14 scripts
byzheng
rtiddlywiki:R Interface for 'TiddlyWiki'
'TiddlyWiki' is a unique non-linear notebook for capturing, organising and sharing complex information. 'rtiddlywiki' is an R interface to 'TiddlyWiki' <https://tiddlywiki.com> that creates a new tiddler from an R Markdown file and then puts it into a local 'TiddlyWiki' node.js server if one is available.
Maintained by Bangyou Zheng. Last updated 9 days ago.
3 stars 5.45 score 7 scripts
byzheng
weaana:Analysis the Weather Data
Functions to analyse weather data for agricultural purposes, including reading weather records in multiple formats and calculating extreme climate indices.
Maintained by Bangyou Zheng. Last updated 2 months ago.
3 stars 5.32 score 23 scripts 1 dependent
bioc
seahtrue:Seahtrue revives XF data for structured data analysis
Seahtrue organizes oxygen consumption and extracellular acidification analysis data from experiments performed on an XF analyzer into structured nested tibbles. This allows for detailed processing of raw data and advanced data visualization and statistics. Seahtrue introduces an open and reproducible way to analyze these XF experiments. It uses file paths to .xlsx files. These .xlsx files are supplied by the user and are generated by the user in the Wave software from Agilent from the assay result files (.asyr). The .xlsx file contains different sheets of important data for the experiment: (1) Assay Information, details about how the experiment was set up; (2) Rate Data, information about the OCR and ECAR rates; (3) Raw Data, the original raw data collected during the experiment; (4) Calibration Data, data related to calibrating the instrument. Seahtrue focuses on getting the specific data needed for analysis. Once this data is extracted, it is prepared for calculations through preprocessing. To make sure everything is accurate, both the initial data and the preprocessed data go through thorough checks.
Maintained by Vincent de Boer. Last updated 5 months ago.
cellbasedassays functionalprediction datarepresentation dataimport cellbiology cheminformatics metabolomics microtitreplateassay visualization qualitycontrol batcheffect experimentaldesign preprocessing go
5.00 score 2 scripts
arminstroebel
atable:Create Tables for Reporting Clinical Trials
Create Tables for Reporting Clinical Trials. Calculates descriptive statistics and hypothesis tests, arranges the results in a table ready for reporting with LaTeX, HTML or Word.
Maintained by Armin Ströbel. Last updated 4 years ago.
9 stars 4.74 score 41 scripts
data-cleaning
validatetools:Checking and Simplifying Validation Rule Sets
Rule sets with validation rules may contain redundancies or contradictions. Functions for finding redundancies and problematic rules are provided, given a set of rules formulated with 'validate'.
Maintained by Edwin de Jonge. Last updated 10 months ago.
15 stars 4.47 score 39 scripts
fernandalschumacher
skewlmm:Scale Mixture of Skew-Normal Linear Mixed Models
Fits scale mixture of skew-normal linear mixed models using either an expectation–maximization (EM) type algorithm or its accelerated version (Damped Anderson Acceleration with Epsilon Monotonicity, DAAREM), including some options for modeling the within-subject dependence. Details can be found in Schumacher, Lachos and Matos (2021) <doi:10.1002/sim.8870>.
Maintained by Fernanda L. Schumacher. Last updated 2 months ago.
6 stars 4.43 score 10 scripts
data-cleaning
validatesuggest:Generate Suggestions for Validation Rules
Generate suggestions for validation rules from a reference data set, which can be used as a starting point for domain specific rules to be checked with package 'validate'.
Maintained by Edwin de Jonge. Last updated 1 year ago.
5 stars 4.40 score 5 scripts
fishr-core-team
RFishBC:Back-Calculation of Fish Length
Helps fisheries scientists collect measurements from calcified structures and back-calculate estimated lengths at previous ages using standard procedures and models. This is intended to replace much of the functionality provided by the now out-dated 'fishBC' software (<https://fisheries.org/bookstore/all-titles/software/70317/>).
Maintained by Derek H. Ogle. Last updated 1 year ago.
fish fisheries fisheries-management fisheries-stock-assessment population-dynamics stock-assessment
13 stars 4.26 score 28 scripts
data-cleaning
deductive:Data Correction and Imputation Using Deductive Methods
Attempt to repair inconsistencies and missing values in data records by using information from valid values and validation rules restricting the data.
Maintained by Mark van der Loo. Last updated 2 months ago.
14 stars 4.26 score 13 scripts
pr2database
pr2database:PR2 database with shiny web interface
PR2 database. See https://pr2-database.org
Maintained by Daniel Vaulot. Last updated 1 hour ago.
18s-rrna database eukaryotes metabarcoding rrna taxonomy
80 stars 3.90 score 7 scripts
nenuial
ggeo:Themes and Helpers for ggplot2
This package provides helper functions for ggplot graphs and maps.
Maintained by Pascal Burkhard. Last updated 1 month ago.
1 star 3.52 score 2 dependents
markvanderloo
rspa:Adapt Numerical Records to Fit (in)Equality Restrictions
Minimally adjust the values of numerical records in a data.frame, such that each record satisfies a predefined set of equality and/or inequality constraints. The constraints can be defined using the 'validate' package. The core algorithms have recently been moved to the 'lintools' package; refer to 'lintools' for a more basic interface and access to a version of the algorithm that works with sparse matrices.
Maintained by Mark van der Loo. Last updated 10 months ago.
3 stars 3.45 score 19 scripts
leef-uzh
bemovi.LEEF:BEMOVI, software for extracting BEhaviour and MOrphology from VIdeos. This version is adapted for LEEF-UZH
An R and ImageJ based work flow to automatically measure behaviour and morphology from videos. Moving individuals are identified by background subtraction, morphology extracted, and trajectories assembled through time from coordinate data. Abundance, morphology and behaviour can be summarized based on trajectory data.
Maintained by Rainer M Krug. Last updated 1 year ago.
3.32 score 2 dependents
giocomai
zoteror:Access the Zotero API in R
zoteror provides tools to access the Zotero API.
Maintained by Giorgio Comai. Last updated 6 days ago.
37 stars 3.27 score 5 scripts
nenuial
geographer:Geography Vizualisations
Provides functions and objects to create visualisations for my geography lessons.
Maintained by Pascal Burkhard. Last updated 1 month ago.
2 stars 3.08 score
leef-uzh
LEEF:Data Package Containing Only Data and Data Information
Setup package for the LEEF pipeline which loads / installs all necessary packages and functions to run the pipeline.
Maintained by Rainer M. Krug. Last updated 3 years ago.
data-analysis data-processing leef
2.95 score
byzheng
expDB:Database for Experiment Dataset
A 'SQLite' database is designed to store all information of experiment-based data including metadata, experiment design, managements, phenotypic values and climate records. The dataset can be imported from an 'Excel' file.
Maintained by Bangyou Zheng. Last updated 1 year ago.
2.70 score 4 scripts
germanrecordlinkage
PPRL:Privacy Preserving Record Linkage
A toolbox for deterministic, probabilistic and privacy-preserving record linkage techniques. Combines the functionality of the 'Merge ToolBox' (<https://www.record-linkage.de>) with current privacy-preserving techniques.
Maintained by Dorothea Rukasz. Last updated 2 years ago.
2 stars 2.64 score 22 scripts
nenuial
geovizr:Support for Knitr (Quarto/Rmd)
Provide support functions for Quarto and Rmd documents.
Maintained by Pascal Burkhard. Last updated 1 month ago.
2.60 score 3 scripts
leef-uzh
LEEF.measurement.bemovi:Prepares Movies for Analysis with Bemovi and Extracts Data
Module for the LEEF pipeline to process bemovi data.
Maintained by Rainer M. Krug. Last updated 3 years ago.
1.48 score 1 dependent
cran
ARpLMEC:Censored Mixed-Effects Models with Different Correlation Structures
Left-, right- or interval-censored mixed-effects linear model with autoregressive errors of order p or DEC correlation structure, fitted using an EM-type algorithm. The error distribution can be normal or Student's t. It provides the parameter estimates, the standard errors and prediction of future observations (available only for the normal case). Olivari et al. (2021) <doi:10.1080/10543406.2020.1852246>.
Maintained by Rommy C. Olivari. Last updated 3 years ago.
1.00 score
h-rabiee
RJalaliDate:Handling Jalali Date (Persian / Solar Hijri)
The Jalali calendar, or Solar Hijri, is the calendar of Iran and Afghanistan (<https://en.wikipedia.org/wiki/Solar_Hijri_calendar>). This package is designed for working with Jalali dates. For this purpose, it defines a JalaliDate class that is similar to the Date class.
Maintained by Hosein Rabiee. Last updated 7 months ago.
1.00 score 1 script