R-universe search: spectral information

tidymodels

broom:Convert Statistical Objects into Tidy Tibbles

Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once. Broom provides three verbs that each provide different types of information about a model. tidy() summarizes information about model components such as coefficients of a regression. glance() reports information about an entire model, such as goodness of fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.

Maintained by Simon Couch. Last updated 4 months ago.

modeling tidy-data

65.8 match 1.5k stars 21.56 score 37k scripts 1.4k dependents

elbersb

tidylog:Logging for 'dplyr' and 'tidyr' Functions

Provides feedback about 'dplyr' and 'tidyr' operations.

Maintained by Benjamin Elbers. Last updated 9 months ago.

dplyr tidyr tidyverse wrapper-functions

121.8 match 593 stars 10.23 score 1.7k scripts

aphalo

photobiology:Photobiological Calculations

Definitions of classes, methods, operators and functions for use in photobiology and radiation meteorology and climatology. Calculation of effective (weighted) and not-weighted irradiances/doses, fluence rates, transmittance, reflectance, absorptance, absorbance and diverse ratios and other derived quantities from spectral data. Local maxima and minima: peaks, valleys and spikes. Conversion between energy-and photon-based units. Wavelength interpolation. Astronomical calculations related solar angles and day length. Colours and vision. This package is part of the 'r4photobiology' suite, Aphalo, P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 5 days ago.

light photobiology quantification r4photobiology-suite radiation spectra sun-position

95.3 match 4 stars 9.35 score 604 scripts 12 dependents

choonghyunryu

dlookr:Tools for Data Diagnosis, Exploration, Transformation

A collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information and visualization of the descriptive statistics of univariate variables, normality tests and outliers, correlation of two variables, and the relationship between the target variable and predictor. Data transformation supports binning for categorizing continuous variables, imputes missing values and outliers, and resolves skewness. And it creates automated reports that support these three tasks.

Maintained by Choonghyun Ryu. Last updated 9 months ago.

77.6 match 212 stars 11.05 score 748 scripts 2 dependents

proteomicslab57357

UniprotR:Retrieving Information of Proteins from Uniprot

Connect to Uniprot <https://www.uniprot.org/> to retrieve information about proteins using their accession number such information could be name or taxonomy information, For detailed information kindly read the publication <https://www.sciencedirect.com/science/article/pii/S1874391919303859>.

Maintained by Mohamed Soudy. Last updated 3 years ago.

103.6 match 61 stars 7.65 score 89 scripts 1 dependents

bioc

SemDist:Information Accretion-based Function Predictor Evaluation

This package implements methods to calculate information accretion for a given version of the gene ontology and uses this data to calculate remaining uncertainty, misinformation, and semantic similarity for given sets of predicted annotations and true annotations from a protein function predictor.

Maintained by Ian Gonzalez. Last updated 5 months ago.

classification annotation go software

118.8 match 1 stars 4.30 score 3 scripts

klarsen1

Information:Data Exploration with Information Theory (Weight-of-Evidence and Information Value)

Performs exploratory data analysis and variable screening for binary classification models using weight-of-evidence (WOE) and information value (IV). In order to make the package as efficient as possible, aggregations are done in data.table and creation of WOE vectors can be distributed across multiple cores. The package also supports exploration for uplift models (NWOE and NIV).

Maintained by Larsen Kim. Last updated 9 years ago.

66.9 match 44 stars 7.45 score 118 scripts

ozancanozdemir

turkeyelections:The Most Comprehensive R Package for Turkish Election Results

Includes the results of general, local, and presidential elections held in Turkey between 1995 and 2023, broken down by provinces and overall national results. It facilitates easy processing of this data and the creation of visual representations based on these election results.

Maintained by Ozancan Ozdemir. Last updated 9 months ago.

116.1 match 15 stars 4.18 score 1 scripts

aphalo

photobiologyLEDs:Spectral Data for Light-Emitting-Diodes

Spectral emission data for some frequently used light emitting diodes available as electronic components. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 4 months ago.

82.9 match 2 stars 5.13 score 34 scripts

aphalo

photobiologyLamps:Spectral Irradiance Data for Lamps

Spectral emission data for some frequently used lamps including bulbs and flashlights based on led emitting diodes (LEDs) but excluding LEDs available as electronic components. Original spectral irradiance data for incandescent-, LED- and discharge lamps are included. They are complemented by data on the effect of temperature on the emission by fluorescent tubes. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 4 months ago.

89.6 match 4.58 score 38 scripts

martin3141

spant:MR Spectroscopy Analysis Tools

Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.

Maintained by Martin Wilson. Last updated 1 months ago.

brain mri mrs mrshub spectroscopy fortran

46.7 match 25 stars 8.52 score 81 scripts

spsanderson

TidyDensity:Functions for Tidy Analysis and Generation of Random Data

To make it easy to generate random numbers based upon the underlying stats distribution functions. All data is returned in a tidy and structured format making working with the data simple and straight forward. Given that the data is returned in a tidy 'tibble' it lends itself to working with the rest of the 'tidyverse'.

Maintained by Steven Sanderson. Last updated 5 months ago.

bootstrap density distributions ggplot2 probability r-language simulation statistics tibble tidy

51.0 match 34 stars 7.78 score 66 scripts 1 dependents

bioc

Biobase:Biobase: Base functions for Bioconductor

Functions that are needed by many other packages or which replace R functions.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

23.9 match 9 stars 16.45 score 6.6k scripts 1.8k dependents

kjhealy

gssrdoc:Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains the a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see \url{http://gss.norc.org}.

Maintained by Kieran Healy. Last updated 11 months ago.

158.6 match 2.28 score 38 scripts

neotomadb

neotoma2:Working with the Neotoma Paleoecology Database

Access and manipulation of data using the Neotoma Paleoecology Database. <https://api.neotomadb.org/api-docs/>.

Maintained by Dominguez Vidana Socorro. Last updated 8 months ago.

earthcube neotoma nsf paleoecology

62.0 match 8 stars 5.35 score 56 scripts

statistikat

VIM:Visualization and Imputation of Missing Values

New tools for the visualization of missing and/or imputed values are introduced, which can be used for exploring the data and the structure of the missing and/or imputed values. Depending on this structure of the missing values, the corresponding methods may help to identify the mechanism generating the missing values and allows to explore the data including missing values. In addition, the quality of imputation can be visually explored using various univariate, bivariate, multiple and multivariate plot methods. A graphical user interface available in the separate package VIMGUI allows an easy handling of the implemented plot methods.

Maintained by Matthias Templ. Last updated 7 months ago.

hotdeck imputation-methods model-predictions visualization cpp

21.3 match 85 stars 14.44 score 2.6k scripts 19 dependents

charlie86

spotifyr:R Wrapper for the 'Spotify' Web API

An R wrapper for pulling data from the 'Spotify' Web API <https://developer.spotify.com/documentation/web-api/> in bulk, or post items on a 'Spotify' user's playlist.

Maintained by Daniel Antal. Last updated 5 months ago.

music-information-retrieval spotify

35.3 match 374 stars 8.54 score 936 scripts

gshs-ornl

wbstats:Programmatic Access to Data and Statistics from the World Bank API

Search and download data from the World Bank Data API.

Maintained by Jesse Piburn. Last updated 4 years ago.

open-data world-bank world-bank-api worldbank

26.0 match 126 stars 10.06 score 1.1k scripts 3 dependents

cran

wavethresh:Wavelets Statistics and Transforms

Performs 1, 2 and 3D real and complex-valued wavelet transforms, nondecimated transforms, wavelet packet transforms, nondecimated wavelet packet transforms, multiple wavelet transforms, complex-valued wavelet transforms, wavelet shrinkage for various kinds of data, locally stationary wavelet time series, nonstationary multiscale transfer function modeling, density estimation.

Maintained by Guy Nason. Last updated 7 months ago.

41.7 match 5.89 score 41 dependents

ms609

TreeDist:Calculate and Map Distances Between Phylogenetic Trees

Implements measures of tree similarity, including information-based generalized Robinson-Foulds distances (Phylogenetic Information Distance, Clustering Information Distance, Matching Split Information Distance; Smith 2020) <doi:10.1093/bioinformatics/btaa614>; Jaccard-Robinson-Foulds distances (Bocker et al. 2013) <doi:10.1007/978-3-642-40453-5_13>, including the Nye et al. (2006) metric <doi:10.1093/bioinformatics/bti720>; the Matching Split Distance (Bogdanowicz & Giaro 2012) <doi:10.1109/TCBB.2011.48>; Maximum Agreement Subtree distances; the Kendall-Colijn (2016) distance <doi:10.1093/molbev/msw124>, and the Nearest Neighbour Interchange (NNI) distance, approximated per Li et al. (1996) <doi:10.1007/3-540-61332-3_168>. Includes tools for visualizing mappings of tree space (Smith 2022) <doi:10.1093/sysbio/syab100>, for identifying islands of trees (Silva and Wilkinson 2021) <doi:10.1093/sysbio/syab015>, for calculating the median of sets of trees, and for computing the information content of trees and splits.

Maintained by Martin R. Smith. Last updated 1 months ago.

phylogenetics tree-distance phylogenetic-trees tree-distances trees cpp

23.4 match 32 stars 10.32 score 97 scripts 5 dependents

abarbour

psd:Adaptive, Sine-Multitaper Power Spectral Density and Cross Spectrum Estimation

Produces power spectral density estimates through iterative refinement of the optimal number of sine-tapers at each frequency. This optimization procedure is based on the method of Riedel and Sidorenko (1995), which minimizes the Mean Square Error (sum of variance and bias) at each frequency, but modified for computational stability. The same procedure can now be used to calculate the cross spectrum (multivariate analyses).

Maintained by Andrew J. Barbour. Last updated 2 years ago.

multitaper power-spectral-density power-spectrum psd spectral-density-estimates spectrum openblas cpp

33.6 match 9 stars 7.12 score 122 scripts 1 dependents

mbq

praznik:Tools for Information-Based Feature Selection and Scoring

A toolbox of fast, native and parallel implementations of various information-based importance criteria estimators and feature selection filters based on them, inspired by the overview by Brown, Pocock, Zhao and Lujan (2012) <https://www.jmlr.org/papers/v13/brown12a.html>. Contains, among other, minimum redundancy maximal relevancy ('mRMR') method by Peng, Long and Ding (2005) <doi:10.1109/TPAMI.2005.159>; joint mutual information ('JMI') method by Yang and Moody (1999) <https://papers.nips.cc/paper/1779-data-visualization-and-feature-selection-new-algorithms-for-nongaussian-data>; double input symmetrical relevance ('DISR') method by Meyer and Bontempi (2006) <doi:10.1007/11732242_9> as well as joint mutual information maximisation ('JMIM') method by Bennasar, Hicks and Setchi (2015) <doi:10.1016/j.eswa.2015.07.007>.

Maintained by Miron B. Kursa. Last updated 2 years ago.

openmp

47.3 match 5.05 score 34 scripts 6 dependents

bleutner

RStoolbox:Remote Sensing Data Analysis

Toolbox for remote sensing image processing and analysis such as calculating spectral indexes, principal component transformation, unsupervised and supervised classification or fractional cover analyses.

Maintained by Konstantin Mueller. Last updated 1 months ago.

ggplot2 land-cover-mapping remote-sensing spectral-unmixing supervised-classification unsupervised-classification openblas cpp

22.5 match 275 stars 10.10 score 1.1k scripts

bioc

RAIDS:Accurate Inference of Genetic Ancestry from Cancer Sequences

This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequences sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.

Maintained by Pascal Belleau. Last updated 5 months ago.

genetics software sequencing wholegenome principalcomponent geneticvariability dimensionreduction biocviews ancestry cancer-genomics exome-sequencing genomics inference r-language rna-seq rna-sequencing whole-genome-sequencing

36.2 match 5 stars 6.23 score 19 scripts

rstudio

rstudioapi:Safely Access the RStudio API

Access the RStudio API (if available) and provide informative error messages when it's not.

Maintained by Kevin Ushey. Last updated 4 months ago.

11.9 match 172 stars 18.81 score 3.6k scripts 2.1k dependents

neonscience

neonUtilities:Utilities for Working with NEON Data

NEON data packages can be accessed through the NEON Data Portal <https://www.neonscience.org> or through the NEON Data API (see <https://data.neonscience.org/data-api> for documentation). Data delivered from the Data Portal are provided as monthly zip files packaged within a parent zip file, while individual files can be accessed from the API. This package provides tools that aid in discovering, downloading, and reformatting data prior to use in analyses. This includes downloading data via the API, merging data tables by type, and converting formats. For more information, see the readme file at <https://github.com/NEONScience/NEON-utilities>.

Maintained by Claire Lunch. Last updated 1 months ago.

21.0 match 57 stars 10.66 score 944 scripts 15 dependents

ips-lmu

emuR:Main Package of the EMU Speech Database Management System

Provide the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.

Maintained by Markus Jochim. Last updated 1 years ago.

32.0 match 24 stars 6.89 score 135 scripts 1 dependents

drostlab

philentropy:Similarity and Distance Quantification Between Probability Functions

Computes 46 optimized distance and similarity measures for comparing probability functions (Drost (2018) <doi:10.21105/joss.00765>). These comparisons between probability functions have their foundations in a broad range of scientific disciplines from mathematics to ecology. The aim of this package is to provide a core framework for clustering, classification, statistical inference, goodness-of-fit, non-parametric statistics, information theory, and machine learning tasks that are based on comparing univariate or multivariate probability functions.

Maintained by Hajk-Georg Drost. Last updated 3 months ago.

distance-measures distance-quantification information-theory jensen-shannon-divergence parametric-distributions similarity-measures statistics cpp

17.6 match 137 stars 12.44 score 484 scripts 24 dependents

philchalmers

mirt:Multidimensional Item Response Theory

Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.

Maintained by Phil Chalmers. Last updated 13 days ago.

irt mirt openblas cpp openmp

14.4 match 210 stars 14.98 score 2.5k scripts 40 dependents

ropensci

paleobioDB:Download and Process Data from the Paleobiology Database

Includes functions to wrap most endpoints of the 'PaleobioDB' API and functions to visualize and process the fossil data. The API documentation for the Paleobiology Database can be found at <https://paleobiodb.org/data1.2/>.

Maintained by Adrián Castro Insua. Last updated 1 years ago.

34.7 match 42 stars 6.19 score 74 scripts

aphalo

ggspectra:Extensions to 'ggplot2' for Radiation Spectra

Additional annotations, stats, geoms and scales for plotting "light" spectra with 'ggplot2', together with specializations of ggplot() and autoplot() methods for spectral data and waveband definitions stored in objects of classes defined in package 'photobiology'. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 4 days ago.

dataviz ggplot2-autoplot ggplot2-enhancementes ggplot2-geoms ggplot2-scales ggplot2-stats light r4photobiology-suite radiation spectra

25.6 match 5 stars 8.09 score 390 scripts 1 dependents

r-lib

sessioninfo:R Session Information

Query and print information about the current R session. It is similar to 'utils::sessionInfo()', but includes more information about packages, and where they were installed from.

Maintained by Gábor Csárdi. Last updated 1 months ago.

13.7 match 76 stars 14.57 score 4.3k scripts 237 dependents

ropensci

tidyhydat:Extract and Tidy Canadian 'Hydrometric' Data

Provides functions to access historical and real-time national 'hydrometric' data from Water Survey of Canada data sources (<https://dd.weather.gc.ca/hydrometric/csv/> and <https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>) and then applies tidy data principles.

Maintained by Sam Albers. Last updated 7 days ago.

citz government-data hydrology hydrometrics tidy-data water-resources

20.8 match 71 stars 9.59 score 202 scripts 3 dependents

ropensci

neotoma:Access to the Neotoma Paleoecological Database Through R

NOTE: This package is deprecated. Please use the neotoma2 package described at https://github.com/NeotomaDB/neotoma2. Access paleoecological datasets from the Neotoma Paleoecological Database using the published API (<http://wnapi.neotomadb.org/>), only containing datasets uploaded prior to June 2020. The functions in this package access various pre-built API functions and attempt to return the results from Neotoma in a usable format for researchers and the public.

Maintained by Simon J. Goring. Last updated 2 years ago.

neotoma neotoma-apis neotoma-database nsf paleoecology

38.3 match 30 stars 5.04 score 145 scripts

bioc

signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis

This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.

Maintained by Brendan Gongol. Last updated 5 months ago.

software geneexpression go kegg networkenrichment sequencing coverage differentialexpression cpp

26.1 match 17 stars 7.18 score 74 scripts 1 dependents

openintrostat

openintro:Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs

Supplemental functions and data for 'OpenIntro' resources, which includes open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of Windows operating system.

Maintained by Mine Çetinkaya-Rundel. Last updated 3 months ago.

data openintro

16.3 match 240 stars 11.39 score 6.0k scripts

bioc

TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data

The aim of TCGAbiolinks is : i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) to easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.

Maintained by Tiago Chedraoui Silva. Last updated 29 days ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network sequencing survival software bioc bioconductor gdc integrative-analysis tcga tcga-data tcgabiolinks

12.8 match 305 stars 14.45 score 1.6k scripts 6 dependents

stan-dev

loo:Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models

Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.

Maintained by Jonah Gabry. Last updated 5 days ago.

bayes bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics cross-validation information-criterion model-comparison stan

10.6 match 152 stars 17.30 score 2.6k scripts 297 dependents

florianstijven

Surrogate:Evaluation of Surrogate Endpoints in Clinical Trials

In a clinical trial, it frequently occurs that the most credible outcome to evaluate the effectiveness of a new therapy (the true endpoint) is difficult to measure. In such a situation, it can be an effective strategy to replace the true endpoint by a (bio)marker that is easier to measure and that allows for a prediction of the treatment effect on the true endpoint (a surrogate endpoint). The package 'Surrogate' allows for an evaluation of the appropriateness of a candidate surrogate endpoint based on the meta-analytic, information-theoretic, and causal-inference frameworks. Part of this software has been developed using funding provided from the European Union's Seventh Framework Programme for research, technological development and demonstration (Grant Agreement no 602552), the Special Research Fund (BOF) of Hasselt University (BOF-number: BOF2OCPO3), GlaxoSmithKline Biologicals, Baekeland Mandaat (HBC.2022.0145), and Johnson & Johnson Innovative Medicine.

Maintained by Wim Van Der Elst. Last updated 20 days ago.

28.5 match 1 stars 6.42 score 133 scripts

easystats

insight:Easy Access to Model Information for Various Model Objects

A tool to provide an easy, intuitive and consistent access to information contained in various R models, like model formulas, model terms, information about random effects, data that was used to fit the model or data from response variables. 'insight' mainly revolves around two types of functions: Functions that find (the names of) information, starting with 'find_', and functions that get the underlying data, starting with 'get_'. The package has a consistent syntax and works with many different model objects, where otherwise functions to access these information are missing.

Maintained by Daniel Lüdecke. Last updated 1 days ago.

easystats hacktoberfest insight models names predictors random

10.4 match 412 stars 17.24 score 568 scripts 211 dependents

bioc

Cardinal:A mass spectrometry imaging toolbox for statistical analysis

Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.

Maintained by Kylie Ariel Bemis. Last updated 3 months ago.

software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression

17.4 match 47 stars 10.34 score 200 scripts

wincowgerdev

OpenSpecy:Analyze, Process, Identify, and Share Raman and (FT)IR Spectra

Raman and (FT)IR spectral analysis tool for plastic particles and other environmental samples (Cowger et al. 2021, <doi:10.1021/acs.analchem.1c00123>). With read_any(), Open Specy provides a single function for reading individual, batch, or map spectral data files like .asp, .csv, .jdx, .spc, .spa, .0, and .zip. process_spec() simplifies processing spectra, including smoothing, baseline correction, range restriction and flattening, intensity conversions, wavenumber alignment, and min-max normalization. Spectra can be identified in batch using an onboard reference library (Cowger et al. 2020, <doi:10.1177/0003702820929064>) using match_spec(). A Shiny app is available via run_app() or online at <https://openanalysis.org/openspecy/>.

Maintained by Win Cowger. Last updated 19 days ago.

23.7 match 29 stars 7.58 score 22 scripts

aphalo

photobiologyFilters:Spectral Transmittance and Spectral Reflectance Data

Spectral 'transmittance' data for frequently used filters and similar materials. Plastic sheets and films; photography filters; theatrical gels; machine-vision filters; various types of window glass; optical glass and some laboratory plastics and glassware. Spectral reflectance data for frequently encountered materials. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 8 days ago.

34.9 match 5.08 score 40 scripts

tobiaskley

quantspec:Quantile-Based Spectral Analysis of Time Series

Methods to determine, smooth and plot quantile periodograms for univariate and multivariate time series.

Maintained by Tobias Kley. Last updated 9 years ago.

cpp

29.9 match 10 stars 5.84 score 46 scripts 1 dependents

aphalo

photobiologyPlants:Plant Photobiology Related Functions and Data

Provides functions for quantifying visible (VIS) and ultraviolet (UV) radiation in relation to the photoreceptors Phytochromes, Cryptochromes, and UVR8 which are present in plants. It also includes data sets on the optical properties of plants. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 2 months ago.

31.6 match 5.52 score 55 scripts

bioc

MoonlightR:Identify oncogenes and tumor suppressor genes from omics data

Motivation: The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). Results: We present an R/bioconductor package called MoonlightR which returns a list of candidate driver genes for specific cancer types on the basis of TCGA expression data. The method first infers gene regulatory networks and then carries out a functional enrichment analysis (FEA) (implementing an upstream regulator analysis, URA) to score the importance of well-known biological processes with respect to the studied cancer type. Eventually, by means of random forests, MoonlightR predicts two specific roles for the candidate driver genes: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, MoonlightR can be used to discover OCGs and TSGs in the same cancer type. This may help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV) in breast cancer. In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments.

Maintained by Matteo Tiberti. Last updated 5 months ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network survival genesetenrichment networkenrichment

26.4 match 17 stars 6.57 score

bemts-hhs

nemsqar:National Emergency Medical Service Quality Alliance Measure Calculations

Designed to automate the calculation of Emergency Medical Service (EMS) quality metrics, 'nemsqar' implements measures defined by the National EMS Quality Alliance (NEMSQA). By providing reliable, evidence-based quality assessments, the package supports EMS agencies, healthcare providers, and researchers in evaluating and improving patient outcomes. Users can find details on all approved NEMSQA measures at <https://www.nemsqa.org/measures>. Full technical specifications, including documentation and pseudocode used to develop 'nemsqar', are available on the NEMSQA website after creating a user profile at <https://www.nemsqa.org>.

Maintained by Nicolas Foss. Last updated 5 days ago.

ems nemsqa quality trauma

36.5 match 5 stars 4.70 score

l-ramirez-lopez

resemble:Memory-Based Learning in Spectral Chemometrics

Functions for dissimilarity analysis and memory-based learning (MBL, a.k.a local modeling) in complex spectral data sets. Most of these functions are based on the methods presented in Ramirez-Lopez et al. (2013) <doi:10.1016/j.geoderma.2012.12.014>.

Maintained by Leonardo Ramirez-Lopez. Last updated 2 years ago.

chemoinformatics chemometrics infrared-spectroscopy lazy-learning local-regression machine-learning memory-based-learning nir pedometrics soil-spectroscopy spectral-data spectral-library spectroscopy openblas cpp openmp

28.7 match 20 stars 5.91 score 27 scripts

kisungyou

Rdimtools:Dimension Reduction and Estimation Methods

We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.

Maintained by Kisung You. Last updated 2 years ago.

dimension-estimation dimension-reduction manifold-learning subspace-learning openblas cpp openmp

20.3 match 52 stars 8.37 score 186 scripts 8 dependents

bhklab

mRMRe:Parallelized Minimum Redundancy, Maximum Relevance (mRMR)

Computes mutual information matrices from continuous, categorical and survival variables, as well as feature selection with minimum redundancy, maximum relevance (mRMR) and a new ensemble mRMR technique. Published in De Jay et al. (2013) <doi:10.1093/bioinformatics/btt383>.

Maintained by Benjamin Haibe-Kains. Last updated 4 years ago.

cpp openmp

18.9 match 19 stars 8.95 score 105 scripts 2 dependents

cran

datarobot:'DataRobot' Predictive Modeling API

For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>.

Maintained by AJ Alon. Last updated 1 years ago.

48.3 match 2 stars 3.48 score

barcaroli

SamplingStrata:Optimal Stratification of Sampling Frames for Multipurpose Sampling Surveys

In the field of stratified sampling design, this package offers an approach for the determination of the best stratification of a sampling frame, the one that ensures the minimum sample cost under the condition to satisfy precision constraints in a multivariate and multidomain case. This approach is based on the use of the genetic algorithm: each solution (i.e. a particular partition in strata of the sampling frame) is considered as an individual in a population; the fitness of all individuals is evaluated applying the Bethel-Chromy algorithm to calculate the sampling size satisfying precision constraints on the target estimates. Functions in the package allows to: (a) analyse the obtained results of the optimisation step; (b) assign the new strata labels to the sampling frame; (c) select a sample from the new frame accordingly to the best allocation. Functions for the execution of the genetic algorithm are a modified version of the functions in the 'genalg' package.

Maintained by Giulio Barcaroli. Last updated 16 days ago.

cpp

21.9 match 15 stars 7.60 score 178 scripts

bioc

biovizBase:Basic graphic utilities for visualization of genomic data.

The biovizBase package is designed to provide a set of utilities, color schemes and conventions for genomic data. It serves as the base for various high-level packages for biological data visualization. This saves development effort and encourages consistency.

Maintained by Michael Lawrence. Last updated 5 months ago.

infrastructure visualization preprocessing

20.7 match 8.04 score 273 scripts 75 dependents

seil85

spectral:Common Methods of Spectral Data Analysis

On discrete data spectral analysis is performed by Fourier and Hilbert transforms as well as with model based analysis called Lomb-Scargle method. Fragmented and irregularly spaced data can be processed in almost all methods. Both, FFT as well as LOMB methods take multivariate data and return standardized PSD. For didactic reasons an analytical approach for deconvolution of noise spectra and sampling function is provided. A user friendly interface helps to interpret the results.

Maintained by Martin Seilmayer. Last updated 4 years ago.

58.5 match 2.81 score 36 scripts 1 dependents

nimble-dev

nimble:MCMC, Particle Filtering, and Programmable Hierarchical Modeling

A system for writing hierarchical statistical models largely compatible with 'BUGS' and 'JAGS', writing nimbleFunctions to operate models and do basic R-style math, and compiling both models and nimbleFunctions via custom-generated C++. 'NIMBLE' includes default methods for MCMC, Laplace Approximation, Monte Carlo Expectation Maximization, and some other tools. The nimbleFunction system makes it easy to do things like implement new MCMC samplers from R, customize the assignment of samplers to different parts of a model from R, and compile the new samplers automatically via C++ alongside the samplers 'NIMBLE' provides. 'NIMBLE' extends the 'BUGS'/'JAGS' language by making it extensible: New distributions and functions can be added, including as calls to external compiled code. Although most people think of MCMC as the main goal of the 'BUGS'/'JAGS' language for writing models, one can use 'NIMBLE' for writing arbitrary other kinds of model-generic algorithms as well. A full User Manual is available at <https://r-nimble.org>.

Maintained by Christopher Paciorek. Last updated 6 days ago.

bayesian-inference bayesian-methods hierarchical-models mcmc probabilistic-programming openblas cpp

12.6 match 169 stars 12.97 score 2.6k scripts 19 dependents

aphalo

ooacquire:Acquire Data from OO Spectrometers

Functions to acquire data directly from Ocean Optics spectrometers, and functions to read similar data from files. Functions to convert raw-counts into counts-per-second and physical quantities. Data are saved in objects of classes defined in package 'photobiology'. The instrument settings, instrument description, date-time of acquisition and optionally goecode are stored as attributes.

Maintained by Pedro J. Aphalo. Last updated 3 months ago.

data-acquisition data-import r4photobiology spectra cpp

31.7 match 1 stars 5.17 score 93 scripts

inbo

inbodb:Connect to and Retrieve Data from Databases on the INBO Server

A bundle of functions to connect to and retrieve data from databases on the INBO server, with dedicated functions to query some of these databases.

Maintained by Els Lommelen. Last updated 27 days ago.

database

26.3 match 6.16 score 114 scripts 1 dependents

ropensci

webchem:Chemical Information from the Web

Chemical information from around the web. This package interacts with a suite of web services for chemical information. Sources include: Alan Wood's Compendium of Pesticide Common Names, Chemical Identifier Resolver, ChEBI, Chemical Translation Service, ChemSpider, ETOX, Flavornet, NIST Chemistry WebBook, OPSIN, PubChem, SRS, Wikidata.

Maintained by Tamás Stirling. Last updated 3 months ago.

cas-number chemical-information chemspider identifier ropensci webscraping

15.1 match 165 stars 10.31 score 173 scripts 10 dependents

bioc

BiocFHIR:Illustration of FHIR ingestion and transformation using R

FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. Transformation inspired by a kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.

Maintained by Vincent Carey. Last updated 5 months ago.

infrastructure dataimport datarepresentation fhir

26.9 match 4 stars 5.78 score 15 scripts

ouhscbbmc

REDCapR:Interaction Between R and REDCap

Encapsulates functions to streamline calls from R to the REDCap API. REDCap (Research Electronic Data CAPture) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The Application Programming Interface (API) offers an avenue to access and modify data programmatically, improving the capacity for literate and reproducible programming.

Maintained by Will Beasley. Last updated 2 months ago.

redcap redcap-api

12.3 match 118 stars 12.36 score 438 scripts 6 dependents

safetygraphics

safetyCharts:Charts for Monitoring Clinical Trial Safety

Contains chart code for monitoring clinical trial safety. Charts can be used as standalone output, but are also designed for use with the 'safetyGraphics' package, which makes it easy to load data and customize the charts using an interactive web-based interface created with Shiny.

Maintained by Jeremy Wildfire. Last updated 6 months ago.

28.4 match 9 stars 5.36 score 21 scripts 1 dependents

stuart-lab

Signac:Analysis of Single-Cell Chromatin Data

A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.

Maintained by Tim Stuart. Last updated 7 months ago.

atac bioinformatics single-cell zlib cpp

12.5 match 349 stars 12.19 score 3.7k scripts 1 dependents

troyhernandez

tinyspotifyr:Tinyverse R Wrapper for the 'Spotify' Web API

An R wrapper for the 'Spotify' Web API <https://developer.spotify.com/web-api/>.

Maintained by Troy Hernandez. Last updated 1 years ago.

31.5 match 13 stars 4.81 score 5 scripts

jpquast

protti:Bottom-Up Proteomics and LiP-MS Quality Control and Data Analysis Tools

Useful functions and workflows for proteomics quality control and data analysis of both limited proteolysis-coupled mass spectrometry (LiP-MS) (Feng et. al. (2014) <doi:10.1038/nbt.2999>) and regular bottom-up proteomics experiments. Data generated with search tools such as 'Spectronaut', 'MaxQuant' and 'Proteome Discover' can be easily used due to flexibility of functions.

Maintained by Jan-Philipp Quast. Last updated 5 months ago.

data-analysis lip-ms mass-spectrometry omics protein proteomics systems-biology

17.5 match 61 stars 8.58 score 83 scripts

bioc

motifbreakR:A Package For Predicting The Disruptiveness Of Single Nucleotide Polymorphisms On Transcription Factor Binding Sites

We introduce motifbreakR, which allows the biologist to judge in the first place whether the sequence surrounding the polymorphism is a good match, and in the second place how much information is gained or lost in one allele of the polymorphism relative to another. MotifbreakR is both flexible and extensible over previous offerings; giving a choice of algorithms for interrogation of genomes with motifs from public sources that users can choose from; these are 1) a weighted-sum probability matrix, 2) log-probabilities, and 3) weighted by relative entropy. MotifbreakR can predict effects for novel or previously described variants in public databases, making it suitable for tasks beyond the scope of its original design. Lastly, it can be used to interrogate any genome curated within Bioconductor (currently there are 32 species, a total of 109 versions).

Maintained by Simon Gert Coetzee. Last updated 5 months ago.

chipseq visualization motifannotation transcription

16.7 match 28 stars 8.96 score 103 scripts

thongphamthe

PAFit:Generative Mechanism Estimation in Temporal Complex Networks

Statistical methods for estimating preferential attachment and node fitness generative mechanisms in temporal complex networks are provided. Thong Pham et al. (2015) <doi:10.1371/journal.pone.0137796>. Thong Pham et al. (2016) <doi:10.1038/srep32558>. Thong Pham et al. (2020) <doi:10.18637/jss.v092.i03>. Thong Pham et al. (2021) <doi:10.1093/comnet/cnab024>.

Maintained by Thong Pham. Last updated 12 months ago.

complex-networks fit-get-richer general-preferential-attachment minorize-maximization preferential-attachment rich-get-richer scale-free temporal-networks cpp openmp

23.0 match 17 stars 6.47 score 70 scripts

chiliubio

microeco:Microbial Community Ecology Data Analysis

A series of statistical and plotting approaches in microbial community ecology based on the R6 class. The classes are designed for data preprocessing, taxa abundance plotting, alpha diversity analysis, beta diversity analysis, differential abundance test, null model analysis, network analysis, machine learning, environmental data analysis and functional analysis.

Maintained by Chi Liu. Last updated 7 days ago.

14.7 match 219 stars 10.11 score 211 scripts 3 dependents

luomus

finbif:Interface for the 'Finnish Biodiversity Information Facility' API

A programmatic interface to the 'Finnish Biodiversity Information Facility' ('FinBIF') API (<https://api.laji.fi>). 'FinBIF' aggregates Finnish biodiversity data from multiple sources in a single open access portal for researchers, citizen scientists, industry and government. 'FinBIF' allows users of biodiversity information to find, access, combine and visualise data on Finnish plants, animals and microorganisms. The 'finbif' package makes the publicly available data in 'FinBIF' easily accessible to programmers. Biodiversity information is available on taxonomy and taxon occurrence. Occurrence data can be filtered by taxon, time, location and other variables. The data accessed are conveniently preformatted for subsequent analyses.

Maintained by William K. Morris. Last updated 8 days ago.

api biodiversity biodiversity-informatics biodiversity-information finbif finbif-access occurrences r-programming species specimens taxon taxonomy web-services

17.9 match 5 stars 8.15 score 42 scripts 3 dependents

bioc

derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach

This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bounded ChIP-seq peaks.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq chipseq differentialpeakcalling software immunooncology coverage annotation-agnostic bioconductor derfinder

14.5 match 42 stars 10.03 score 78 scripts 6 dependents

mlampros

fastText:Efficient Learning of Word Representations and Sentence Classification

An interface to the 'fastText' <https://github.com/facebookresearch/fastText> library for efficient learning of word representations and sentence classification. The 'fastText' algorithm is explained in detail in (i) "Enriching Word Vectors with subword Information", Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov, 2017, <doi:10.1162/tacl_a_00051>; (ii) "Bag of Tricks for Efficient Text Classification", Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov, 2017, <doi:10.18653/v1/e17-2068>; (iii) "FastText.zip: Compressing text classification models", Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Herve Jegou, Tomas Mikolov, 2016, <arXiv:1612.03651>.

Maintained by Lampros Mouselimis. Last updated 1 years ago.

cpp11 fasttext cpp

19.6 match 42 stars 7.37 score 56 scripts

viralemergence

insectDisease:Ecological Database of the World's Insect Pathogens

David Onstad provided us with this insect disease database, sometimes referred to as the 'Ecological Database of the Worlds Insect Pathogens' or EDWIP. Files have been converted from 'SQL' to csv, and ported into 'R' for easy exploration and analysis. Thanks to the Macroecology of Infectious Disease Research Coordination Network (RCN) for funding and support. Data are also served online in a static format at <https://edwip.ecology.uga.edu/>.

Maintained by Tad Dallas. Last updated 2 months ago.

32.7 match 13 stars 4.41 score 2 scripts

billpetti

baseballr:Acquiring and Analyzing Baseball Data

Provides numerous utilities for acquiring and analyzing baseball data from online sources such as 'Baseball Reference' <https://www.baseball-reference.com/>, 'FanGraphs' <https://www.fangraphs.com/>, and the 'MLB Stats' API <https://www.mlb.com/>.

Maintained by Saiem Gilani. Last updated 4 months ago.

baseball pitchfx sabermetrics statcast

16.0 match 380 stars 8.98 score 582 scripts

framverse

framrosetta:FRAM LUTs and mappings

Look-up tables and convenience functions for working with FRAM tables.

Maintained by Ty Garber. Last updated 2 months ago.

36.1 match 1 stars 3.95 score 3 scripts 1 dependents

christopherkenny

congress:Access the Congress.gov API

Download and read data on United States congressional proceedings. Data is read from the Library of Congress's Congress.gov Application Programming Interface (<https://github.com/LibraryOfCongress/api.congress.gov/>). Functions exist for all version 3 endpoints, including for bills, amendments, congresses, summaries, members, reports, communications, nominations, and treaties.

Maintained by Christopher T. Kenny. Last updated 8 months ago.

32.7 match 14 stars 4.35 score 16 scripts

n8thangreen

BCEA:Bayesian Cost Effectiveness Analysis

Produces an economic evaluation of a sample of suitable variables of cost and effectiveness / utility for two or more interventions, e.g. from a Bayesian model in the form of MCMC simulations. This package computes the most cost-effective alternative and produces graphical summaries and probabilistic sensitivity analysis, see Baio et al (2017) <doi:10.1007/978-3-319-55718-2>.

Maintained by Gianluca Baio. Last updated 2 months ago.

bayesian cost-effectiveness

14.3 match 3 stars 9.90 score 243 scripts 3 dependents

rich-iannone

DiagrammeR:Graph/Network Visualization

Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allow for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.

Maintained by Richard Iannone. Last updated 2 months ago.

graph graph-functions network-graph property-graph visualization

9.3 match 1.7k stars 15.18 score 3.8k scripts 87 dependents

bioc

Informeasure:R implementation of information measures

This package consolidates a comprehensive set of information measurements, encompassing mutual information, conditional mutual information, interaction information, partial information decomposition, and part mutual information.

Maintained by Chu Pan. Last updated 5 months ago.

geneexpression networkinference network software

31.2 match 3 stars 4.48 score 4 scripts

reichlab

zoltr:Interface to the 'Zoltar' Forecast Repository API

'Zoltar' <https://www.zoltardata.com/> is a website that provides a repository of model forecast results in a standardized format and a central location. It supports storing, retrieving, comparing, and analyzing time series forecasts for prediction challenges of interest to the modeling community. This package provides functions for working with the 'Zoltar' API, including connecting and authenticating, getting meta information (projects, models, and forecasts, and truth), and uploading, downloading, and deleting forecast and truth data.

Maintained by Matthew Cornell. Last updated 12 days ago.

18.4 match 2 stars 7.58 score 175 scripts 3 dependents

bioc

oligo:Preprocessing tools for oligonucleotide arrays

A package to analyze oligonucleotide arrays (expression/SNP/tiling/exon) at probe-level. It currently supports Affymetrix (CEL files) and NimbleGen arrays (XYS files).

Maintained by Benilton Carvalho. Last updated 10 days ago.

microarray onechannel twochannel preprocessing snp differentialexpression exonarray geneexpression dataimport zlib

13.2 match 3 stars 10.42 score 528 scripts 10 dependents

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information; freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offer a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 1 months ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

11.6 match 55 stars 11.90 score 1.2k scripts 2 dependents

bioc

biomaRt:Interface to BioMart databases (i.e. Ensembl)

In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintain by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.

Maintained by Mike Smith. Last updated 4 days ago.

annotation bioconductor biomart ensembl

8.6 match 38 stars 15.99 score 13k scripts 230 dependents

bioc

ChIPseeker:ChIPseeker for ChIP peak Annotation, Comparison, and Visualization

This package implements functions to retrieve the nearest genes around the peak, annotate genomic region of the peak, statstical methods for estimate the significance of overlap among ChIP peak data sets, and incorporate GEO database for user to compare the own dataset with those deposited in database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation chipseq software visualization multiplecomparison atac-seq chip-seq comparison epigenetics epigenomics

10.5 match 234 stars 13.02 score 1.6k scripts 5 dependents

dmurdoch

rgl:3D Visualization Using OpenGL

Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.). Output may be on screen using OpenGL, or to various standard 3D file formats including WebGL, PLY, OBJ, STL as well as 2D image formats, including PNG, Postscript, SVG, PGF.

Maintained by Duncan Murdoch. Last updated 2 months ago.

graphics opengl rgl webgl libglu libglvnd libpng libx11 freetype cpp

7.8 match 91 stars 17.49 score 7.3k scripts 300 dependents

acorg

Racmacs:Antigenic Cartography Macros

A toolkit for making antigenic maps from immunological assay data, in order to quantify and visualize antigenic differences between different pathogen strains as described in Smith et al. (2004) <doi:10.1126/science.1097211> and used in the World Health Organization influenza vaccine strain selection process. Additional functions allow for the diagnostic evaluation of antigenic maps and an interactive viewer is provided to explore antigenic relationships amongst several strains and incorporate the visualization of associated genetic information.

Maintained by Sam Wilks. Last updated 9 months ago.

openblas cpp openmp

16.7 match 21 stars 8.06 score 362 scripts

conjugateprior

twfy:Drive the API for TheyWorkForYou

An R wrapper around the API of TheyWorkForYou, a parliamentary monitoring site that scrapes and repackages Hansard (the UK's parliamentary record) and augments it with information from the Register of Members' Interests, election results, and voting records to provide a unified source of information about UK legislators and their activities. See <http://www.theyworkforyou.com> for details.

Maintained by Will Lowe. Last updated 6 years ago.

28.9 match 9 stars 4.65 score 3 scripts

bioc

msPurity:Automated Evaluation of Precursor Ion Purity for Mass Spectrometry Based Fragmentation in Metabolomics

msPurity R package was developed to: 1) Assess the spectral quality of fragmentation spectra by evaluating the "precursor ion purity". 2) Process fragmentation spectra. 3) Perform spectral matching. What is precursor ion purity? -What we call "Precursor ion purity" is a measure of the contribution of a selected precursor peak in an isolation window used for fragmentation. The simple calculation involves dividing the intensity of the selected precursor peak by the total intensity of the isolation window. When assessing MS/MS spectra this calculation is done before and after the MS/MS scan of interest and the purity is interpolated at the recorded time of the MS/MS acquisition. Additionally, isotopic peaks can be removed, low abundance peaks are removed that are thought to have limited contribution to the resulting MS/MS spectra and the isolation efficiency of the mass spectrometer can be used to normalise the intensities used for the calculation.

Maintained by Thomas N. Lawson. Last updated 5 months ago.

massspectrometry metabolomics software bioconductor-package dims fragmentation lc-ms lc-msms mass-spectrometry precursor-ion-purity

18.9 match 15 stars 7.03 score 44 scripts

tjetka

SLEMI:Statistical Learning Based Estimation of Mutual Information

The implementation of the algorithm for estimation of mutual information and channel capacity from experimental data by classification procedures (logistic regression). Technically, it allows to estimate information-theoretic measures between finite-state input and multivariate, continuous output. Method described in Jetka et al. (2019) <doi:10.1371/journal.pcbi.1007132>.

Maintained by Tomasz Jetka. Last updated 1 years ago.

channel-capacity information-theory logistic-regression mutual-information-estimation

26.7 match 4 stars 4.92 score 21 scripts

yuimaproject

yuima:The YUIMA Project Package for SDEs

Simulation and Inference for SDEs and Other Stochastic Processes.

Maintained by Stefano M. Iacus. Last updated 5 days ago.

openblas cpp

18.0 match 9 stars 7.26 score 92 scripts 2 dependents

covaruber

sommer:Solving Mixed Model Equations in R

Structural multivariate-univariate linear mixed model solver for estimation of multiple random effects with unknown variance-covariance structures (e.g., heterogeneous and unstructured) and known covariance among levels of random effects (e.g., pedigree and genomic relationship matrices) (Covarrubias-Pazaran, 2016 <doi:10.1371/journal.pone.0156744>; Maier et al., 2015 <doi:10.1016/j.ajhg.2014.12.006>; Jensen et al., 1997). REML estimates can be obtained using the Direct-Inversion Newton-Raphson and Direct-Inversion Average Information algorithms for the problems r x r (r being the number of records) or using the Henderson-based average information algorithm for the problem c x c (c being the number of coefficients to estimate). Spatial models can also be fitted using the two-dimensional spline functionality available.

Maintained by Giovanny Covarrubias-Pazaran. Last updated 24 days ago.

average-information mixed-models rcpparmadillo openblas cpp openmp

10.3 match 43 stars 12.70 score 300 scripts 9 dependents

ropensci

lightr:Read Spectrometric Data and Metadata

Parse various reflectance/transmittance/absorbance spectra file formats to extract spectral data and metadata, as described in Gruson, White & Maia (2019) <doi:10.21105/joss.01857>. Among other formats, it can import files from 'Avantes' <https://www.avantes.com/>, 'CRAIC' <https://www.microspectra.com/>, and 'OceanOptics'/'OceanInsight' <https://www.oceanoptics.com/> brands.

Maintained by Hugo Gruson. Last updated 1 months ago.

file-import reproducibility reproducible-research reproducible-science spectral-data spectroscopy

18.3 match 13 stars 7.11 score 11 scripts 2 dependents

dynverse

dynwrap:Representing and Inferring Single-Cell Trajectories

Provides functionality to infer trajectories from single-cell data, represent them into a common format, and adapt them. Other biological information can also be added, such as cellular grouping, RNA velocity and annotation. Saelens et al. (2019) <doi:10.1038/s41587-019-0071-9>.

Maintained by Robrecht Cannoodt. Last updated 2 years ago.

trajectory-inference

17.2 match 16 stars 7.48 score 159 scripts 1 dependents

usepa

tcpl:ToxCast Data Analysis Pipeline

The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.

Maintained by Jason Brown. Last updated 5 days ago.

ccte comptox ord

13.7 match 36 stars 9.41 score 90 scripts

thibautjombart

adegenet:Exploratory Analysis of Genetic and Genomic Data

Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure ('genind' class), alleles counts by populations ('genpop'), and genome-wide SNP data ('genlight'). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.

Maintained by Zhian N. Kamvar. Last updated 1 months ago.

10.2 match 182 stars 12.60 score 1.9k scripts 29 dependents

merck

metalite:ADaM Metadata Structure

A metadata structure for clinical data analysis and reporting based on Analysis Data Model (ADaM) datasets. The package simplifies clinical analysis and reporting tool development by defining standardized inputs, outputs, and workflow. The package can be used to create analysis and reporting planning grid, mock table, and validated analysis and reporting results based on consistent inputs.

Maintained by Yujie Zhao. Last updated 7 months ago.

cdisc clinical-trials metadata

14.1 match 15 stars 9.01 score 57 scripts 5 dependents

bioc

decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data

Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.

Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.

differentialexpression functionalgenomics geneexpression generegulation network software statisticalmethod transcription

11.2 match 230 stars 11.27 score 316 scripts 3 dependents

bioc

biodb:biodb, a library and a development framework for connecting to chemical and biological databases

The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.

Maintained by Pierrick Roger. Last updated 5 months ago.

software infrastructure dataimport kegg biology cheminformatics chemistry databases cpp

16.0 match 11 stars 7.85 score 24 scripts 6 dependents

hfgolino

EGAnet:Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics

Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.

Maintained by Hudson Golino. Last updated 1 days ago.

16.0 match 47 stars 7.83 score 61 scripts 1 dependents

bioc

cfTools:Informatics Tools for Cell-Free DNA Study

The cfTools R package provides methods for cell-free DNA (cfDNA) methylation data analysis to facilitate cfDNA-based studies. Given the methylation sequencing data of a cfDNA sample, for each cancer marker or tissue marker, we deconvolve the tumor-derived or tissue-specific reads from all reads falling in the marker region. Our read-based deconvolution algorithm exploits the pervasiveness of DNA methylation for signal enhancement, therefore can sensitively identify a trace amount of tumor-specific or tissue-specific cfDNA in plasma. cfTools provides functions for (1) cancer detection: sensitively detect tumor-derived cfDNA and estimate the tumor-derived cfDNA fraction (tumor burden); (2) tissue deconvolution: infer the tissue type composition and the cfDNA fraction of multiple tissue types for a plasma cfDNA sample. These functions can serve as foundations for more advanced cfDNA-based studies, including cancer diagnosis and disease monitoring.

Maintained by Ran Hu. Last updated 5 months ago.

software biomedicalinformatics epigenetics sequencing methylseq dnamethylation differentialmethylation cpp

23.2 match 7 stars 5.32 score 2 scripts

usepa

httk:High-Throughput Toxicokinetics

Pre-made models that can be rapidly tailored to various chemicals and species using chemical-specific in vitro data and physiological information. These tools allow incorporation of chemical toxicokinetics ("TK") and in vitro-in vivo extrapolation ("IVIVE") into bioinformatics, as described by Pearce et al. (2017) (<doi:10.18637/jss.v079.i04>). Chemical-specific in vitro data characterizing toxicokinetics have been obtained from relatively high-throughput experiments. The chemical-independent ("generic") physiologically-based ("PBTK") and empirical (for example, one compartment) "TK" models included here can be parameterized with in vitro data or in silico predictions which are provided for thousands of chemicals, multiple exposure routes, and various species. High throughput toxicokinetics ("HTTK") is the combination of in vitro data and generic models. We establish the expected accuracy of HTTK for chemicals without in vivo data through statistical evaluation of HTTK predictions for chemicals where in vivo data do exist. The models are systems of ordinary differential equations that are developed in MCSim and solved using compiled (C-based) code for speed. A Monte Carlo sampler is included for simulating human biological variability (Ring et al., 2017 <doi:10.1016/j.envint.2017.06.004>) and propagating parameter uncertainty (Wambaugh et al., 2019 <doi:10.1093/toxsci/kfz205>). Empirically calibrated methods are included for predicting tissue:plasma partition coefficients and volume of distribution (Pearce et al., 2017 <doi:10.1007/s10928-017-9548-7>). These functions and data provide a set of tools for using IVIVE to convert concentrations from high-throughput screening experiments (for example, Tox21, ToxCast) to real-world exposures via reverse dosimetry (also known as "RTK") (Wetmore et al., 2015 <doi:10.1093/toxsci/kfv171>).

Maintained by John Wambaugh. Last updated 1 months ago.

comptox ord

12.0 match 27 stars 10.22 score 307 scripts 1 dependents

hneth

riskyr:Rendering Risk Literacy more Transparent

Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.

Maintained by Hansjoerg Neth. Last updated 10 months ago.

2x2-matrix bayesian-inference contingency-table representation risk risk-literacy visualization

16.6 match 19 stars 7.36 score 80 scripts

uscbiostats

partition:Agglomerative Partitioning Framework for Dimension Reduction

A fast and flexible framework for agglomerative partitioning. 'partition' uses an approach called Direct-Measure-Reduce to create new variables that maintain the user-specified minimum level of information. Each reduced variable is also interpretable: the original variables map to one and only one variable in the reduced data set. 'partition' is flexible, as well: how variables are selected to reduce, how information loss is measured, and the way data is reduced can all be customized. 'partition' is based on the Partition framework discussed in Millstein et al. (2020) <doi:10.1093/bioinformatics/btz661>.

Maintained by Malcolm Barrett. Last updated 4 months ago.

data-reduction dimensionality-reduction partitional-clustering openblas cpp

15.6 match 36 stars 7.72 score 27 scripts 1 dependents

rmaia

pavo:Perceptual Analysis, Visualization and Organization of Spectral Colour Data

A cohesive framework for the spectral and spatial analysis of colour described in Maia, Eliason, Bitton, Doucet & Shawkey (2013) <doi:10.1111/2041-210X.12069> and Maia, Gruson, Endler & White (2019) <doi:10.1111/2041-210X.13174>.

Maintained by Thomas White. Last updated 1 months ago.

12.4 match 72 stars 9.72 score 151 scripts 1 dependents

bioc

GenomeInfoDb:Utilities for manipulating chromosome names, including modifying them to follow a particular naming style

Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.

Maintained by Hervé Pagès. Last updated 2 months ago.

genetics datarepresentation annotation genomeannotation bioconductor-package core-package

7.3 match 32 stars 16.46 score 1.3k scripts 1.7k dependents

e-sensing

sits:Satellite Image Time Series Analysis for Earth Observation Data Cubes

An end-to-end toolkit for land use and land cover classification using big Earth observation data, based on machine learning methods applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>. Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, Brazil Data Cube, Copernicus Data Space Environment (CDSE), Digital Earth Africa, Digital Earth Australia, NASA HLS using the Spatio-temporal Asset Catalog (STAC) protocol (<https://stacspec.org/>) and the 'gdalcubes' R package developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>. Supports visualization methods for images and time series and smoothing filters for dealing with noisy time series. Includes functions for quality assessment of training samples using self-organized maps as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. Includes methods to reduce training samples imbalance proposed by Chawla et al (2002) <doi:10.1613/jair.953>. Provides machine learning methods including support vector machines, random forests, extreme gradient boosting, multi-layer perceptrons, temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, and temporal attention encoders by Garnot and Landrieu (2020) <doi:10.48550/arXiv.2007.00586>. Supports GPU processing of deep learning models using torch <https://torch.mlverse.org/>. Performs efficient classification of big Earth observation data cubes and includes functions for post-classification smoothing based on Bayesian inference as described by Camara et al (2024) <doi:10.3390/rs16234572>, and methods for active learning and uncertainty assessment. Supports region-based time series analysis using package supercells <https://jakubnowosad.com/supercells/>. Enables best practices for estimating area and assessing accuracy of land change as recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>. Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.

Maintained by Gilberto Camara. Last updated 1 months ago.

big-earth-data cbers earth-observation eo-datacubes geospatial image-time-series land-cover-classification landsat planetary-computer r-spatial remote-sensing rspatial satellite-image-time-series satellite-imagery sentinel-2 stac-api stac-catalog cpp

12.6 match 494 stars 9.50 score 384 scripts

philchalmers

SimDesign:Structure for Organizing Monte Carlo Simulation Designs

Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.

Maintained by Phil Chalmers. Last updated 1 days ago.

monte-carlo-simulation simulation simulation-framework

8.9 match 62 stars 13.38 score 253 scripts 46 dependents

rstudio

pointblank:Data Validation and Organization of Metadata for Local and Remote Tables

Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.

Maintained by Richard Iannone. Last updated 11 days ago.

data-assertions data-checker data-dictionaries data-frames data-inference data-management data-profiler data-quality data-validation data-verification database-tables easy-to-understand reporting-tool schema-validation testing-tools yaml-configuration

11.2 match 932 stars 10.59 score 284 scripts

jinseob2kim

jstable:Create Tables from Different Types of Regression

Create regression tables from generalized linear model(GLM), generalized estimating equation(GEE), generalized linear mixed-effects model(GLMM), Cox proportional hazards model, survey-weighted generalized linear model(svyglm) and survey-weighted Cox model results for publication.

Maintained by Jinseob Kim. Last updated 2 days ago.

label regression table

11.6 match 27 stars 10.10 score 199 scripts 1 dependents

chjackson

flexsurv:Flexible Parametric Survival and Multi-State Models

Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.

Maintained by Christopher Jackson. Last updated 2 months ago.

cpp

8.8 match 57 stars 13.31 score 632 scripts 43 dependents

bioc

SIMAT:GC-SIM-MS data processing and alaysis tool

This package provides a pipeline for analysis of GC-MS data acquired in selected ion monitoring (SIM) mode. The tool also provides a guidance in choosing appropriate fragments for the targets of interest by using an optimization algorithm. This is done by considering overlapping peaks from a provided library by the user.

Maintained by M. R. Nezami Ranjbar. Last updated 5 months ago.

immunooncology software metabolomics massspectrometry

27.4 match 4.26 score 1 scripts

atahk

pscl:Political Science Computational Laboratory

Bayesian analysis of item-response theory (IRT) models, roll call analysis; computing highest density regions; maximum likelihood estimation of zero-inflated and hurdle models for count data; goodness-of-fit measures for GLMs; data sets used in writing and teaching; seats-votes curves.

Maintained by Simon Jackman. Last updated 1 years ago.

8.7 match 67 stars 13.28 score 2.7k scripts 54 dependents

darth-git

dampack:Decision-Analytic Modeling Package

A suite of functions for analyzing and visualizing the health economic outputs of mathematical models. This package was developed with funding from the National Institutes of Allergy and Infectious Diseases of the National Institutes of Health under award no. R01AI138783. The content of this package is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The theoretical underpinnings of 'dampack''s functionality are detailed in Hunink et al. (2014) <doi:10.1017/CBO9781139506779>.

Maintained by David Garibay. Last updated 6 months ago.

13.7 match 35 stars 8.35 score 161 scripts

jeromeecoac

seewave:Sound Analysis and Synthesis

Functions for analysing, manipulating, displaying, editing and synthesizing time waves (particularly sound). This package processes time analysis (oscillograms and envelopes), spectral content, resonance quality factor, entropy, cross correlation and autocorrelation, zero-crossing, dominant frequency, analytic signal, frequency coherence, 2D and 3D spectrograms and many other analyses. See Sueur et al. (2008) <doi:10.1080/09524622.2008.9753600> and Sueur (2018) <doi:10.1007/978-3-319-77647-7>.

Maintained by Jerome Sueur. Last updated 1 years ago.

12.9 match 18 stars 8.88 score 880 scripts 23 dependents

majianthu

copent:Estimating Copula Entropy and Transfer Entropy

The nonparametric methods for estimating copula entropy, transfer entropy, and the statistics for multivariate normality test and two-sample test are implemented. The methods for estimating transfer entropy and the statistics for multivariate normality test and two-sample test are based on the method for estimating copula entropy. The method for change point detection with copula entropy based two-sample test is also implemented. Please refer to Ma and Sun (2011) <doi:10.1016/S1007-0214(11)70008-6>, Ma (2019) <doi:10.48550/arXiv.1910.04375>, Ma (2022) <doi:10.48550/arXiv.2206.05956>, Ma (2023) <doi:10.48550/arXiv.2307.07247>, and Ma (2024) <doi:10.48550/arXiv.2403.07892> for more information.

Maintained by MA Jian. Last updated 9 months ago.

causal-discovery causality change-point-detection conditional-independence-test conditional-mutual-information copula copula-entropy correlation entropy granger-causality information-theory mutual-information mutualinf normality-test transfer-entropy two-sample-test variable-selection

22.2 match 41 stars 5.15 score 23 scripts 1 dependents

rfhb

ctrdata:Retrieve and Analyze Clinical Trials in Public Registers

A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', <https://www.clinicaltrialsregister.eu/>), 'ClinicalTrials.gov' (<https://clinicaltrials.gov/> and also translating queries the retired classic interface), the 'ISRCTN' (<http://www.isrctn.com/>) and the 'European Union Clinical Trials Information System' ('CTIS', <https://euclinicaltrials.eu/>). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for meta-analysis and trend-analysis of the design and conduct as well as of the results of clinical trials across registers.

Maintained by Ralf Herold. Last updated 3 days ago.

clinical-data clinical-research clinical-studies clinical-trials ctgov database duckdb mongodb nodbi postgresql register sqlite studies trial

14.3 match 45 stars 7.92 score 32 scripts

meireles

spectrolab:Class and Methods for Spectral Data

Input/Output, processing and visualization of spectra taken with different spectrometers, including SVC (Spectra Vista), ASD and PSR (Spectral Evolution). Implements an S3 class spectra that other packages can build on. Provides methods to access, plot, manipulate, splice sensor overlap, vector normalize and smooth spectra.

Maintained by Jose Eduardo Meireles. Last updated 2 months ago.

science

15.3 match 16 stars 7.39 score 256 scripts

forestscientist

StemAnalysis:Reconstructing Tree Growth and Carbon Accumulation with Stem Analysis Data

Use stem analysis data to reconstructing tree growth and carbon accumulation. Users can independently or in combination perform a number of standard tasks for any tree species. (i) Age class determination. (ii) The cumulative growth, mean annual increment, and current annual increment of diameter at breast height (DBH) with bark, tree height, and stem volume with bark are estimated. (iii) Tree biomass and carbon storage estimation from volume and allometric models are calculated. (iv) Height-diameter relationship is fitted with nonlinear models, if diameter at breast height (DBH) or tree height are available, which can be used to retrieve tree height and diameter at breast height (DBH). <https://github.com/forestscientist/StemAnalysis>.

Maintained by Huili Wu. Last updated 2 years ago.

25.8 match 4 stars 4.38 score 12 scripts

adeverse

adespatial:Multivariate Multiscale Spatial Analysis

Tools for the multiscale spatial analysis of multivariate data. Several methods are based on the use of a spatial weighting matrix and its eigenvector decomposition (Moran's Eigenvectors Maps, MEM). Several approaches are described in the review Dray et al (2012) <doi:10.1890/11-1183.1>.

Maintained by Aurélie Siberchicot. Last updated 14 days ago.

openblas

10.2 match 36 stars 11.06 score 398 scripts 2 dependents

mikldk

malan:MAle Lineage ANalysis

MAle Lineage ANalysis by simulating genealogies backwards and imposing short tandem repeats (STR) mutations forwards. Intended for forensic Y chromosomal STR (Y-STR) haplotype analyses. Numerous analyses are possible, e.g. number of matches and meiotic distance to matches. Refer to papers mentioned in citation("malan") (DOI's: <doi:10.1371/journal.pgen.1007028>, <doi:10.21105/joss.00684> and <doi:10.1016/j.fsigen.2018.10.004>).

Maintained by Mikkel Meyer Andersen. Last updated 1 years ago.

openblas cpp

24.8 match 4.48 score 6 scripts

r-forge

RHRV:Heart Rate Variability Analysis of ECG Data

Allows users to import data files containing heartbeat positions in the most broadly used formats, to remove outliers or points with unacceptable physiological values present in the time series, to plot HRV data, and to perform time domain, frequency domain and nonlinear HRV analysis. See Garcia et al. (2017) <DOI:10.1007/978-3-319-65355-6>.

Maintained by Leandro Rodriguez-Linares. Last updated 6 months ago.

16.4 match 6.79 score 63 scripts 1 dependents

bioc

scDblFinder:scDblFinder

The scDblFinder package gathers various methods for the detection and handling of doublets/multiplets in single-cell sequencing data (i.e. multiple cells captured within the same droplet or reaction volume). It includes methods formerly found in the scran package, the new fast and comprehensive scDblFinder method, and a reimplementation of the Amulet detection method for single-cell ATAC-seq.

Maintained by Pierre-Luc Germain. Last updated 2 months ago.

preprocessing singlecell rnaseq atacseq doublets single-cell

9.0 match 184 stars 12.34 score 888 scripts 1 dependents

bioc

SBGNview:"SBGNview: Data Analysis, Integration and Visualization on SBGN Pathways"

SBGNview is a tool set for pathway based data visalization, integration and analysis. SBGNview is similar and complementary to the widely used Pathview, with the following key features: 1. Pathway definition by the widely adopted Systems Biology Graphical Notation (SBGN); 2. Supports multiple major pathway databases beyond KEGG (Reactome, MetaCyc, SMPDB, PANTHER, METACROP) and user defined pathways; 3. Covers 5,200 reference pathways and over 3,000 species by default; 4. Extensive graphics controls, including glyph and edge attributes, graph layout and sub-pathway highlight; 5. SBGN pathway data manipulation, processing, extraction and analysis.

Maintained by Weijun Luo. Last updated 5 months ago.

genetarget pathways graphandnetwork visualization genesetenrichment differentialexpression geneexpression microarray rnaseq genetics metabolomics proteomics systemsbiology sequencing

17.7 match 26 stars 6.23 score 22 scripts

patzaw

BED:Biological Entity Dictionary (BED)

An interface for the 'Neo4j' database providing mapping between different identifiers of biological entities. This Biological Entity Dictionary (BED) has been developed to address three main challenges. The first one is related to the completeness of identifier mappings. Indeed, direct mapping information provided by the different systems are not always complete and can be enriched by mappings provided by other resources. More interestingly, direct mappings not identified by any of these resources can be indirectly inferred by using mappings to a third reference. For example, many human Ensembl gene ID are not directly mapped to any Entrez gene ID but such mappings can be inferred using respective mappings to HGNC ID. The second challenge is related to the mapping of deprecated identifiers. Indeed, entity identifiers can change from one resource release to another. The identifier history is provided by some resources, such as Ensembl or the NCBI, but it is generally not used by mapping tools. The third challenge is related to the automation of the mapping process according to the relationships between the biological entities of interest. Indeed, mapping between gene and protein ID scopes should not be done the same way than between two scopes regarding gene ID. Also, converting identifiers from different organisms should be possible using gene orthologs information. The method has been published by Godard and van Eyll (2018) <doi:10.12688/f1000research.13925.3>.

Maintained by Patrice Godard. Last updated 3 months ago.

16.1 match 8 stars 6.85 score 25 scripts

bart1

move:Visualizing and Analyzing Animal Track Data

Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, among others functions to calculate dynamic Brownian Bridge Movement Models. Move helps addressing movement ecology questions.

Maintained by Bart Kranstauber. Last updated 4 months ago.

cpp

12.6 match 8.74 score 690 scripts 3 dependents

aphalo

photobiologySun:Data for Sunlight Spectra

Data for the extraterrestrial solar spectral irradiance and ground level solar spectral irradiance and irradiance. In addition data for shade light under vegetation and irradiance time series from different broadband sensors. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 6 months ago.

21.7 match 5.03 score 27 scripts

dataoneorg

dataone:R Interface to the DataONE REST API

Provides read and write access to data and metadata from the DataONE network <https://www.dataone.org> of data repositories. Each DataONE repository implements a consistent repository application programming interface. Users call methods in R to access these remote repository functions, such as methods to query the metadata catalog, get access to metadata for particular data packages, and read the data objects from the data repository. Users can also insert and update data objects on repositories that support these methods.

Maintained by Matthew B. Jones. Last updated 3 years ago.

10.8 match 36 stars 9.93 score 472 scripts 3 dependents

kassambara

survminer:Drawing Survival Curves using 'ggplot2'

Contains the function 'ggsurvplot()' for drawing easily beautiful and 'ready-to-publish' survival curves with the 'number at risk' table and 'censoring count plot'. Other functions are also available to plot adjusted curves for `Cox` model and to visually examine 'Cox' model assumptions.

Maintained by Alboukadel Kassambara. Last updated 5 months ago.

6.7 match 524 stars 15.87 score 7.0k scripts 55 dependents

bioc

ensembldb:Utilities to create and use Ensembl-based annotation databases

The package provides functions to create and use transcript centric annotation databases/packages. The annotation for the databases are directly fetched from Ensembl using their Perl API. The functionality and data is similar to that of the TxDb packages from the GenomicFeatures package, but, in addition to retrieve all gene/transcript models and annotations from the database, ensembldb provides a filter framework allowing to retrieve annotations for specific entries like genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb contain also protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.

Maintained by Johannes Rainer. Last updated 5 months ago.

genetics annotationdata sequencing coverage annotation bioconductor bioconductor-packages ensembl

7.5 match 35 stars 14.08 score 892 scripts 108 dependents

bioc

miRBaseConverter:A comprehensive and high-efficiency tool for converting and retrieving the information of miRNAs in different miRBase versions

A comprehensive tool for converting and retrieving the miRNA Name, Accession, Sequence, Version, History and Family information in different miRBase versions. It can process a huge number of miRNAs in a short time without other depends.

Maintained by Taosheng Xu Taosheng Xu. Last updated 5 months ago.

software mirna

16.1 match 1 stars 6.50 score 70 scripts

nlmixr2

rxode2:Facilities for Simulating from ODE-Based Models

Facilities for running simulations from ordinary differential equation ('ODE') models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: The use of this package requires both C and Fortran compilers, for details on their use with R please see Section 6.3, Appendix A, and Appendix D in the "R Administration and Installation" manual. Also the code is mostly released under GPL. The 'VODE' and 'LSODA' are in the public domain. The information is available in the inst/COPYRIGHTS.

Maintained by Matthew L. Fidler. Last updated 1 months ago.

fortran openblas cpp openmp

9.3 match 40 stars 11.24 score 220 scripts 13 dependents

bioc

Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data

The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classifies these into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type. This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.

Maintained by Matteo Tiberti. Last updated 2 months ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network survival genesetenrichment networkenrichment

15.7 match 5 stars 6.59 score 43 scripts

bioc

rcellminer:rcellminer: Molecular Profiles, Drug Response, and Chemical Structures for the NCI-60 Cell Lines

The NCI-60 cancer cell line panel has been used over the course of several decades as an anti-cancer drug screen. This panel was developed as part of the Developmental Therapeutics Program (DTP, http://dtp.nci.nih.gov/) of the U.S. National Cancer Institute (NCI). Thousands of compounds have been tested on the NCI-60, which have been extensively characterized by many platforms for gene and protein expression, copy number, mutation, and others (Reinhold, et al., 2012). The purpose of the CellMiner project (http://discover.nci.nih.gov/ cellminer) has been to integrate data from multiple platforms used to analyze the NCI-60 and to provide a powerful suite of tools for exploration of NCI-60 data.

Maintained by Augustin Luna. Last updated 5 months ago.

acgh cellbasedassays copynumbervariation geneexpression pharmacogenomics pharmacogenetics mirna cheminformatics visualization software systemsbiology

18.0 match 5.71 score 113 scripts

aphalo

photobiologyInOut:Read Spectral and Logged Data from Foreign Files

Functions for reading, and in some cases writing, foreign files containing spectral data from spectrometers and their associated software, output from daylight simulation models in common use, and some spectral data repositories. As well as functions for exchange of spectral data with other R packages. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 19 days ago.

16.2 match 6.32 score 63 scripts 1 dependents

crj32

Spectrum:Fast Adaptive Spectral Clustering for Single and Multi-View Data

A self-tuning spectral clustering method for single or multi-view data. 'Spectrum' uses a new type of adaptive density aware kernel that strengthens connections in the graph based on common nearest neighbours. It uses a tensor product graph data integration and diffusion procedure to integrate different data sources and reduce noise. 'Spectrum' uses either the eigengap or multimodality gap heuristics to determine the number of clusters. The method is sufficiently flexible so that a wide range of Gaussian and non-Gaussian structures can be clustered with automatic selection of K.

Maintained by Christopher R John. Last updated 5 years ago.

clustering spectral-clustering

17.1 match 7 stars 5.99 score 47 scripts 1 dependents

darwin-eu

DrugUtilisation:Summarise Patient-Level Drug Utilisation in Data Mapped to the OMOP Common Data Model

Summarise patient-level drug utilisation cohorts using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. New users and prevalent users cohorts can be generated and their characteristics, indication and drug use summarised.

Maintained by Martí Català. Last updated 2 months ago.

12.3 match 8.27 score 156 scripts 2 dependents

bioc

Rdisop:Decomposition of Isotopic Patterns

In high resolution mass spectrometry (HR-MS), the measured masses can be decomposed into potential element combinations (chemical sum formulas). Where additional mass/intensity information of respective isotopic peaks is available, decomposition can take this information into account to better rank the potential candidate sum formulas. To compare measured mass/intensity information with the theoretical distribution of candidate sum formulas, the latter needs to be calculated. This package implements fast algorithms to address both tasks, the calculation of isotopic distributions for arbitrary sum formulas (assuming a HR-MS resolution of roughly 30,000), and the ranked list of sum formulas fitting an observed peak or isotopic peak set.

Maintained by Steffen Neumann. Last updated 1 months ago.

immunooncology massspectrometry metabolomics mass-spectrometry cpp

11.1 match 4 stars 9.12 score 111 scripts 2 dependents

ropensci

taxize:Taxonomic Information from Around the Web

Interacts with a suite of web application programming interfaces (API) for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more. Some of the services supported include 'NCBI E-utilities' (<https://www.ncbi.nlm.nih.gov/books/NBK25501/>), 'Encyclopedia of Life' (<https://eol.org/docs/what-is-eol/data-services>), 'Global Biodiversity Information Facility' (<https://techdocs.gbif.org/en/openapi/>), and many more. Links to the API documentation for other supported services are available in the documentation for their respective functions in this package.

Maintained by Zachary Foster. Last updated 14 days ago.

taxonomy biology nomenclature json api web api-client identifiers species names api-wrapper biodiversity darwincore data taxize

7.4 match 274 stars 13.63 score 1.6k scripts 23 dependents

ropensci

biomartr:Genomic Data Retrieval

Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.

Maintained by Hajk-Georg Drost. Last updated 1 months ago.

biomart genomic-data-retrieval annotation-retrieval database-retrieval ncbi ensembl biological-data-retrieval ensembl-servers genome genome-annotation genome-retrieval genomics meta-analysis metagenomics ncbi-genbank peer-reviewed proteome sequenced-genomes

8.8 match 218 stars 11.35 score 129 scripts 3 dependents

bioc

coRdon:Codon Usage Analysis and Prediction of Gene Expressivity

Tool for analysis of codon usage in various unannotated or KEGG/COG annotated DNA sequences. Calculates different measures of CU bias and CU-based predictors of gene expressivity, and performs gene set enrichment analysis for annotated sequences. Implements several methods for visualization of CU and enrichment analysis results.

Maintained by Anamaria Elek. Last updated 5 months ago.

software metagenomics geneexpression genesetenrichment geneprediction visualization kegg pathways genetics cellbiology biomedicalinformatics immunooncology

12.8 match 20 stars 7.71 score 48 scripts 1 dependents

bioc

atSNP:Affinity test for identifying regulatory SNPs

atSNP performs affinity tests of motif matches with the SNP or the reference genomes and SNP-led changes in motif matches.

Maintained by Sunyoung Shin. Last updated 5 months ago.

software chipseq genomeannotation motifannotation visualization cpp

17.1 match 1 stars 5.73 score 36 scripts

rstudio

gt:Easily Create Presentation-Ready Display Tables

Build display tables from tabular data with an easy-to-use set of functions. With its progressive approach, we can construct display tables with a cohesive set of table parts. Table values can be formatted using any of the included formatting functions. Footnotes and cell styles can be precisely added through a location targeting system. The way in which 'gt' handles things for you means that you don't often have to worry about the fine details.

Maintained by Richard Iannone. Last updated 13 days ago.

docx easy-to-use html latex rtf summary-tables

5.3 match 2.1k stars 18.36 score 20k scripts 112 dependents

usdaforestservice

gdalraster:Bindings to the 'Geospatial Data Abstraction Library' Raster API

Interface to the Raster API of the 'Geospatial Data Abstraction Library' ('GDAL', <https://gdal.org>). Bindings are implemented in an exposed C++ class encapsulating a 'GDALDataset' and its raster band objects, along with several stand-alone functions. These support manual creation of uninitialized datasets, creation from existing raster as template, read/set dataset parameters, low level I/O, color tables, raster attribute tables, virtual raster (VRT), and 'gdalwarp' wrapper for reprojection and mosaicing. Includes 'GDAL' algorithms ('dem_proc()', 'polygonize()', 'rasterize()', etc.), and functions for coordinate transformation and spatial reference systems. Calling signatures resemble the native C, C++ and Python APIs provided by the 'GDAL' project. Includes raster 'calc()' to evaluate a given R expression on a layer or stack of layers, with pixel x/y available as variables in the expression; and raster 'combine()' to identify and count unique pixel combinations across multiple input layers, with optional output of the pixel-level combination IDs. Provides raster display using base 'graphics'. Bindings to a subset of the 'OGR' API are also included for managing vector data sources. Bindings to a subset of the Virtual Systems Interface ('VSI') are also included to support operations on 'GDAL' virtual file systems. These are general utility functions that abstract file system operations on URLs, cloud storage services, 'Zip'/'GZip'/'7z'/'RAR' archives, and in-memory files. 'gdalraster' may be useful in applications that need scalable, low-level I/O, or prefer a direct 'GDAL' API.

Maintained by Chris Toney. Last updated 21 hours ago.

gdal geospatial raster vector cpp

10.2 match 42 stars 9.51 score 32 scripts 3 dependents

sfcheung

manymome:Mediation, Moderation and Moderated-Mediation After Model Fitting

Computes indirect effects, conditional effects, and conditional indirect effects in a structural equation model or path model after model fitting, with no need to define any user parameters or label any paths in the model syntax, using the approach presented in Cheung and Cheung (2024) <doi:10.3758/s13428-023-02224-z>. Can also form bootstrap confidence intervals by doing bootstrapping only once and reusing the bootstrap estimates in all subsequent computations. Supports bootstrap confidence intervals for standardized (partially or completely) indirect effects, conditional effects, and conditional indirect effects as described in Cheung (2009) <doi:10.3758/BRM.41.2.425> and Cheung, Cheung, Lau, Hui, and Vong (2022) <doi:10.1037/hea0001188>. Model fitting can be done by structural equation modeling using lavaan() or regression using lm().

Maintained by Shu Fai Cheung. Last updated 25 days ago.

bootstrapping confidence-interval lavaan manymome mediation moderated-mediation moderation regression sem standardized-effect-size structural-equation-modeling

12.0 match 1 stars 8.06 score 172 scripts 4 dependents

insightsengineering

rtables:Reporting Tables

Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.

Maintained by Joe Zhu. Last updated 2 months ago.

pharmaceuticals tables

7.1 match 232 stars 13.65 score 238 scripts 17 dependents

bioc

DAPAR:Tools for the Differential Analysis of Proteins Abundance with R

The package DAPAR is a Bioconductor distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrarily to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).

Maintained by Samuel Wieczorek. Last updated 5 months ago.

proteomics normalization preprocessing massspectrometry qualitycontrol go dataimport prostar1

17.7 match 2 stars 5.42 score 22 scripts 1 dependents

ohdsi

FeatureExtraction:Generating Features for a Cohort

An R interface for generating features for a cohort using data in the Common Data Model. Features can be constructed using default or custom made feature definitions. Furthermore it's possible to aggregate features and get the summary statistics.

Maintained by Ger Inberg. Last updated 5 months ago.

hades openjdk

9.1 match 62 stars 10.55 score 209 scripts 2 dependents

framverse

TAMMsupport:Streamline working with Terminal Area Management Modules

A convenient tool for interfacing with Terminal Area Manamagement Modules (TAMMs) in R environments.

Maintained by Collin Edwards. Last updated 2 months ago.

fram

28.6 match 3 stars 3.35 score 5 scripts

bioc

scQTLtools:An R package for single-cell eQTL analysis and visualization

This package specializes in analyzing and visualizing eQTL at the single-cell level. It can read gene expression matrices or Seurat data, or SingleCellExperiment object along with genotype data. It offers a function for cis-eQTL analysis to detect eQTL within a given range, and another function to fit models with three methods. Using this package, users can also generate single-cell level visualization result.

Maintained by Xiaofeng Wu. Last updated 2 months ago.

software geneexpression geneticvariability snp differentialexpression genomicvariation variantdetection genetics functionalgenomics systemsbiology regression singlecell normalization visualization rna-seq sc-eqtl

19.3 match 3 stars 4.95 score

sdctools

sdcMicro:Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation

Data from statistical agencies and other institutions are mostly confidential. This package, introduced in Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v067.i04>, can be used for the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files. The theoretical basis for the methods implemented can be found in Templ (2017) <doi:10.1007/978-3-319-50272-4>. Various risk estimation and anonymization methods are included. Note that the package includes a graphical user interface published in Meindl and Templ (2019) <doi:10.3390/a12090191> that allows to use various methods of this package.

Maintained by Matthias Templ. Last updated 29 days ago.

cpp

9.6 match 83 stars 9.89 score 258 scripts

ncss-tech

soilDB:Soil Database Interface

A collection of functions for reading soil data from U.S. Department of Agriculture Natural Resources Conservation Service (USDA-NRCS) and National Cooperative Soil Survey (NCSS) databases.

Maintained by Andrew Brown. Last updated 9 days ago.

kssl nasis nrcs soil soil-data-access soil-survey soilweb sql usda

8.4 match 87 stars 11.34 score 1.0k scripts 1 dependents

bioc

TFutils:TFutils

This package helps users to work with TF metadata from various sources. Significant catalogs of TFs and classifications thereof are made available. Tools for working with motif scans are also provided.

Maintained by Vincent Carey. Last updated 4 months ago.

transcriptomics

19.7 match 4.80 score 21 scripts

nlsy-links

NlsyLinks:Utilities and Kinship Information for Research with the NLSY

Utilities and kinship information for behavior genetics and developmental research using the National Longitudinal Survey of Youth (NLSY; <https://www.nlsinfo.org/>).

Maintained by S. Mason Garrison. Last updated 10 days ago.

behavior-genetics kinship-information national-longitudinal-survey nlsy

12.5 match 7 stars 7.49 score 185 scripts

stscl

sshicm:Information Consistency-Based Measures for Spatial Stratified Heterogeneity

Spatial stratified heterogeneity (SSH) denotes the coexistence of within-strata homogeneity and between-strata heterogeneity. Information consistency-based methods provide a rigorous approach to quantify SSH and evaluate its role in spatial processes, grounded in principles of geographical stratification and information theory (Bai, H. et al. (2023) <doi:10.1080/24694452.2023.2223700>; Wang, J. et al. (2024) <doi:10.1080/24694452.2023.2289982>).

Maintained by Wenbo Lv. Last updated 3 months ago.

geoinformatics geospatial-analysis information-theory spatial-statistics spatial-stratified-heterogeneity cpp

20.2 match 3 stars 4.65 score 2 scripts

bioc

xcms:LC-MS and GC-MS Data Analysis

Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.

Maintained by Steffen Neumann. Last updated 4 days ago.

immunooncology massspectrometry metabolomics bioconductor feature-detection mass-spectrometry peak-detection cpp

6.5 match 196 stars 14.31 score 984 scripts 11 dependents

doi-usgs

EGRET:Exploration and Graphics for RivEr Trends

Statistics and graphics for streamflow history, water quality trends, and the statistical modeling algorithm: Weighted Regressions on Time, Discharge, and Season (WRTDS).

Maintained by Laura DeCicco. Last updated 4 months ago.

usgs water-quality water-quality-data

8.7 match 90 stars 10.72 score 362 scripts 1 dependents

paul-buerkner

brms:Bayesian Regression Models using 'Stan'

Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.

Maintained by Paul-Christian Bürkner. Last updated 5 days ago.

bayesian-inference brms multilevel-models stan statistical-models

5.6 match 1.3k stars 16.61 score 13k scripts 34 dependents

overton-group

eHDPrep:Quality Control and Semantic Enrichment of Datasets

A tool for the preparation and enrichment of health datasets for analysis (Toner et al. (2023) <doi:10.1093/gigascience/giad030>). Provides functionality for assessing data quality and for improving the reliability and machine interpretability of a dataset. 'eHDPrep' also enables semantic enrichment of a dataset where metavariables are discovered from the relationships between input variables determined from user-provided ontologies.

Maintained by Ian Overton. Last updated 2 years ago.

data-quality health-informatics semantic-enrichment

19.0 match 8 stars 4.90 score 10 scripts

atlasoflivingaustralia

galah:Biodiversity Data from the GBIF Node Network

The Global Biodiversity Information Facility ('GBIF', <https://www.gbif.org>) sources data from an international network of data providers, known as 'nodes'. Several of these nodes - the "living atlases" (<https://living-atlases.gbif.org>) - maintain their own web services using software originally developed by the Atlas of Living Australia ('ALA', <https://www.ala.org.au>). 'galah' enables the R community to directly access data and resources hosted by 'GBIF' and its partner nodes.

Maintained by Martin Westgate. Last updated 1 months ago.

10.1 match 43 stars 9.17 score 275 scripts 1 dependents

njbultman

sleeperapi:Wrapper Functions Around 'Sleeper' (Fantasy Sports) API

For those wishing to interact with the 'Sleeper' (Fantasy Sports) API (<https://docs.sleeper.com/>) without looking too much into its documentation (found at <https://docs.sleeper.com/>), this package offers wrapper functions around the available API calls to make it easier.

Maintained by Nick Bultman. Last updated 4 months ago.

20.2 match 6 stars 4.59 score 8 scripts

heike

x3ptools:Tools for Working with 3D Surface Measurements

The x3p file format is specified in ISO standard 5436:2000 to describe 3d surface measurements. 'x3ptools' allows reading, writing and basic modifications to the 3D surface measurements.

Maintained by Heike Hofmann. Last updated 6 months ago.

12.4 match 8 stars 7.48 score 281 scripts 2 dependents

stscl

gdverse:Analysis of Spatial Stratified Heterogeneity

Analyzing spatial factors and exploring spatial associations based on the concept of spatial stratified heterogeneity, while also taking into account local spatial dependencies, spatial interpretability, complex spatial interactions, and robust spatial stratification. Additionally, it supports the spatial stratified heterogeneity family established in academic literature.

Maintained by Wenbo Lv. Last updated 4 days ago.

geographical-detector geoinformatics geospatial-analysis spatial-statistics spatial-stratified-heterogeneity cpp

10.2 match 32 stars 9.07 score 41 scripts 2 dependents

bioc

flowSpecs:Tools for processing of high-dimensional cytometry data

This package is intended to fill the role of conventional cytometry pre-processing software, for spectral decomposition, transformation, visualization and cleanup, and to aid further downstream analyses, such as with DepecheR, by enabling transformation of flowFrames and flowSets to dataframes. Functions for flowCore-compliant automatic 1D-gating/filtering are in the pipe line. The package name has been chosen both as it will deal with spectral cytometry and as it will hopefully give the user a nice pair of spectacles through which to view their data.

Maintained by Jakob Theorell. Last updated 5 months ago.

software cellbasedassays datarepresentation immunooncology flowcytometry singlecell visualization normalization dataimport

21.0 match 6 stars 4.38 score 7 scripts

mlr-org

mlr3filters:Filter Based Feature Selection for 'mlr3'

Extends 'mlr3' with filter methods for feature selection. Besides standalone filter methods built-in methods of any machine-learning algorithm are supported. Partial scoring of multivariate filter methods is supported.

Maintained by Marc Becker. Last updated 4 months ago.

feature-selection filter filters mlr mlr3 variable-importance

11.0 match 20 stars 8.37 score 95 scripts 3 dependents

posit-dev

connectapi:Utilities for Interacting with the 'Posit Connect' Server API

Provides a helpful 'R6' class and methods for interacting with the 'Posit Connect' Server API along with some meaningful utility functions for regular tasks. API documentation varies by 'Posit Connect' installation and version, but the latest documentation is also hosted publicly at <https://docs.posit.co/connect/api/>.

Maintained by Toph Allen. Last updated 5 days ago.

api-client rstudio-connect

8.8 match 47 stars 10.48 score 252 scripts 1 dependents

gorelab

waves:Vis-NIR Spectral Analysis Wrapper

Originally designed application in the context of resource-limited plant research and breeding programs, 'waves' provides an open-source solution to spectral data processing and model development by bringing useful packages together into a streamlined pipeline. This package is wrapper for functions related to the analysis of point visible and near-infrared reflectance measurements. It includes visualization, filtering, aggregation, preprocessing, cross-validation set formation, model training, and prediction functions to enable open-source association of spectral and reference data. This package is documented in a peer-reviewed manuscript in the Plant Phenome Journal <doi:10.1002/ppj2.20012>. Specialized cross-validation schemes are described in detail in Jarquín et al. (2017) <doi:10.3835/plantgenome2016.12.0130>. Example data is from Ikeogu et al. (2017) <doi:10.1371/journal.pone.0188918>.

Maintained by Jenna Hershberger. Last updated 11 months ago.

15.3 match 6 stars 5.98 score 53 scripts

ganglilab

geneset:Get Gene Sets for Gene Enrichment Analysis

Gene sets are fundamental for gene enrichment analysis. The package 'geneset' enables querying gene sets from public databases including 'GO' (Gene Ontology Consortium. (2004) <doi:10.1093/nar/gkh036>), 'KEGG' (Minoru et al. (2000) <doi:10.1093/nar/28.1.27>), 'WikiPathway' (Marvin et al. (2020) <doi:10.1093/nar/gkaa1024>), 'MsigDb' (Arthur et al. (2015) <doi:10.1016/j.cels.2015.12.004>), 'Reactome' (David et al. (2011) <doi:10.1093/nar/gkq1018>), 'MeSH' (Ish et al. (2014) <doi:10.4103/0019-5413.139827>), 'DisGeNET' (Janet et al. (2017) <doi:10.1093/nar/gkw943>), 'Disease Ontology' (Lynn et al. (2011) <doi:10.1093/nar/gkr972>), 'Network of Cancer Genes' (Dimitra et al. (2019) <doi:10.1186/s13059-018-1612-0>) and 'COVID-19' (Maxim et al. (2020) <doi:10.21203/rs.3.rs-28582/v1>). Gene sets are stored in the list object which provides data frame of 'geneset' and 'geneset_name'. The 'geneset' has two columns of term ID and gene ID. The 'geneset_name' has two columns of terms ID and term description.

Maintained by Yunze Liu. Last updated 2 years ago.

enrichment-analysis gene geneset-enrichment

19.3 match 9 stars 4.75 score 21 scripts 2 dependents

bioc

rhdf5:R Interface to HDF5

This package provides an interface between HDF5 and R. HDF5's main features are the ability to store and access very large and/or complex datasets and a wide variety of metadata on mass storage (disk) through a completely portable file format. The rhdf5 package is thus suited for the exchange of large and/or complex datasets between R and other software package, and for letting R applications work on datasets that are larger than the available RAM.

Maintained by Mike Smith. Last updated 2 months ago.

infrastructure dataimport hdf5 rhdf5 openssl curl zlib cpp

5.7 match 62 stars 15.93 score 4.2k scripts 232 dependents

yufree

enviGCMS:GC/LC-MS Data Analysis for Environmental Science

Gas/Liquid Chromatography-Mass Spectrometer(GC/LC-MS) Data Analysis for Environmental Science. This package covered topics such molecular isotope ratio, matrix effects and Short-Chain Chlorinated Paraffins analysis etc. in environmental analysis.

Maintained by Miao YU. Last updated 2 months ago.

environment mass-spectrometry metabolomics

13.9 match 17 stars 6.49 score 30 scripts 1 dependents

bioc

lumi:BeadArray Specific Methods for Illumina Methylation and Expression Microarrays

The lumi package provides an integrated solution for the Illumina microarray data analysis. It includes functions of Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes the functions of processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.

Maintained by Lei Huang. Last updated 5 months ago.

microarray onechannel preprocessing dnamethylation qualitycontrol twochannel

14.3 match 6.27 score 294 scripts 5 dependents

moosa-r

rbioapi:User-Friendly R Interface to Biologic Web Services' API

Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services. In a way that insulates the user from the technicalities of using web services API and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; New databases and services will be added periodically. Feel free to suggest any databases or services you often use.

Maintained by Moosa Rezwani. Last updated 1 months ago.

api-client bioinformatics biology enrichment enrichment-analysis enrichr jaspar mieaa over-representation-analysis panther reactome string uniprot

11.8 match 20 stars 7.60 score 55 scripts

glenndavis52

colorSpec:Color Calculations with Emphasis on Spectral Data

Calculate with spectral properties of light sources, materials, cameras, eyes, and scanners. Build complex systems from simpler parts using a spectral product algebra. For light sources, compute CCT, CRI, SSI, and IES TM-30 reports. For object colors, compute optimal colors and Logvinenko coordinates. Work with the standard CIE illuminants and color matching functions, and read spectra from text files, including CGATS files. Estimate a spectrum from its response. A user guide and 9 vignettes are included.

Maintained by Glenn Davis. Last updated 1 months ago.

14.0 match 2 stars 6.34 score 73 scripts 5 dependents

stan-dev

rstanarm:Bayesian Applied Regression Modeling via Stan

Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.

Maintained by Ben Goodrich. Last updated 9 months ago.

bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics multilevel-models rstan rstanarm stan statistical-modeling cpp

5.7 match 393 stars 15.68 score 5.0k scripts 13 dependents

bioc

Pedixplorer:Pedigree Functions

Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.

Maintained by Louis Le Nezet. Last updated 3 days ago.

software datarepresentation genetics graphandnetwork visualization kinship pedigree

14.6 match 2 stars 6.08 score 10 scripts

bioc

metaMS:MS-based metabolomics annotation pipeline

MS-based metabolomics data processing and compound annotation pipeline.

Maintained by Yann Guitton. Last updated 5 months ago.

immunooncology massspectrometry metabolomics

11.8 match 15 stars 7.50 score 15 scripts

bioc

psichomics:Graphical Interface for Alternative Splicing Quantification, Analysis and Visualisation

Interactive R package with an intuitive Shiny-based graphical interface for alternative splicing quantification and integrative analyses of alternative splicing and gene expression based on The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression project (GTEx), Sequence Read Archive (SRA) and user-provided data. The tool interactively performs survival, dimensionality reduction and median- and variance-based differential splicing and gene expression analyses that benefit from the incorporation of clinical and molecular sample-associated features (such as tumour stage or survival). Interactive visual access to genomic mapping and functional annotation of selected alternative splicing events is also included.

Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.

sequencing rnaseq alternativesplicing differentialsplicing transcription gui principalcomponent survival biomedicalinformatics transcriptomics immunooncology visualization multiplecomparison geneexpression differentialexpression alternative-splicing bioconductor data-analyses differential-gene-expression differential-splicing-analysis gene-expression gtex recount2 rna-seq-data splicing-quantification sra tcga vast-tools cpp

12.7 match 36 stars 6.95 score 31 scripts

ropensci

GSODR:Global Surface Summary of the Day ('GSOD') Weather Data Client

Provides automated downloading, parsing, cleaning, unit conversion and formatting of Global Surface Summary of the Day ('GSOD') weather data from the from the USA National Centers for Environmental Information ('NCEI'). Units are converted from from United States Customary System ('USCS') units to International System of Units ('SI'). Stations may be individually checked for number of missing days defined by the user, where stations with too many missing observations are omitted. Only stations with valid reported latitude and longitude values are permitted in the final data. Additional useful elements, saturation vapour pressure ('es'), actual vapour pressure ('ea') and relative humidity ('RH') are calculated from the original data using the improved August-Roche-Magnus approximation (Alduchov & Eskridge 1996) and included in the final data set. The resulting metadata include station identification information, country, state, latitude, longitude, elevation, weather observations and associated flags. For information on the 'GSOD' data from 'NCEI', please see the 'GSOD' 'readme.txt' file available from, <https://www1.ncdc.noaa.gov/pub/data/gsod/readme.txt>.

Maintained by Adam H. Sparks. Last updated 13 days ago.

us-ncei meteorological-data global-weather weather weather-data meteorology station-data surface-weather data-access us-ncdc daily-data daily-weather global-data gsod historical-data historical-weather ncdc ncei weather-information weather-stations

10.1 match 94 stars 8.70 score 116 scripts

andrewhooker

PopED:Population (and Individual) Optimal Experimental Design

Optimal experimental designs for both population and individual studies based on nonlinear mixed-effect models. Often this is based on a computation of the Fisher Information Matrix. This package was developed for pharmacometric problems, and examples and predefined models are available for these types of systems. The methods are described in Nyberg et al. (2012) <doi:10.1016/j.cmpb.2012.05.005>, and Foracchia et al. (2004) <doi:10.1016/S0169-2607(03)00073-7>.

Maintained by Andrew C. Hooker. Last updated 5 months ago.

nlme optimal-design pharmacodynamics pharmacokinetics pharmacometrics pkpd population population-model

9.2 match 33 stars 9.58 score 300 scripts 1 dependents

jfieberg

SightabilityModel:Wildlife Sightability Modeling

Uses logistic regression to model the probability of detection as a function of covariates. This model is then used with observational survey data to estimate population size, while accounting for uncertain detection. See Steinhorst and Samuel (1989).

Maintained by Schwarz Carl James. Last updated 2 years ago.

17.5 match 1 stars 4.96 score 23 scripts

leonawicz

tabr:Music Notation Syntax, Manipulation, Analysis and Transcription in R

Provides a music notation syntax and a collection of music programming functions for generating, manipulating, organizing, and analyzing musical information in R. Music syntax can be entered directly in character strings, for example to quickly transcribe short pieces of music. The package contains functions for directly performing various mathematical, logical and organizational operations and musical transformations on special object classes that facilitate working with music data and notation. The same music data can be organized in tidy data frames for a familiar and powerful approach to the analysis of large amounts of structured music data. Functions are available for mapping seamlessly between these formats and their representations of musical information. The package also provides an API to 'LilyPond' (<https://lilypond.org/>) for transcribing musical representations in R into tablature ("tabs") and sheet music. 'LilyPond' is open source music engraving software for generating high quality sheet music based on markup syntax. The package generates 'LilyPond' files from R code and can pass them to the 'LilyPond' command line interface to be rendered into sheet music PDF files or inserted into R markdown documents. The package offers nominal MIDI file output support in conjunction with rendering sheet music. The package can read MIDI files and attempts to structure the MIDI data to integrate as best as possible with the data structures and functionality found throughout the package.

Maintained by Matthew Leonawicz. Last updated 6 months ago.

guitar-tablature lilypond lilypond-api music-analysis music-data music-notation music-programming music-syntax music-transcription sheet-music

11.0 match 132 stars 7.87 score 94 scripts

muschellij2

rscopus:Scopus Database 'API' Interface

Uses Elsevier 'Scopus' API <https://dev.elsevier.com/sc_apis.html> to download information about authors and their citations.

Maintained by John Muschelli. Last updated 1 years ago.

bibliometrics scopus scopus-api

9.3 match 77 stars 9.33 score 124 scripts 3 dependents

yulab-smu

TDbook:Companion Package for the Book "Data Integration, Manipulation and Visualization of Phylogenetic Trees" by Guangchuang Yu (2022, ISBN:9781032233574, doi:10.1201/9781003279242)

The companion package that provides all the datasets used in the book "Data Integration, Manipulation and Visualization of Phylogenetic Trees" by Guangchuang Yu (2022, ISBN:9781032233574, doi:10.1201/9781003279242).

Maintained by Guangchuang Yu. Last updated 3 years ago.

17.6 match 13 stars 4.88 score 59 scripts

rcppcore

Rcpp:Seamless R and C++ Integration

The 'Rcpp' package provides R functions as well as C++ classes which offer a seamless integration of R and C++. Many R data types and objects can be mapped back and forth to C++ equivalents which facilitates both writing of new code as well as easier integration of third-party libraries. Documentation about 'Rcpp' is provided by several vignettes included in this package, via the 'Rcpp Gallery' site at <https://gallery.rcpp.org>, the paper by Eddelbuettel and Francois (2011, <doi:10.18637/jss.v040.i08>), the book by Eddelbuettel (2013, <doi:10.1007/978-1-4614-6868-4>) and the paper by Eddelbuettel and Balamuta (2018, <doi:10.1080/00031305.2017.1375990>); see 'citation("Rcpp")' for details.

Maintained by Dirk Eddelbuettel. Last updated 3 days ago.

c-plus-plus c-plus-plus-11 c-plus-plus-14 c-plus-plus-17 c-plus-plus-20 rcpp cpp

3.8 match 753 stars 22.59 score 11k scripts 13k dependents

monty-se

PINstimation:Estimation of the Probability of Informed Trading

A comprehensive bundle of utilities for the estimation of probability of informed trading models: original PIN in Easley and O'Hara (1992) and Easley et al. (1996); Multilayer PIN (MPIN) in Ersan (2016); Adjusted PIN (AdjPIN) in Duarte and Young (2009); and volume-synchronized PIN (VPIN) in Easley et al. (2011, 2012). Implementations of various estimation methods suggested in the literature are included. Additional compelling features comprise posterior probabilities, an implementation of an expectation-maximization (EM) algorithm, and PIN decomposition into layers, and into bad/good components. Versatile data simulation tools, and trade classification algorithms are among the supplementary utilities. The package provides fast, compact, and precise utilities to tackle the sophisticated, error-prone, and time-consuming estimation procedure of informed trading, and this solely using the raw trade-level data.

Maintained by Montasser Ghachem. Last updated 5 months ago.

clustering-analysis expectation-maximisation-algorithm hierarchical-clustering information-asymmetry market-microstructure maximum-likelihood-estimation mixture-distributions poisson-distribution

13.1 match 36 stars 6.48 score 14 scripts

dovinij

GxEprs:Genotype-by-Environment Interaction in Polygenic Score Models

A novel PRS model is introduced to enhance the prediction accuracy by utilising GxE effects. This package performs Genome Wide Association Studies (GWAS) and Genome Wide Environment Interaction Studies (GWEIS) using a discovery dataset. The package has the ability to obtain polygenic risk scores (PRSs) for a target sample. Finally it predicts the risk values of each individual in the target sample. Users have the choice of using existing models (Li et al., 2015) <doi:10.1093/annonc/mdu565>, (Pandis et al., 2013) <doi:10.1093/ejo/cjt054>, (Peyrot et al., 2018) <doi:10.1016/j.biopsych.2017.09.009> and (Song et al., 2022) <doi:10.1038/s41467-022-32407-9>, as well as newly proposed models for genomic risk prediction (refer to the URL for more details).

Maintained by Dovini Jayasinghe. Last updated 10 months ago.

25.7 match 2 stars 3.30 score

ropensci

RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management

Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.

Maintained by Mathew W. McLean. Last updated 4 months ago.

peer-reviewed

7.0 match 115 stars 12.06 score 2.3k scripts 16 dependents

bioc

CluMSID:Clustering of MS2 Spectra for Metabolite Identification

CluMSID is a tool that aids the identification of features in untargeted LC-MS/MS analysis by the use of MS2 spectra similarity and unsupervised statistical methods. It offers functions for a complete and customisable workflow from raw data to visualisations and is interfaceable with the xmcs family of preprocessing packages.

Maintained by Tobias Depke. Last updated 5 months ago.

metabolomics preprocessing clustering

14.0 match 10 stars 6.04 score 22 scripts

ropensci

ritis:Integrated Taxonomic Information System Client

An interface to the Integrated Taxonomic Information System ('ITIS') (<https://www.itis.gov>). Includes functions to work with the 'ITIS' REST API methods (<https://www.itis.gov/ws_description.html>), as well as the 'Solr' web service (<https://www.itis.gov/solr_documentation.html>).

Maintained by Julia Blum. Last updated 1 months ago.

taxonomy biology nomenclature json api web api-client identifiers species names api-wrapper itis taxize

10.9 match 16 stars 7.72 score 64 scripts 24 dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 1 months ago.

infrastructure bioconductor-package core-package

5.9 match 12 stars 14.22 score 612 scripts 2.2k dependents

sbg

sevenbridges2:The 'Seven Bridges Platform' API Client

R client and utilities for 'Seven Bridges Platform' API, from 'Cancer Genomics Cloud' to other 'Seven Bridges' supported platforms. API documentation is hosted publicly at <https://docs.sevenbridges.com/docs/the-api>.

Maintained by Marko Trifunovic. Last updated 22 days ago.

api-client bioinformatics cloud sevenbridges

14.0 match 2 stars 5.90 score 4 scripts

billdenney

PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis

Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.

Maintained by Bill Denney. Last updated 18 days ago.

nca noncompartmental-analysis pharmacokinetics

6.6 match 73 stars 12.61 score 214 scripts 4 dependents

braverock

PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis

Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.

Maintained by Brian G. Peterson. Last updated 3 months ago.

5.2 match 222 stars 15.93 score 4.8k scripts 20 dependents

bioc

lipidr:Data Mining and Analysis of Lipidomics Datasets

lipidr an easy-to-use R package implementing a complete workflow for downstream analysis of targeted and untargeted lipidomics data. lipidomics results can be imported into lipidr as a numerical matrix or a Skyline export, allowing integration into current analysis frameworks. Data mining of lipidomics datasets is enabled through integration with Metabolomics Workbench API. lipidr allows data inspection, normalization, univariate and multivariate analysis, displaying informative visualizations. lipidr also implements a novel Lipid Set Enrichment Analysis (LSEA), harnessing molecular information such as lipid class, total chain length and unsaturation.

Maintained by Ahmed Mohamed. Last updated 5 months ago.

lipidomics massspectrometry normalization qualitycontrol visualization bioconductor

11.1 match 29 stars 7.44 score 40 scripts

difm-brain

ofpetrial:Design on-Farm Precision Field Agronomic Trials

A comprehensive system for designing and implementing on-farm precision field agronomic trials. You provide field data, tell 'ofpetrial' how to design a trial, and get readily-usable trial design files and a report checks the validity and reliability of the trial design.

Maintained by Taro Mieno. Last updated 3 months ago.

15.0 match 3 stars 5.49 score 23 scripts

lightbluetitan

ColombiAPI:Access Colombia's Public Data via API-Colombia

ColombiAPI provides a seamless interface to access diverse public data about Colombia through the API Colombia, a RESTful API. The package enables users to explore various aspects of Colombia, including general information, geography, and cultural insights. It includes five API-related functions to retrieve data on topics such as Colombia's general information, airports, departments, regions, and presidents. Additionally, ColombiAPI offers a built-in function to view the datasets available within the package. The package also includes curated datasets covering Bogota air stations, business and holiday dates, public schools, Colombian coffee exports, cannabis licenses, Medellin rainfall, and malls in Bogota, making it a comprehensive tool for exploring Colombia's data.

Maintained by Renzo Caceres Rossi. Last updated 2 months ago.

18.5 match 5 stars 4.40 score

bioc

Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery

A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.

Maintained by Nan Xiao. Last updated 5 months ago.

software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences

10.4 match 37 stars 7.81 score 29 scripts

nikolaus77

rocker:Database Interface Class

'R6' class interface for handling relational database connections using 'DBI' package as backend. The class allows handling of connections to e.g. PostgreSQL, MariaDB and SQLite. The purpose is having an intuitive object allowing straightforward handling of SQL databases.

Maintained by Nikolaus Pawlowski. Last updated 3 years ago.

database dbi mariadb mysql postgres postgresql r6 sql sqlite

15.5 match 5 stars 5.24 score 7 scripts

bioc

ANCOMBC:Microbiome differential abudance and correlation analyses with bias correction

ANCOMBC is a package containing differential abundance (DA) and correlation analyses for microbiome data. Specifically, the package includes Analysis of Compositions of Microbiomes with Bias Correction 2 (ANCOM-BC2), Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC), and Analysis of Composition of Microbiomes (ANCOM) for DA analysis, and Sparse Estimation of Correlations among Microbiomes (SECOM) for correlation analysis. Microbiome data are typically subject to two sources of biases: unequal sampling fractions (sample-specific biases) and differential sequencing efficiencies (taxon-specific biases). Methodologies included in the ANCOMBC package are designed to correct these biases and construct statistically consistent estimators.

Maintained by Huang Lin. Last updated 2 days ago.

differentialexpression microbiome normalization sequencing software ancom ancombc ancombc2 correlation differential-abundance-analysis secom

7.5 match 120 stars 10.79 score 406 scripts 1 dependents

rstudio

shiny:Web Application Framework for R

Makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive prebuilt widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.

Maintained by Winston Chang. Last updated 15 days ago.

reactive rstudio shiny web-app web-development

3.8 match 5.4k stars 21.28 score 108k scripts 1.8k dependents

bioc

RCy3:Functions to Access and Control Cytoscape

Vizualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.

Maintained by Alex Pico. Last updated 5 months ago.

visualization graphandnetwork thirdpartyclient network

6.0 match 52 stars 13.39 score 628 scripts 15 dependents

bioc

clippda:A package for the clinical proteomic profiling data analysis

Methods for the nalysis of data from clinical proteomic profiling studies. The focus is on the studies of human subjects, which are often observational case-control by design and have technical replicates. A method for sample size determination for planning these studies is proposed. It incorporates routines for adjusting for the expected heterogeneities and imbalances in the data and the within-sample replicate correlations.

Maintained by Stephen Nyangoma. Last updated 5 months ago.

proteomics onechannel preprocessing differentialexpression multiplecomparison

24.3 match 3.30 score 2 scripts

hannahcomiskey

mcmsupply:Estimating Public and Private Sector Contraceptive Market Supply Shares

Family Planning programs and initiatives typically use nationally representative surveys to estimate key indicators of a country’s family planning progress. However, in recent years, routinely collected family planning services data (Service Statistics) have been used as a supplementary data source to bridge gaps in the surveys. The use of service statistics comes with the caveat that adjustments need to be made for missing private sector contributions to the contraceptive method supply chain. Evaluating the supply source of modern contraceptives often relies on Demographic Health Surveys (DHS), where many countries do not have recent data beyond 2015/16. Fortunately, in the absence of recent surveys we can rely on statistical model-based estimates and projections to fill the knowledge gap. We present a Bayesian, hierarchical, penalized-spline model with multivariate-normal spline coefficients, to account for across method correlations, to produce country-specific,annual estimates for the proportion of modern contraceptive methods coming from the public and private sectors. This package provides a quick and convenient way for users to access the DHS modern contraceptive supply share data at national and subnational administration levels, estimate, evaluate and plot annual estimates with uncertainty for a sample of low- and middle-income countries. Methods for the estimation of method supply shares at the national level are described in Comiskey, Alkema, Cahill (2022) <arXiv:2212.03844>.

Maintained by Hannah Comiskey. Last updated 12 months ago.

jags cpp

15.6 match 2 stars 5.15 score 20 scripts