R-universe search: aml

dankelley

oce:Analysis of Oceanographic Data

Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.

Maintained by Dan Kelley. Last updated 4 days ago.

oceanography fortran cpp

7.3 match 146 stars 15.42 score 4.2k scripts 18 dependents

tamu-aml

DSWE:Data Science for Wind Energy

Data science methods used in wind energy applications. Current functionalities include creating a multi-dimensional power curve model, performing power curve function comparison, covariate matching, and energy decomposition. Relevant works for the developed functions are: funGP() - Prakash et al. (2022) <doi:10.1080/00401706.2021.1905073>, AMK() - Lee et al. (2015) <doi:10.1080/01621459.2014.977385>, tempGP() - Prakash et al. (2022) <doi:10.1080/00401706.2022.2069158>, ComparePCurve() - Ding et al. (2021) <doi:10.1016/j.renene.2021.02.136>, deltaEnergy() - Latiffianti et al. (2022) <doi:10.1002/we.2722>, syncSize() - Latiffianti et al. (2022) <doi:10.1002/we.2722>, imptPower() - Latiffianti et al. (2022) <doi:10.1002/we.2722>, All other functions - Ding (2019, ISBN:9780429956508).

Maintained by Yu Ding. Last updated 1 year ago.

openblas cpp

15.0 match 11 stars 4.22 score

xueyuancao

GSDA:Gene Set Distance Analysis (GSDA)

The gene-set distance analysis of omic data is implemented by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables.

Maintained by Xueyuan Cao. Last updated 4 years ago.

microarray bioinformatics gene expression

10.5 match 1 star 4.30 score 8 scripts

eonurk

seAMLess:A Single Cell Transcriptomics Based Deconvolution Pipeline for Leukemia

Given a bulk transcriptomic (RNA-seq) sample of a Myeloid Leukemia patient, calculates immune composition and drug resistance for different small-molecule inhibitors. Published in <https://www.nature.com/articles/s41698-024-00596-9>.

Maintained by E Onur Karakaslar. Last updated 3 months ago.

aml deconvolution venetoclax

11.0 match 2 stars 3.60 score 7 scripts

kelliejarcher

hdcuremodels:Penalized Mixture Cure Models for High-Dimensional Data

Provides functions for fitting various penalized parametric and semi-parametric mixture cure models with different penalty functions, testing for a significant cure fraction, and testing for sufficient follow-up as described in Fu et al. (2022) <doi:10.1002/sim.9513> and Archer et al. (2024) <doi:10.1186/s13045-024-01553-6>. False discovery rate controlled variable selection is provided using model-X knock-offs.

Maintained by Kellie J. Archer. Last updated 8 days ago.

8.0 match 4.40 score 5 scripts

julianfaraway

faraway:Datasets and Functions for Books by Julian Faraway

Books are "Linear Models with R" (1st Ed. August 2004, 2nd Ed. July 2014, 3rd Ed. February 2025, CRC Press, ISBN 9781439887332), "Extending the Linear Model with R" (1st Ed. December 2005, 2nd Ed. March 2016, CRC Press, ISBN 9781584884248), and "Practical Regression and ANOVA in R", contributed documentation on CRAN (now very dated).

Maintained by Julian Faraway. Last updated 1 month ago.

data

3.5 match 29 stars 9.43 score 1.7k scripts 1 dependents

mmaechler

supclust:Supervised Clustering of Predictor Variables Such as Genes

Methodology for supervised grouping, aka "clustering", of potentially many predictor variables, such as genes, implementing the algorithms 'PELORA' and 'WILMA'.

Maintained by Martin Maechler. Last updated 7 months ago.

openblas

7.0 match 2 stars 4.15 score 28 scripts

natydasilva

PPforest:Projection Pursuit Classification Forest

Implements projection pursuit forest algorithm for supervised classification.

Maintained by Natalia da Silva. Last updated 9 months ago.

openblas cpp

4.6 match 18 stars 5.53 score 19 scripts

uclouvain-cbio

scpdata:Single-Cell Proteomics Data Package

The package disseminates mass spectrometry (MS)-based single-cell proteomics (SCP) datasets. The data were collected from published work and formatted using the `scp` data structure. The data sets contain quantitative information at spectrum, peptide and/or protein level for single cells or minute sample amounts.

Maintained by Christophe Vanderaa. Last updated 12 days ago.

experimentdata expressiondata experimenthub reproducibleresearch massspectrometrydata proteome singlecelldata packagetypedata

3.3 match 6 stars 5.58 score 16 scripts

bioc

CMA:Synthesis of microarray-based classification

This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.

Maintained by Roman Hornung. Last updated 5 months ago.

classification decisiontree

3.5 match 5.09 score 61 scripts

bioc

Pigengene:Infers biological signatures from gene expression data

Pigengene package provides an efficient way to infer biological signatures from gene expression profiles. The signatures are independent from the underlying platform, e.g., the input can be microarray or RNA Seq data. It can even infer the signatures using data from one platform, and evaluate them on the other. Pigengene identifies the modules (clusters) of highly coexpressed genes using coexpression network analysis, summarizes the biological information of each module in an eigengene, learns a Bayesian network that models the probabilistic dependencies between modules, and builds a decision tree based on the expression of eigengenes.

Maintained by Habil Zare. Last updated 5 months ago.

geneexpression rnaseq networkinference network graphandnetwork biomedicalinformatics systemsbiology transcriptomics classification clustering decisiontree dimensionreduction principalcomponent microarray normalization immunooncology

3.8 match 4.56 score 10 scripts 1 dependents

sidiropoulos

sinaplot:An Enhanced Chart for Simple and Truthful Representation of Single Observations over Multiple Classes

The sinaplot is a data visualization chart suitable for plotting any single variable in a multiclass data set. It is an enhanced jitter strip chart, where the width of the jitter is controlled by the density distribution of the data within each class.

Maintained by Nikos Sidiropoulos. Last updated 8 years ago.

jitter plotting visualization

3.4 match 2 stars 4.76 score 57 scripts

romanhornung

prioritylasso:Analyzing Multiple Omics Data with an Offset Approach

Priority-LASSO (Klau et al., 2018) fits successive Lasso models for several blocks of (omics) data with different priorities and takes the predicted values as an offset for the next block. Also offers options to deal with block-wise missingness in multi-omics data. Reference: Klau, S., Jurinovic, V., Hornung, R., Herold, T., Boulesteix, A.-L. (2018) Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data. BMC Bioinformatics 19:322, <doi:10.1186/s12859-018-2344-6>.

Maintained by Roman Hornung. Last updated 2 years ago.

3.6 match 4.41 score 26 scripts

pbiecek

PBImisc:A Set of Datasets Used in My Classes or in the Book 'Modele Liniowe i Mieszane w R, Wraz z Przykladami w Analizie Danych'

A set of datasets and functions used in the book 'Modele liniowe i mieszane w R, wraz z przykladami w analizie danych'. Datasets either come from real studies or are created to be as similar as possible to real studies.

Maintained by Przemyslaw Biecek. Last updated 8 years ago.

3.6 match 4.00 score 66 scripts 1 dependents

alesmascaro

BCDAG:Bayesian Structure and Causal Learning of Gaussian Directed Graphs

A collection of functions for structure learning of causal networks and estimation of joint causal effects from observational Gaussian data. Main algorithm consists of a Markov chain Monte Carlo scheme for posterior inference of causal structures, parameters and causal effects between variables. References: F. Castelletti and A. Mascaro (2021) <doi:10.1007/s10260-021-00579-1>, F. Castelletti and A. Mascaro (2022) <doi:10.48550/arXiv.2201.12003>.

Maintained by Alessandro Mascaro. Last updated 20 days ago.

3.4 match 3 stars 4.11 score 17 scripts

bioc

CytoDx:Robust prediction of clinical outcomes using cytometry data without cell gating

This package provides functions that predict clinical outcomes using single cell data (such as flow cytometry data, RNA single cell sequencing data) without the requirement of cell gating or clustering.

Maintained by Zicheng Hu. Last updated 5 months ago.

immunooncology cellbiology flowcytometry statisticalmethod software cellbasedassays regression classification survival

3.5 match 4.00 score 8 scripts

bioc

netboost:Network Analysis Supported by Boosting

Boosting supported network analysis for high-dimensional omics applications. This package comes bundled with the MC-UPGMA clustering package by Yaniv Loewenstein.

Maintained by Pascal Schlosser. Last updated 5 months ago.

software statisticalmethod graphandnetwork network clustering dimensionreduction biomedicalinformatics epigenetics metabolomics transcriptomics cpp

3.3 match 4.18 score 1 script

kwstat

agridat:Agricultural Datasets

Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.

Maintained by Kevin Wright. Last updated 1 month ago.

data

1.3 match 126 stars 10.78 score 1.7k scripts 1 dependents

bioc

flowBin:Combining multitube flow cytometry data by binning

Software to combine flow cytometry data that has been multiplexed into multiple tubes with common markers between them, by establishing common bins across tubes in terms of the common markers, then determining expression within each tube for each bin in terms of the tube-specific markers.

Maintained by Kieran ONeill. Last updated 5 months ago.

immunooncology cellbasedassays flowcytometry

3.5 match 3.30 score 2 scripts

afukushima

DiffCorr:Analyzing and Visualizing Differential Correlation Networks in Biological Data

A method for identifying pattern changes between 2 experimental conditions in correlation networks (e.g., gene co-expression networks), which builds on a commonly used association measure, such as Pearson's correlation coefficient. This package includes functions to calculate correlation matrices for high-dimensional datasets and to test differential correlation, meaning changes in the correlation relationship among variables (e.g., genes and metabolites) between 2 experimental conditions.

Maintained by Atsushi Fukushima. Last updated 6 months ago.

1.2 match 5 stars 6.81 score 29 scripts 1 dependents

mattreusswig

reasonabletools:Clean Water Quality Data for NPDES Reasonable Potential Analyses

Functions for cleaning and summarising water quality data for use in National Pollutant Discharge Elimination System (NPDES) permit reasonable potential analyses and water quality-based effluent limitation calculations. Procedures are based on those contained in the "Technical Support Document for Water Quality-based Toxics Control", United States Environmental Protection Agency (1991).

Maintained by Matthew Reusswig. Last updated 4 years ago.

censored-data npdes npdes-permit-development reasonable-potential-analysis water-quality

2.3 match 2.70 score 1 script

handcock

degreenet:Models for Skewed Count Distributions Relevant to Networks

Likelihood-based inference for skewed count distributions, typically of degrees used in network modeling. "degreenet" is a part of the "statnet" suite of packages for network analysis. See Jones and Handcock <doi:10.1098/rspb.2003.2369>.

Maintained by Mark S. Handcock. Last updated 6 months ago.

3.0 match 1 star 1.75 score 28 scripts

marco-bee

FitDynMix:Estimation of Dynamic Mixtures

Estimation of a dynamic lognormal - Generalized Pareto mixture via the Approximate Maximum Likelihood and the Cross-Entropy methods. See Bee, M. (2023) <doi:10.1016/j.csda.2023.107764>.

Maintained by Marco Bee. Last updated 4 months ago.

1.8 match 2.70 score 3 scripts

cran

milorGWAS:Mixed Logistic Regression for Genome-Wide Analysis Studies (GWAS)

Fast approximate methods for mixed logistic regression in genome-wide analysis studies (GWAS). Two computationally efficient methods are proposed for obtaining effect size estimates (beta) in Mixed Logistic Regression in GWAS: the Approximate Maximum Likelihood Estimate (AMLE), and the Offset method. The Wald test obtained with AMLE is identical to the score test. Data can be genotype matrices in plink format, or dosage (VCF files). The methods are described in detail in Milet et al. (2020) <doi:10.1101/2020.01.17.910109>.

Maintained by Hervé Perdry. Last updated 9 months ago.

zlib cpp

0.8 match 2.00 score 8 scripts