R-universe search: needs:plotrix

tagteam

riskRegression:Risk Regression Models and Prediction Scores for Survival Analysis with Competing Risks

Implementation of the following methods for event history analysis. Risk regression models for survival endpoints also in the presence of competing risks are fitted using binomial regression based on a time sequence of binary event status variables. A formula interface for the Fine-Gray regression model and an interface for the combination of cause-specific Cox regression models. A toolbox for assessing and comparing performance of risk predictions (risk markers and risk prediction models). Prediction performance is measured by the Brier score and the area under the ROC curve for binary possibly time-dependent outcome. Inverse probability of censoring weighting and pseudo values are used to deal with right censored data. Lists of risk markers and lists of risk models are assessed simultaneously. Cross-validation repeatedly splits the data, trains the risk prediction models on one part of each split and then summarizes and compares the performance across splits.

Maintained by Thomas Alexander Gerds. Last updated 1 month ago.

openblas cpp

47 stars 13.07 score 736 scripts 37 dependents

bioc

ChIPseeker:ChIPseeker for ChIP peak Annotation, Comparison, and Visualization

This package implements functions to retrieve the nearest genes around the peak, annotate the genomic region of the peak, apply statistical methods for estimating the significance of overlap among ChIP peak data sets, and incorporate the GEO database so that users can compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation chipseq software visualization multiplecomparison atac-seq chip-seq comparison epigenetics epigenomics

233 stars 13.05 score 1.6k scripts 5 dependents

bioc

PharmacoGx:Analysis of Large-Scale Pharmacogenomic Data

Contains a set of functions to perform large-scale analysis of pharmaco-genomic data. These include the PharmacoSet object for storing the results of pharmacogenomic experiments, as well as a number of functions for computing common summaries of drug-dose response and correlating them with the molecular features in a cancer cell-line.

Maintained by Benjamin Haibe-Kains. Last updated 3 months ago.

geneexpression pharmacogenetics pharmacogenomics software classification datasets pharmacogenomic pharmacogx cpp

68 stars 11.39 score 442 scripts 3 dependents

fishr-core-team

FSA:Simple Fisheries Stock Assessment Methods

A variety of simple fish stock assessment methods.

Maintained by Derek H. Ogle. Last updated 2 months ago.

fish fisheries fisheries-management fisheries-stock-assessment population-dynamics stock-assessment

69 stars 11.16 score 1.7k scripts 6 dependents

bioc

genomation:Summary, annotation and visualization of genomic data

A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.

Maintained by Altuna Akalin. Last updated 5 months ago.

annotation sequencing visualization cpgisland cpp

76 stars 11.13 score 738 scripts 5 dependents

bioc

CATALYST:Cytometry dATa anALYSis Tools

CATALYST provides tools for preprocessing of and differential discovery in cytometry data such as FACS, CyTOF, and IMC. Preprocessing includes i) normalization using bead standards, ii) single-cell deconvolution, and iii) bead-based compensation. For differential discovery, the package provides a number of convenient functions for data processing (e.g., clustering, dimension reduction), as well as a suite of visualizations for exploratory data analysis and exploration of results from differential abundance (DA) and state (DS) analysis in order to identify differences in composition and expression profiles at the subpopulation-level, respectively.

Maintained by Helena L. Crowell. Last updated 4 months ago.

clustering dataimport differentialexpression experimentaldesign flowcytometry immunooncology massspectrometry normalization preprocessing singlecell software statisticalmethod visualization

67 stars 10.99 score 362 scripts 2 dependents

bioc

singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data

The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.

Maintained by Joshua David Campbell. Last updated 1 month ago.

singlecell geneexpression differentialexpression alignment clustering immunooncology batcheffect normalization qualitycontrol dataimport gui

182 stars 10.17 score 252 scripts

n8thangreen

BCEA:Bayesian Cost Effectiveness Analysis

Produces an economic evaluation of a sample of suitable variables of cost and effectiveness / utility for two or more interventions, e.g. from a Bayesian model in the form of MCMC simulations. This package computes the most cost-effective alternative and produces graphical summaries and probabilistic sensitivity analysis, see Baio et al (2017) <doi:10.1007/978-3-319-55718-2>.

Maintained by Gianluca Baio. Last updated 2 months ago.

bayesian cost-effectiveness

3 stars 9.90 score 243 scripts 3 dependents

pitakakariki

simr:Power Analysis for Generalised Linear Mixed Models by Simulation

Calculate power for generalised linear mixed models, using simulation. Designed to work with models fit using the 'lme4' package. Described in Green and MacLeod, 2016 <doi:10.1111/2041-210X.12504>.

Maintained by Peter Green. Last updated 2 years ago.

70 stars 9.82 score 756 scripts

trinker

qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis

Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.

Maintained by Tyler Rinker. Last updated 5 years ago.

qdap quantitative-discourse-analysis text-analysis text-mining text-plotting openjdk

176 stars 9.47 score 1.3k scripts 3 dependents

adibender

pammtools:Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis

The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi:10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated data, competing risks, recurrent events and multi-state settings. pammtools provides a tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing as well as visualization.

Maintained by Andreas Bender. Last updated 9 days ago.

additive-models pamm pammtools piece-wise-exponential survival-analysis

48 stars 9.32 score 310 scripts 8 dependents

microsoft

finnts:Microsoft Finance Time Series Forecasting Framework

Automated time series forecasting developed by Microsoft Finance. The Microsoft Finance Time Series Forecasting Framework, aka Finn, can be used to forecast any component of the income statement, balance sheet, or any other area of interest by finance. Finn can be used to forecast any numerical quantity over time. While it can be applied outside of the finance domain, Finn was built to meet the needs of financial analysts to better forecast their businesses within a company, and has a lot of built-in features that are specific to the needs of financial forecasters. Happy forecasting!

Maintained by Mike Tokic. Last updated 1 month ago.

business data-science feature-selection finance finnts forecasting machine-learning microsoft time-series

194 stars 9.30 score 39 scripts

pecanproject

PEcAn.qaqc:QAQC

PEcAn integration and model skill testing

Maintained by David LeBauer. Last updated 1 day ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

216 stars 9.06 score 5 scripts

sachaepskamp

bootnet:Bootstrap Methods for Various Network Estimation Routines

Bootstrap methods to assess accuracy and stability of estimated network structures and centrality indices <doi:10.3758/s13428-017-0862-1>. Allows for flexible specification of any undirected network estimation procedure in R, and offers default sets for various estimation routines.

Maintained by Sacha Epskamp. Last updated 5 months ago.

32 stars 8.94 score 155 scripts 3 dependents

ericpante

marmap:Import, Plot and Analyze Bathymetric and Topographic Data

Import xyz data from the NOAA (National Oceanic and Atmospheric Administration, <https://www.noaa.gov>), GEBCO (General Bathymetric Chart of the Oceans, <https://www.gebco.net>) and other sources, plot xyz data to prepare publication-ready figures, analyze xyz data to extract transects, get depth / altitude based on geographical coordinates, or calculate z-constrained least-cost paths.

Maintained by Benoit Simon-Bouhet. Last updated 9 months ago.

32 stars 8.86 score 524 scripts 1 dependent

jinseob2kim

jsmodule:'RStudio' Addins and 'Shiny' Modules for Medical Research

'RStudio' addins and 'Shiny' modules for descriptive statistics, regression and survival analysis.

Maintained by Jinseob Kim. Last updated 15 days ago.

medical rstudio-addins shiny shiny-modules statistics

21 stars 8.69 score 61 scripts

mobiodiv

mobr:Measurement of Biodiversity

Functions for calculating metrics for the measurement of biodiversity and its changes across scales, treatments, and gradients. The methods implemented in this package are described in: Chase, J.M., et al. (2018) <doi:10.1111/ele.13151>, McGlinn, D.J., et al. (2019) <doi:10.1111/2041-210X.13102>, McGlinn, D.J., et al. (2020) <doi:10.1101/851717>, and McGlinn, D.J., et al. (2023) <doi:10.1101/2023.09.19.558467>.

Maintained by Daniel McGlinn. Last updated 13 days ago.

biodiversity conservation ecology rarefaction species statistics

23 stars 8.65 score 93 scripts

marjoleinf

pre:Prediction Rule Ensembles

Derives prediction rule ensembles (PREs). Largely follows the procedure for deriving PREs as described in Friedman & Popescu (2008; <DOI:10.1214/07-AOAS148>), with adjustments and improvements. The main function pre() derives prediction rule ensembles consisting of rules and/or linear terms for continuous, binary, count, multinomial, and multivariate continuous responses. Function gpe() derives generalized prediction ensembles, consisting of rules, hinge and linear functions of the predictor variables.

Maintained by Marjolein Fokkema. Last updated 10 months ago.

58 stars 8.55 score 98 scripts 1 dependent

kornl

mutoss:Unified Multiple Testing Procedures

Designed to ease the application and comparison of multiple hypothesis testing procedures for FWER, gFWER, FDR and FDX. Methods are standardized and usable by the accompanying 'mutossGUI'.

Maintained by Kornelius Rohmeyer. Last updated 1 year ago.

4 stars 8.50 score 24 scripts 16 dependents

thej022214

hisse:Hidden State Speciation and Extinction

Sets up and executes a HiSSE model (Hidden State Speciation and Extinction) on a phylogeny and character sets to test for hidden shifts in trait dependent rates of diversification. Beaulieu and O'Meara (2016) <doi:10.1093/sysbio/syw022>.

Maintained by Jeremy Beaulieu. Last updated 2 months ago.

6 stars 8.45 score 152 scripts

stephenmilborrow

earth:Multivariate Adaptive Regression Splines

Build regression models using the techniques in Friedman's papers "Fast MARS" and "Multivariate Adaptive Regression Splines" <doi:10.1214/aos/1176347963>. (The term "MARS" is trademarked and thus not used in the name of the package.)

Maintained by Stephen Milborrow. Last updated 6 months ago.

fortran openblas

5 stars 8.40 score 3.9k scripts 26 dependents

modeloriented

survex:Explainable Machine Learning in Survival Analysis

Survival analysis models are commonly used in medicine and other areas. Many of them are too complex to be interpreted by humans. Exploration and explanation are needed, but standard methods do not give a broad enough picture. 'survex' provides easy-to-apply methods for explaining survival models, both complex black-boxes and simpler statistical models. They include methods specific to survival analysis such as SurvSHAP(t) introduced in Krzyzinski et al., (2023) <doi:10.1016/j.knosys.2022.110234>, SurvLIME described in Kovalev et al., (2020) <doi:10.1016/j.knosys.2020.106164> as well as extensions of existing ones described in Biecek et al., (2021) <doi:10.1201/9780429027192>.

Maintained by Mikołaj Spytek. Last updated 10 months ago.

biostatistics brier-scores censored-data cox-model cox-regression explainable-ai explainable-machine-learning explainable-ml explanatory-model-analysis interpretable-machine-learning interpretable-ml machine-learning probabilistic-machine-learning shap survival-analysis time-to-event variable-importance xai

110 stars 8.40 score 114 scripts

muvisu

biplotEZ:EZ-to-Use Biplots

Provides users with an EZ-to-use platform for representing data with biplots. Currently principal component analysis (PCA), canonical variate analysis (CVA) and simple correspondence analysis (CA) biplots are included. This is accompanied by various formatting options for the samples and axes. Alpha-bags and concentration ellipses are included for visual enhancements and interpretation. For an extensive discussion on the topic, see Gower, J.C., Lubbe, S. and le Roux, N.J. (2011, ISBN: 978-0-470-01255-0) Understanding Biplots. Wiley: Chichester.

Maintained by Sugnet Lubbe. Last updated 23 days ago.

fortran

7 stars 8.39 score 30 scripts 1 dependent

bioc

POMA:Tools for Omics Data Analysis

The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. For more details, see https://github.com/pcastellanoescuder/POMAShiny and the paper by Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148>.

Maintained by Pol Castellano-Escuder. Last updated 4 months ago.

batcheffect classification clustering decisiontree dimensionreduction multidimensionalscaling normalization preprocessing principalcomponent regression rnaseq software statisticalmethod visualization bioconductor bioinformatics data-visualization dimension-reduction exploratory-data-analysis machine-learning omics-data-integration pipeline pre-processing statistical-analysis user-friendly workflow

11 stars 8.16 score 20 scripts 1 dependent

deweyme

metap:Meta-Analysis of Significance Values

The canonical way to perform meta-analysis involves using effect sizes. When they are not available this package provides a number of methods for meta-analysis of significance values including the methods of Edgington, Fisher, Lancaster, Stouffer, Tippett, and Wilkinson; a number of data-sets to replicate published results; and routines for graphical display.

Maintained by Michael Dewey. Last updated 18 days ago.

8.08 score 642 scripts 14 dependents

bioc

STRINGdb:STRINGdb - Protein-Protein Interaction Networks and Functional Enrichment Analysis

The STRINGdb package provides an R interface to the STRING protein-protein interactions database (https://string-db.org).

Maintained by Damian Szklarczyk. Last updated 5 months ago.

network

8.08 score 344 scripts 9 dependents

kosukeimai

fastLink:Fast Probabilistic Record Linkage with Missing Data

Implements a Fellegi-Sunter probabilistic record linkage model that allows for missing data and the inclusion of auxiliary information. This includes functionalities to conduct a merge of two datasets under the Fellegi-Sunter model using the Expectation-Maximization algorithm. In addition, tools for preparing, adjusting, and summarizing data merges are included. The package implements methods described in Enamorado, Fifield, and Imai (2019) ''Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records'' <doi:10.1017/S0003055418000783> and is available at <https://imai.fas.harvard.edu/research/linkage.html>.

Maintained by Ted Enamorado. Last updated 1 year ago.

cpp openmp

279 stars 7.98 score 95 scripts 1 dependent

bioc

netZooR:Unified methods for the inference and analysis of gene regulatory networks

netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), and MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.

Maintained by Tara Eicher. Last updated 15 days ago.

networkinference network generegulation geneexpression transcription microarray graphandnetwork gene-regulatory-network transcription-factors

105 stars 7.98 score

pmair78

smacof:Multidimensional Scaling

Implements the following approaches for multidimensional scaling (MDS) based on stress minimization using majorization (smacof): ratio/interval/ordinal/spline MDS on symmetric dissimilarity matrices, MDS with external constraints on the configuration, individual differences scaling (idioscal, indscal), MDS with spherical restrictions, and ratio/interval/ordinal/spline unfolding (circular restrictions, row-conditional). Various tools and extensions like jackknife MDS, bootstrap MDS, permutation tests, MDS biplots, gravity models, unidimensional scaling, drift vectors (asymmetric MDS), classical scaling, and Procrustes are implemented as well.

Maintained by Patrick Mair. Last updated 6 months ago.

5 stars 7.86 score 152 scripts 24 dependents

trnnick

tsutils:Time Series Exploration, Modelling and Forecasting

Includes: (i) tests and visualisations that can help the modeller explore time series components and perform decomposition; (ii) modelling shortcuts, such as functions to construct lagmatrices and seasonal dummy variables of various forms; (iii) an implementation of the Theta method; (iv) tools to facilitate the design of the forecasting process, such as ABC-XYZ analyses; and (v) "quality of life" functions, such as treating time series for trailing and leading values.

Maintained by Nikolaos Kourentzes. Last updated 1 year ago.

11 stars 7.79 score 472 scripts 19 dependents

jimmcl

trajr:Animal Trajectory Analysis

A toolbox to assist with statistical analysis of animal trajectories. It provides simple access to algorithms for calculating and assessing a variety of characteristics such as speed and acceleration, as well as multiple measures of straightness or tortuosity. Some support is provided for 3-dimensional trajectories. McLean & Skowron Volponi (2018) <doi:10.1111/eth.12739>.

Maintained by Jim McLean. Last updated 8 months ago.

27 stars 7.69 score 151 scripts

mmollina

mappoly:Genetic Linkage Maps in Autopolyploids

Construction of genetic maps in autopolyploid full-sib populations. Uses pairwise recombination fraction estimation as the first source of information to sequentially position allelic variants in specific homologous chromosomes. For situations where pairwise analysis has limited power, the algorithm relies on the multilocus likelihood obtained through a hidden Markov model (HMM). For more detail, please see Mollinari and Garcia (2019) <doi:10.1534/g3.119.400378> and Mollinari et al. (2020) <doi:10.1534/g3.119.400620>.

Maintained by Marcelo Mollinari. Last updated 27 days ago.

polyploid polyploid-genetic-mapping polyploidy cpp

27 stars 7.56 score 111 scripts 1 dependent

bioc

EpiCompare:Comparison, Benchmarking & QC of Epigenomic Datasets

EpiCompare is used to compare and analyse epigenetic datasets for quality control and benchmarking purposes. The package outputs an HTML report consisting of three sections: (1) General metrics: metrics on peaks (percentage of blacklisted and non-standard peaks, and peak widths) and fragments (duplication rate) of samples; (2) Peak overlap: percentage and statistical significance of overlapping and non-overlapping peaks, plus an upset plot; and (3) Functional annotation: functional annotation (ChromHMM, ChIPseeker and enrichment analysis) of peaks, plus peak enrichment around TSS.

Maintained by Hiranyamaya Dash. Last updated 2 months ago.

epigenetics genetics qualitycontrol chipseq multiplecomparison functionalgenomics atacseq dnaseseq benchmark benchmarking bioconductor bioconductor-package comparison html interactive-reporting

15 stars 7.49 score 46 scripts

tagteam

pec:Prediction Error Curves for Risk Prediction Models in Survival Analysis

Validation of risk predictions obtained from survival models and competing risk models based on censored data using inverse weighting and cross-validation. Most of the 'pec' functionality has been moved to 'riskRegression'.

Maintained by Thomas A. Gerds. Last updated 2 years ago.

7.42 score 512 scripts 26 dependents

bioc

gDRutils:A package with helper functions for processing drug response data

This package contains utility functions used throughout the gDR platform to fit data, manipulate data, and convert and validate data structures. This package also has the necessary default constants for the gDR platform. Many of the functions are utilized by the gDRcore package.

Maintained by Arkadiusz Gladki. Last updated 6 days ago.

software infrastructure

2 stars 7.42 score 3 scripts 3 dependents

kenaho1

asbio:A Collection of Statistical Tools for Biologists

Contains functions from: Aho, K. (2014) Foundational and Applied Statistics for Biologists using R. CRC/Taylor and Francis, Boca Raton, FL, ISBN: 978-1-4398-7338-0.

Maintained by Ken Aho. Last updated 3 months ago.

5 stars 7.32 score 310 scripts 3 dependents

bioc

gDRimport:Package for handling the import of dose-response data

The package is a part of the gDR suite. It helps to prepare raw drug response data for downstream processing. It mainly contains helper functions for importing/loading/validating dose-response data provided in different file formats.

Maintained by Arkadiusz Gladki. Last updated 6 days ago.

software infrastructure dataimport

3 stars 7.32 score 5 scripts 1 dependent

stephenmilborrow

plotmo:Plot a Model's Residuals, Response, and Partial Dependence Plots

Plot model surfaces for a wide variety of models using partial dependence plots and other techniques. Also plot model residuals and other information on the model.

Maintained by Stephen Milborrow. Last updated 7 months ago.

7.31 score 646 scripts 27 dependents

bioc

gDRcore:Processing functions and interface to process and analyze drug dose-response data

This package contains core functions to process and analyze drug response data. The package provides tools for normalizing, averaging, and calculating gDR metrics data. All core functions are wrapped into the pipeline function, allowing the data to be analyzed in a straightforward way.

Maintained by Arkadiusz Gladki. Last updated 2 days ago.

software shinyapps cpp

2 stars 7.25 score 4 scripts 1 dependent

cardiomoon

autoReg:Automatic Linear and Logistic Regression and Survival Analysis

Make summary tables for descriptive statistics and select explanatory variables automatically in various regression models. Supports linear models, generalized linear models and Cox proportional hazards models. Generate publication-ready tables summarizing the results of regression analysis and plots. The tables and plots can be exported in "HTML", "pdf('LaTex')", "docx('MS Word')" and "pptx('MS Powerpoint')" documents.

Maintained by Keon-Woong Moon. Last updated 1 year ago.

49 stars 7.13 score 69 scripts

trnnick

nnfor:Time Series Forecasting with Neural Networks

Automatic time series modelling with neural networks. Allows fully automatic, semi-manual or fully manual specification of networks. For details of the specification methodology see: (i) Crone and Kourentzes (2010) <doi:10.1016/j.neucom.2010.01.017>; and (ii) Kourentzes et al. (2014) <doi:10.1016/j.eswa.2013.12.011>.

Maintained by Nikolaos Kourentzes. Last updated 1 year ago.

32 stars 7.10 score 111 scripts 10 dependents

bioc

multiGSEA:Combining GSEA-based pathway enrichment with multi omics data integration

Extracted features from pathways derived from 8 different databases (KEGG, Reactome, Biocarta, etc.) can be used on transcriptomic, proteomic, and/or metabolomic level to calculate a combined GSEA-based enrichment score.

Maintained by Sebastian Canzler. Last updated 3 months ago.

genesetenrichment pathways reactome biocarta

18 stars 7.06 score 32 scripts

sylvainschmitt

SSDM:Stacked Species Distribution Modelling

Maps species richness and endemism based on stacked species distribution models (SSDM). Individual SDMs can be created using a single or multiple algorithms (ensemble SDMs). For each species, an SDM can yield a habitat suitability map, a binary map, a between-algorithm variance map, and can assess variable importance, algorithm accuracy, and between-algorithm correlation. Methods to stack individual SDMs include summing individual probabilities and thresholding then summing. Thresholding can be based on a specific evaluation metric or by drawing repeatedly from a Bernoulli distribution. The SSDM package also provides a user-friendly interface.

Maintained by Sylvain Schmitt. Last updated 11 months ago.

44 stars 6.99 score 44 scripts

topepo

AppliedPredictiveModeling:Functions and Data Sets for 'Applied Predictive Modeling'

A few functions and several data sets for the Springer book 'Applied Predictive Modeling'.

Maintained by Max Kuhn. Last updated 2 years ago.

37 stars 6.89 score 1.2k scripts

asgr

magicaxis:Pretty Scientific Plotting with Minor-Tick and Log Minor-Tick Support

Functions to make useful (and pretty) plots for scientific plotting. Additional plotting features are added for base plotting, with particular emphasis on making attractive log axis plots.

Maintained by Aaron Robotham. Last updated 6 months ago.

9 stars 6.84 score 184 scripts 7 dependents

paytonjjones

networktools:Tools for Identifying Important Nodes in Networks

Includes assorted tools for network analysis. Bridge centrality; goldbricker; MDS, PCA, & eigenmodel network plotting.

Maintained by Payton Jones. Last updated 1 month ago.

10 stars 6.75 score 93 scripts 5 dependents

r-forge

RHRV:Heart Rate Variability Analysis of ECG Data

Allows users to import data files containing heartbeat positions in the most broadly used formats, to remove outliers or points with unacceptable physiological values present in the time series, to plot HRV data, and to perform time domain, frequency domain and nonlinear HRV analysis. See Garcia et al. (2017) <DOI:10.1007/978-3-319-65355-6>.

Maintained by Leandro Rodriguez-Linares. Last updated 6 months ago.

6.68 score 63 scripts 1 dependent

khliland

multiblock:Multiblock Data Fusion in Statistics and Machine Learning

Functions and datasets to support Smilde, Næs and Liland (2021, ISBN: 978-1-119-60096-1) "Multiblock Data Fusion in Statistics and Machine Learning - Applications in the Natural and Life Sciences". This implements and imports a large collection of methods for multiblock data analysis with common interfaces, result- and plotting functions, several real data sets and six vignettes covering a range of different applications.

Maintained by Kristian Hovde Liland. Last updated 1 day ago.

cpp

16 stars 6.56 score 19 scripts

bio-services

LinkageMapView:Plot Linkage Group Maps with Quantitative Trait Loci

Produces high resolution, publication ready linkage maps and quantitative trait loci maps. Input can be output from 'R/qtl', simple text or comma delimited files. Output is currently a portable document file.

Maintained by Steven Blanchard. Last updated 5 years ago.

9 stars 6.55 score 79 scripts

bioc

Xeva:Analysis of patient-derived xenograft (PDX) data

The Xeva package provides efficient and powerful functions for patient-derived xenograft (PDX) based pharmacogenomic data analysis. This package contains a set of functions to perform analysis of patient-derived xenograft data. This package was developed by the BHKLab; for further information, please see our documentation.

Maintained by Benjamin Haibe-Kains. Last updated 13 days ago.

geneexpression pharmacogenetics pharmacogenomics software classification

11 stars 6.48 score 17 scripts

thongphamthe

PAFit:Generative Mechanism Estimation in Temporal Complex Networks

Statistical methods for estimating preferential attachment and node fitness generative mechanisms in temporal complex networks are provided. Thong Pham et al. (2015) <doi:10.1371/journal.pone.0137796>. Thong Pham et al. (2016) <doi:10.1038/srep32558>. Thong Pham et al. (2020) <doi:10.18637/jss.v092.i03>. Thong Pham et al. (2021) <doi:10.1093/comnet/cnab024>.

Maintained by Thong Pham. Last updated 1 years ago.

complex-networks fit-get-richer general-preferential-attachment minorize-maximization preferential-attachment rich-get-richer scale-free temporal-networks cpp openmp

17 stars 6.47 score 70 scripts

paulowhite

timeROC:Time-Dependent ROC Curve and AUC for Censored Survival Data

Estimation of time-dependent ROC curve and area under time dependent ROC curve (AUC) in the presence of censored data, with or without competing risks. Confidence intervals of AUCs and tests for comparing AUCs of two rival markers measured on the same subjects can be computed, using the iid-representation of the AUC estimator. Plot functions for time-dependent ROC curves and AUC curves are provided. Time-dependent Positive Predictive Values (PPV) and Negative Predictive Values (NPV) can also be computed. See Blanche et al. (2013) <doi:10.1002/sim.5958> and references therein for the details of the methods implemented in the package.

Maintained by Paul Blanche. Last updated 5 years ago.

9 stars 6.46 score 342 scripts 9 dependents

ocean-tracking-network

glatos:A package for the Great Lakes Acoustic Telemetry Observation System

Functions useful to members of the Great Lakes Acoustic Telemetry Observation System https://glatos.glos.us; many more broadly relevant to simulating, processing, analysing, and visualizing acoustic telemetry data.

Maintained by Christopher Holbrook. Last updated 7 months ago.

10 stars 6.38 score 112 scripts

zsteinmetz

envalysis:Miscellaneous Functions for Environmental Analyses

Small toolbox for data analyses in environmental chemistry and ecotoxicology. Provides, for example, calibration() to calculate calibration curves and corresponding limits of detection (LODs) and limits of quantification (LOQs) according to German DIN 32645 (2008). texture() makes it easy to estimate soil particle size distributions from hydrometer measurements (ASTM D422-63, 2007).

Maintained by Zacharias Steinmetz. Last updated 6 months ago.

analytics chemistry ecotoxicology environment soil

8 stars 6.30 score 83 scripts

cran

drc:Analysis of Dose-Response Curves

Analysis of dose-response data is made available through a suite of flexible and versatile model fitting and after-fitting functions.

Maintained by Christian Ritz. Last updated 9 years ago.

8 stars 6.25 score 28 dependents

federicogiorgi

corto:Inference of Gene Regulatory Networks

We present 'corto' (Correlation Tool), a simple package to infer gene regulatory networks and visualize master regulators from gene expression data using DPI (Data Processing Inequality) and bootstrapping to recover edges. An initial step is performed to calculate all significant edges between a list of source nodes (centroids) and target genes. Then all triplets containing two centroids and one target are tested in a DPI step which removes edges. A bootstrapping process then calculates the robustness of the network, eventually re-adding edges previously removed by DPI. The algorithm has been optimized to run outside a computing cluster, using a fast correlation implementation. The package finally provides functions to calculate network enrichment analysis from RNA-Seq and ATAC-Seq signatures as described in the article by Giorgi lab (2020) <doi:10.1093/bioinformatics/btaa223>.

Maintained by Federico M. Giorgi. Last updated 2 years ago.

20 stars 6.25 score 59 scripts

anthonydevaux

DynForest:Random Forest with Multivariate Longitudinal Predictors

Based on the random forest principle, 'DynForest' is able to include multiple longitudinal predictors to provide individual predictions. Longitudinal predictors are modeled through the random forest. The methodology is fully described for a survival outcome in: Devaux, Helmer, Genuer & Proust-Lima (2023) <doi:10.1177/09622802231206477>.

Maintained by Anthony Devaux. Last updated 5 months ago.

16 stars 6.20 score 8 scripts

fbertran

Patterns:Deciphering Biological Networks with Patterned Heterogeneous Measurements

A modeling tool dedicated to biological network modeling (Bertrand and others 2020, <doi:10.1093/bioinformatics/btaa855>). It allows for single or joint modeling of, for instance, genes and proteins. It starts with the selection of the actors that will be used in the upcoming reverse engineering step. An actor can be included in that selection based on its differential measurement (for instance gene expression or protein abundance) or on its time course profile. Wrappers for actors clustering functions and cluster analysis are provided. It also allows reverse engineering of biological networks taking into account the observed time course patterns of the actors. Many inference functions are provided and dedicated to get specific features for the inferred network such as sparsity, robust links, high confidence links or links that are stable through resampling. Some simulation and prediction tools are also available for cascade networks (Jung and others 2014, <doi:10.1093/bioinformatics/btt705>). Examples of use with microarray or RNA-Seq data are provided.

Maintained by Frederic Bertrand. Last updated 11 months ago.

4 stars 6.16 score 18 scripts

bioc

RCAS:RNA Centric Annotation System

RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.

Maintained by Bora Uyar. Last updated 5 months ago.

software genetarget motifannotation motifdiscovery go transcriptomics genomeannotation genesetenrichment coverage

6.14 score 29 scripts 1 dependents

bioc

esATAC:An Easy-to-use Systematic pipeline for ATACseq data analysis

This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq reads. It covers raw sequencing reads preprocessing (FASTQ files), reads alignment (Rbowtie2), aligned reads file operations (SAM, BAM, and BED files), peak calling (F-seq), genome annotations (Motif, GO, SNP analysis) and quality control report. The package is managed by a dataflow graph, which makes it easy for users to pass variables seamlessly between processes and to understand the workflow. Users can process FASTQ files through the end-to-end preset pipeline, which produces an HTML report for quality control and preliminary statistical results, or customize the workflow starting from any intermediate stage with esATAC functions.

Maintained by Zheng Wei. Last updated 5 months ago.

immunooncology sequencing dnaseq qualitycontrol alignment preprocessing coverage atacseq dnaseseq atac-seq bioconductor pipeline cpp openjdk

23 stars 6.11 score 3 scripts

matthiaspucher

staRdom:PARAFAC Analysis of EEMs from DOM

This is a user-friendly way to run a parallel factor (PARAFAC) analysis (Harshman, 1971) <doi:10.1121/1.1977523> on excitation emission matrix (EEM) data from dissolved organic matter (DOM) samples (Murphy et al., 2013) <doi:10.1039/c3ay41160e>. The analysis includes profound methods for model validation. Some additional functions allow the calculation of absorbance slope parameters and create beautiful plots.

Maintained by Matthias Pucher. Last updated 5 months ago.

21 stars 6.03 score 86 scripts

nicwir

QurvE:Robust and User-Friendly Analysis of Growth and Fluorescence Curves

High-throughput analysis of growth curves and fluorescence data using three methods: linear regression, growth model fitting, and smooth spline fit. Analysis of dose-response relationships via smoothing splines or dose-response models. Complete data analysis workflows can be executed in a single step via user-friendly wrapper functions. The results of these workflows are summarized in detailed reports as well as intuitively navigable 'R' data containers. A 'shiny' application provides access to all features without requiring any programming knowledge. The package is described in further detail in Wirth et al. (2023) <doi:10.1038/s41596-023-00850-7>.

Maintained by Nicolas T. Wirth. Last updated 1 years ago.

25 stars 6.00 score 7 scripts

bozenne

BuyseTest:Generalized Pairwise Comparisons

Implementation of the Generalized Pairwise Comparisons (GPC) as defined in Buyse (2010) <doi:10.1002/sim.3923> for complete observations, and extended in Peron (2018) <doi:10.1177/0962280216658320> to deal with right-censoring. GPC compare two groups of observations (intervention vs. control group) regarding several prioritized endpoints to estimate the probability that a random observation drawn from one group performs better/worse/equivalently than a random observation drawn from the other group. Summary statistics such as the net treatment benefit, win ratio, or win odds are then deduced from these probabilities. Confidence intervals and p-values are obtained based on asymptotic results (Ozenne 2021 <doi:10.1177/09622802211037067>), non-parametric bootstrap, or permutations. The software enables the use of thresholds of minimal importance difference, stratification, non-prioritized endpoints (O Brien test), and can handle right-censoring and competing-risks.

Maintained by Brice Ozenne. Last updated 2 days ago.

generalized-pairwise-comparisons non-parametric statistics cpp

5 stars 5.95 score 90 scripts
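The pairwise logic in the BuyseTest description can be sketched in a few lines. This is an illustrative Python sketch only, not the package's implementation or API: the function names, the example data and the "higher is better" convention are assumptions, and it handles complete observations only (no censoring, stratification or inference).

```python
def compare_pair(treated, control, thresholds):
    """Classify one treated/control pair as +1 (win), -1 (loss) or 0 (tie).

    Endpoints are checked in priority order; a lower-priority endpoint is
    only consulted when all higher-priority ones are tied within threshold.
    Assumption for this sketch: higher values are better on every endpoint.
    """
    for t, c, thr in zip(treated, control, thresholds):
        if t - c > thr:
            return 1
        if c - t > thr:
            return -1
    return 0


def gpc_summary(treated_group, control_group, thresholds):
    """Compare every treated subject with every control subject."""
    outcomes = [compare_pair(t, c, thresholds)
                for t in treated_group for c in control_group]
    n_pairs = len(outcomes)
    p_win = outcomes.count(1) / n_pairs
    p_loss = outcomes.count(-1) / n_pairs
    p_tie = outcomes.count(0) / n_pairs
    return {
        "net_benefit": p_win - p_loss,
        "win_ratio": p_win / p_loss if p_loss > 0 else float("inf"),
        "win_odds": (p_win + 0.5 * p_tie) / (p_loss + 0.5 * p_tie),
    }


# Hypothetical data: two prioritized endpoints per subject.
treated_group = [(5, 1), (7, 3), (9, 2)]
control_group = [(4, 2), (7, 1), (8, 2)]
summary = gpc_summary(treated_group, control_group, thresholds=(0, 0))
print(summary)
```

The thresholds play the role of the minimal important difference mentioned in the description: a pair is only decided on an endpoint when the difference exceeds the threshold.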

fdetsch

Orcs:Omnidirectional R Code Snippets

I tend to repeat the same code chunks over and over again. At first, this was fine for me and I paid little attention to such redundancies. A little later, when I got tired of manually replacing Linux filepaths with the referring Windows versions, and vice versa, I started to stuff some very frequently used work-steps into functions and, even later, into a proper R package. And that's what this package is - a hodgepodge of various R functions meant to simplify (my) everyday-life coding work without, at the same time, being devoted to a particular scope of application.

Maintained by Florian Detsch. Last updated 2 years ago.

cpp

5 stars 5.87 score 98 scripts

bioc

omicsViewer:Interactive and explorative visualization of SummarizedExperiment or ExpressionSet using omicsViewer

omicsViewer visualizes ExpressionSet (or SummarizedExperiment) in an interactive way. The omicsViewer has a separate back- and front-end. In the back-end, users need to prepare an ExpressionSet that contains all the necessary information for the downstream data interpretation. Some extra requirements on the headers of phenotype data or feature data are imposed so that the provided information can be clearly recognized by the front-end, while keeping modifications to the existing ExpressionSet object to a minimum. The pure dependency on R/Bioconductor guarantees maximum flexibility in the statistical analysis in the back-end. Once the ExpressionSet is prepared, it can be visualized using the front-end, implemented by shiny and plotly. Both features and samples can be selected from (data) tables or graphs (scatter plot/heatmap). Different types of analyses, such as enrichment analysis (using Bioconductor package fgsea or Fisher's exact test) and STRING network analysis, are performed on the fly and the results are visualized simultaneously. When a subset of samples and a phenotype variable are selected, a significance test on means (t-test or rank-based test; when the phenotype variable is quantitative) or a test of independence (chi-square or Fisher's exact test; when the phenotype data is categorical) is performed to test the association between the phenotype of interest and the selected samples. Additionally, other analyses can be easily added as extra shiny modules. omicsViewer therefore greatly facilitates data exploration: many different hypotheses can be explored in a short time without the need for knowledge of R. In addition, the resulting data can be easily shared using a shiny server. Alternatively, a standalone version of omicsViewer together with designated omics data can be created by integrating it with portable R, which can be shared with collaborators or submitted as supplementary data together with a manuscript.

Maintained by Chen Meng. Last updated 2 months ago.

software visualization genesetenrichment differentialexpression motifdiscovery network networkenrichment

4 stars 5.82 score 22 scripts

rmarko

CORElearn:Classification, Regression and Feature Evaluation

This suite of machine learning algorithms, written in C++ with an R interface, contains several learning techniques for classification and regression. Predictive models include e.g., classification and regression trees with optional constructive induction and models in the leaves, random forests, kNN, naive Bayes, and locally weighted regression. All predictions obtained with these models can be explained and visualized with the 'ExplainPrediction' package. This package is especially strong in feature evaluation where it contains several variants of Relief algorithm and many impurity based attribute evaluation functions, e.g., Gini, information gain, MDL, and DKM. These methods can be used for feature selection or discretization of numeric attributes. The OrdEval algorithm and its visualization is used for evaluation of data sets with ordinal features and class, enabling analysis according to the Kano model of customer satisfaction. Several algorithms support parallel multithreaded execution via OpenMP. The top-level documentation is reachable through ?CORElearn.

Maintained by Marko Robnik-Sikonja. Last updated 5 months ago.

cpp openmp

3 stars 5.81 score 228 scripts 8 dependents

bioc

EGSEA:Ensemble of Gene Set Enrichment Analyses

This package implements the Ensemble of Gene Set Enrichment Analyses (EGSEA) method for gene set testing. The EGSEA algorithm uses the analysis results of twelve prominent GSE algorithms in the literature to calculate collective significance scores for each gene set.

Maintained by Monther Alhamdoosh. Last updated 5 months ago.

immunooncology differentialexpression go geneexpression genesetenrichment genetics microarray multiplecomparison onechannel pathways rnaseq sequencing software systemsbiology twochannel metabolomics proteomics kegg graphandnetwork genesignaling genetarget networkenrichment network classification

5.81 score 64 scripts

elvanceyhan

pcds:Proximity Catch Digraphs and Their Applications

Contains the functions for construction and visualization of various families of the proximity catch digraphs (PCDs) (see Ceyhan (2005) ISBN:978-3-639-19063-2), and for computing the graph invariants for testing the patterns of segregation and association against complete spatial randomness (CSR) or uniformity in one, two and three dimensional cases. The package also has tools for generating points from these spatial patterns. The graph invariants used in testing spatial point data are the domination number (Ceyhan (2011) <doi:10.1080/03610921003597211>) and arc density (Ceyhan et al. (2006) <doi:10.1016/j.csda.2005.03.002>; Ceyhan et al. (2007) <doi:10.1002/cjs.5550350106>). The PCD families considered are Arc-Slice PCDs, Proportional-Edge PCDs, and Central Similarity PCDs.

Maintained by Elvan Ceyhan. Last updated 2 years ago.

5.80 score 21 scripts 2 dependents

bioc

GenomicPlot:Plot profiles of next generation sequencing data in genomic features

Visualization of next generation sequencing (NGS) data is essential for interpreting high-throughput genomics experiment results. 'GenomicPlot' facilitates plotting of NGS data in various formats (bam, bed, wig and bigwig); both coverage and enrichment over input can be computed and displayed with respect to genomic features (such as UTR, CDS, enhancer), and user defined genomic loci or regions. Statistical tests on signal intensity within user defined regions of interest can be performed and represented as boxplots or bar graphs. Parallel processing is used to speed up computation on multicore platforms. In addition to genomic plots, which are suitable for displaying coverage of genomic DNA (such as ChIPseq data), metagenomic (without introns) plots can also be made for RNAseq or CLIPseq data.

Maintained by Shuye Pu. Last updated 2 months ago.

alternativesplicing chipseq coverage geneexpression rnaseq sequencing software transcription visualization annotation

5 stars 5.78 score 4 scripts

bioc

BPRMeth:Model higher-order methylation profiles

The BPRMeth package is a probabilistic method to quantify explicit features of methylation profiles, in a way that would make it easier to formally use such profiles in downstream modelling efforts, such as predicting gene expression levels or clustering genomic regions or cells according to their methylation profiles.

Maintained by Chantriolnt-Andreas Kapourani. Last updated 5 months ago.

immunooncology dnamethylation geneexpression generegulation epigenetics genetics clustering featureextraction regression rnaseq bayesian kegg sequencing coverage singlecell openblas cpp

5.75 score 94 scripts 1 dependents

isobelbarrott

Landmarking:Analysis using Landmark Models

The landmark approach allows survival predictions to be updated dynamically as new measurements from an individual are recorded. The idea is to set predefined time points, known as "landmark times", and form a model at each landmark time using only the individuals in the risk set. This package allows the longitudinal data to be modelled either using the last observation carried forward or linear mixed effects modelling. There is also the option to model competing risks, either through cause-specific Cox regression or Fine-Gray regression. To find out more about the methods in this package, please see <https://isobelbarrott.github.io/Landmarking/articles/Landmarking>.

Maintained by Isobel Barrott. Last updated 2 years ago.

6 stars 5.72 score 44 scripts
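The two ingredients named in the Landmarking description, the risk set at a landmark time and the last observation carried forward (LOCF), can be illustrated as follows. This is a minimal Python sketch under stated assumptions, not the package's code: the function name, the dictionary-based data layout and the example values are all hypothetical.

```python
def landmark_dataset(measurements, end_times, landmark_time):
    """Build the covariate values used for a model at one landmark time.

    measurements: {subject: [(time, value), ...]} longitudinal readings
    end_times:    {subject: time of event or censoring}
    Only subjects still at risk at the landmark time are kept, and each
    keeps the latest value recorded at or before that time (LOCF).
    """
    rows = {}
    for subject, end_time in end_times.items():
        if end_time <= landmark_time:
            continue  # event or censoring already occurred: not in the risk set
        past = [(t, v) for t, v in measurements.get(subject, [])
                if t <= landmark_time]
        if not past:
            continue  # no measurement available to carry forward
        latest_time, latest_value = max(past, key=lambda pair: pair[0])
        rows[subject] = latest_value
    return rows


# Hypothetical data for three subjects.
measurements = {
    "a": [(0, 1.0), (2, 1.5), (5, 2.0)],
    "b": [(0, 3.0), (1, 2.5)],
    "c": [(4, 0.7)],
}
end_times = {"a": 6, "b": 2.5, "c": 8}
at_landmark_3 = landmark_dataset(measurements, end_times, landmark_time=3)
print(at_landmark_3)
```

A survival model would then be fitted on these rows, and the procedure repeated at each predefined landmark time; the mixed-effects alternative mentioned in the description would replace the LOCF step with a model-based prediction of the current value.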

stla

gyro:Hyperbolic Geometry

Hyperbolic geometry in the Minkowski model and the Poincaré model. The methods are based on the gyrovector space theory developed by A. A. Ungar that can be found in the book 'Analytic Hyperbolic Geometry: Mathematical Foundations And Applications' <doi:10.1142/5914>. The package provides functions to plot three-dimensional hyperbolic polyhedra and to plot hyperbolic tilings of the Poincaré disk.

Maintained by Stéphane Laurent. Last updated 1 years ago.

geometry hyperbolic-geometry rgl cpp

4 stars 5.69 score 81 scripts 1 dependents

mattmar

dynamAedes:A Unified Mechanistic Model for the Population Dynamics of Invasive Aedes Mosquitoes

Generalised model for population dynamics of invasive Aedes mosquitoes. Rationale and model structure are described here: Da Re et al. (2021) <doi:10.1016/j.ecoinf.2020.101180> and Da Re et al. (2022) <doi:10.1101/2021.12.21.473628>.

Maintained by Matteo Marcantonio. Last updated 1 years ago.

ecology invasive-species modelling mosquitoes pathogens

7 stars 5.59 score 11 scripts

tvganesh

cricketr:Analyze Cricketers and Cricket Teams Based on ESPN Cricinfo Statsguru

Tools for analyzing performances of cricketers based on stats in ESPN Cricinfo Statsguru. The toolset can be used for analysis of Tests, ODIs and Twenty20 matches of both batsmen and bowlers. The package can also be used to analyze team performances.

Maintained by Tinniam V Ganesh. Last updated 4 years ago.

62 stars 5.55 score 115 scripts

robindenz1

contsurvplot:Visualize the Effect of a Continuous Variable on a Time-to-Event Outcome

Graphically display the (causal) effect of a continuous variable on a time-to-event outcome using multiple different types of plots based on g-computation. Those functions include, among others, survival area plots, survival contour plots, survival quantile plots and 3D surface plots. Due to the use of g-computation, all plots allow confounder-adjustment naturally. For details, see Robin Denz, Nina Timmesfeld (2023) <doi:10.1097/EDE.0000000000001630>.

Maintained by Robin Denz. Last updated 23 hours ago.

causal-inference continuous g-computation survival-analysis visualization

12 stars 5.53 score 56 scripts

bioc

miRSM:Inferring miRNA sponge modules in heterogeneous data

The package aims to identify miRNA sponge or ceRNA modules in heterogeneous data. It provides several functions to study miRNA sponge modules at single-sample and multi-sample levels, including popular methods for inferring gene modules (candidate miRNA sponge or ceRNA modules), and two functions to identify miRNA sponge modules at single-sample and multi-sample levels, as well as several functions to conduct modular analysis of miRNA sponge modules.

Maintained by Junpeng Zhang. Last updated 5 months ago.

geneexpression biomedicalinformatics clustering genesetenrichment microarray software generegulation genetarget cerna mirna mirna-sponge mirna-targets modules openjdk

4 stars 5.51 score 5 scripts

bioc

synergyfinder:Calculate and Visualize Synergy Scores for Drug Combinations

Efficient implementations for analyzing pre-clinical multiple drug combination datasets. It provides efficient implementations for 1. the popular synergy scoring models, including HSA, Loewe, Bliss, and ZIP to quantify the degree of drug combination synergy; 2. higher order drug combination data analysis and synergy landscape visualization for unlimited number of drugs in a combination; 3. statistical analysis of drug combination synergy and sensitivity with confidence intervals and p-values; 4. synergy barometer for harmonizing multiple synergy scoring methods to provide a consensus metric of synergy; 5. evaluation of synergy and sensitivity simultaneously to provide an unbiased interpretation of the clinical potential of the drug combinations. Based on this package, we also provide a web application (http://www.synergyfinder.org) for users who prefer a graphical user interface.

Maintained by Shuyu Zheng. Last updated 5 months ago.

software statisticalmethod

5.42 score 44 scripts

bioc

RITAN:Rapid Integration of Term Annotation and Network resources

Tools for comprehensive gene set enrichment and extraction of multi-resource high confidence subnetworks. RITAN facilitates bioinformatic tasks for enabling network biology research.

Maintained by Michael Zimmermann. Last updated 5 months ago.

qualitycontrol network networkenrichment networkinference genesetenrichment functionalgenomics graphandnetwork

5.40 score 9 scripts

angabrio

missingHE:Missing Outcome Data in Health Economic Evaluation

Contains a suite of functions for health economic evaluations with missing outcome data. The package can fit different types of statistical models under a fully Bayesian approach using the software 'JAGS' (which should be installed locally and which is loaded in 'missingHE' via the 'R' package 'R2jags'). Three classes of models can be fitted under a variety of missing data assumptions: selection models, pattern mixture models and hurdle models. In addition to model fitting, 'missingHE' provides a set of specialised functions to assess model convergence and fit, and to summarise the statistical and economic results using different types of measures and graphs. The methods implemented are described in Mason (2018) <doi:10.1002/hec.3793>, Molenberghs (2000) <doi:10.1007/978-1-4419-0300-6_18> and Gabrio (2019) <doi:10.1002/sim.8045>.

Maintained by Andrea Gabrio. Last updated 2 years ago.

cost-effectiveness-analysis health-economic-evaluation individual-level-data jags missing-data parametric-modelling sensitivity-analysis cpp

5 stars 5.38 score 24 scripts

bioc

GeDi:Defining and visualizing the distances between different genesets

The package provides different distance measurements to calculate the difference between genesets. Based on these scores, the genesets are clustered and visualized as a graph. This is all presented in an interactive Shiny application for easy usage.

Maintained by Annekathrin Nedwed. Last updated 5 months ago.

gui genesetenrichment software transcription rnaseq visualization clustering pathways reportwriting go kegg reactome shinyapps

1 stars 5.36 score 22 scripts

mspinillos

ecoregime:Analysis of Ecological Dynamic Regimes

A toolbox for implementing the Ecological Dynamic Regime framework (Sánchez-Pinillos et al., 2023 <doi:10.1002/ecm.1589>) to characterize and compare groups of ecological trajectories in multidimensional spaces defined by state variables. The package includes the RETRA-EDR algorithm to identify representative trajectories, functions to generate, summarize, and visualize representative trajectories, and several metrics to quantify the distribution and heterogeneity of trajectories in an ecological dynamic regime and quantify the dissimilarity between two or more ecological dynamic regimes. The package also includes a set of functions to assess ecological resilience based on ecological dynamic regimes (Sánchez-Pinillos et al., 2024 <doi:10.1016/j.biocon.2023.110409>).

Maintained by Martina Sánchez-Pinillos. Last updated 12 months ago.

7 stars 5.32 score 8 scripts

f-silva-archaeo

skyscapeR:Data Analysis and Visualization for Skyscape Archaeology

Data reduction, visualization and statistical analysis of measurements of orientation of archaeological structures, following Silva (2020) <doi:10.1016/j.jas.2020.105138>.

Maintained by Silva Fabio. Last updated 6 months ago.

5 stars 5.31 score 41 scripts

cshs-hydrology

CSHShydRology:Canadian Hydrological Analyses

A collection of user submitted functions to aid in the analysis of hydrological data.

Maintained by Kevin Shook. Last updated 3 years ago.

4 stars 5.26 score 23 scripts

bioc

epistack:Heatmaps of Stack Profiles from Epigenetic Signals

The main objective of the epistack package is the visualization of stacks of genomic tracks (such as, but not restricted to, ChIP-seq, ATAC-seq, DNA methylation or genomic conservation data) centered at genomic regions of interest. epistack needs three different inputs: 1) a genomic score object, such as ChIP-seq coverage or DNA methylation values, provided as a `GRanges` (easily obtained from `bigwig` or `bam` files). 2) a list of features of interest, such as peaks or transcription start sites, provided as a `GRanges` (easily obtained from `gtf` or `bed` files). 3) a score to sort the features, such as peak height or gene expression value.

Maintained by DEVAILLY Guillaume. Last updated 5 months ago.

rnaseq preprocessing chipseq geneexpression coverage bioinformatics

6 stars 5.26 score 5 scripts

bioc

heatmaps:Flexible Heatmaps for Functional Genomics and Sequence Features

This package provides functions for plotting heatmaps of genome-wide data across genomic intervals, such as ChIP-seq signals at peaks or across promoters. Many functions are also provided for investigating sequence features.

Maintained by Malcolm Perry. Last updated 5 months ago.

visualization sequencematching functionalgenomics

5.23 score 19 scripts 1 dependents

bioc

CytoMDS:Low Dimensions projection of cytometry samples

This package implements a low dimensional visualization of a set of cytometry samples, in order to visually assess the 'distances' between them. This, in turn, can greatly help the user to identify quality issues like batch effects or outlier samples, and/or check the presence of potential sample clusters that might align with the experimental design. The CytoMDS algorithm combines, on the one hand, the concept of Earth Mover's Distance (EMD), a.k.a. Wasserstein metric and, on the other hand, the Multi Dimensional Scaling (MDS) algorithm for the low dimensional projection. Also, the package provides some diagnostic tools for checking the quality of the MDS projection, as well as tools to help with the interpretation of the axes of the projection.

Maintained by Philippe Hauchamps. Last updated 2 months ago.

flowcytometry qualitycontrol dimensionreduction multidimensionalscaling software visualization

1 stars 5.23 score 2 scripts
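As a rough illustration of the first half of the CytoMDS idea, the sketch below computes a pairwise Earth Mover's Distance matrix between samples; such a matrix is the kind of input an MDS projection works from. It is a simplified Python sketch, not the package's algorithm: it assumes a single measured channel per sample and equal-sized samples (in which case the 1D EMD reduces to the mean absolute difference of the sorted values), and all names and data are hypothetical.

```python
def emd_1d(sample_x, sample_y):
    """Earth Mover's (Wasserstein-1) distance between two 1D samples.

    Assumption for this sketch: both samples have the same number of
    events, so the optimal transport simply matches sorted values.
    """
    if len(sample_x) != len(sample_y):
        raise ValueError("this sketch only handles equal-sized samples")
    xs, ys = sorted(sample_x), sorted(sample_y)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


def distance_matrix(samples):
    """Symmetric matrix of pairwise EMDs, ready to be passed to an MDS routine."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = emd_1d(samples[i], samples[j])
    return dist


# Hypothetical one-channel intensities for three samples.
samples = [
    [0.0, 1.0, 2.0],
    [1.0, 2.0, 3.0],   # same shape as the first sample, shifted by 1
    [0.0, 1.0, 8.0],   # one outlying event
]
dist = distance_matrix(samples)
for row in dist:
    print(row)
```

A sample whose row contains uniformly large distances is a candidate outlier, which is the kind of quality issue the description says the low dimensional projection helps to spot.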

bioc

gDR:Umbrella package for R packages in the gDR suite

Package is a part of the gDR suite. It reexports functions from other packages in the gDR suite that contain critical processing functions and utilities. The vignette walks through the full processing pipeline for drug response analyses that the gDR suite offers.

Maintained by Arkadiusz Gladki. Last updated 5 months ago.

software dataimport shinyapps

1 stars 5.20 score 7 scripts

lydialucchesi

smallsets:Visual Documentation for Data Preprocessing

Data practitioners regularly use the 'R' and 'Python' programming languages to prepare data for analyses. Thus, they encode important data preprocessing decisions in 'R' and 'Python' code. The 'smallsets' package subsequently decodes these decisions into a Smallset Timeline, a static, compact visualisation of data preprocessing decisions (Lucchesi et al. (2022) <doi:10.1145/3531146.3533175>). The visualisation consists of small data snapshots of different preprocessing steps. The 'smallsets' package builds this visualisation from a user's dataset and preprocessing code located in an 'R', 'R Markdown', 'Python', or 'Jupyter Notebook' file. Users simply add structured comments with snapshot instructions to the preprocessing code. One optional feature in 'smallsets' requires installation of the 'Gurobi' optimisation software and 'gurobi' 'R' package, available from <https://www.gurobi.com>. More information regarding the optional feature and 'gurobi' installation can be found in the 'smallsets' vignette.

Maintained by Lydia R. Lucchesi. Last updated 2 months ago.

data-science data-visualization documentation-tool machine-learning preprocessing python visualization-tools

14 stars 5.19 score 11 scripts

gabrielgesteira

qtlpoly:Random-Effect Multiple QTL Mapping in Autopolyploids

Performs random-effect multiple interval mapping (REMIM) in full-sib families of autopolyploid species based on restricted maximum likelihood (REML) estimation and score statistics, as described in Pereira et al. (2020) <doi:10.1534/genetics.120.303080>.

Maintained by Gabriel de Siqueira Gesteira. Last updated 5 months ago.

polyploid qtl-mapping openblas cpp openmp

6 stars 5.17 score 61 scripts

bioc

geomeTriD:An R/Bioconductor package for interactive 3D plot of epigenetic data or single cell data

geomeTriD (Three Dimensional Geometry Package) creates interactive 3D plots using the GL library with the 'three.js' visualization library (https://threejs.org) or the rgl library. In addition to creating interactive 3D plots, the application also generates simplified models in 2D. These 2D models provide a more straightforward visual representation, making it easier to analyze and interpret the data quickly. This functionality ensures that users have access to both detailed three-dimensional visualizations and more accessible two-dimensional views, catering to various analytical needs.

Maintained by Jianhong Ou. Last updated 2 months ago.

visualization

1 stars 5.10 score 7 scripts

bioc

IMMAN:Interlog protein network reconstruction by Mapping and Mining ANalysis

Reconstructs an Interlog Protein Network (IPN) integrated from several protein-protein interaction networks (PPINs). The package overlays different PPINs to mine conserved common networks between diverse species.

Maintained by Minoo Ashtiani. Last updated 5 months ago.

sequencematching alignment systemsbiology graphandnetwork network proteomics

5.08 score 3 scripts

vandenman

NetworkComparisonTest:Statistical Comparison of Two Networks Based on Several Invariance Measures

This permutation based hypothesis test, suited for several types of data supported by the estimateNetwork function of the bootnet package (Epskamp & Fried, 2018), assesses the difference between two networks based on several invariance measures (network structure invariance, global strength invariance, edge invariance, several centrality measures, etc.). Network structures are estimated with l1-regularization. The Network Comparison Test is suited for comparison of independent (e.g., two different groups) and dependent samples (e.g., one group that is measured twice). See van Borkulo et al. (2021, in press; the final article will be available, upon publication, via its DOI: 10.1037/met0000476).

Maintained by Claudia van Borkulo. Last updated 3 years ago.

5.07 score 70 scripts
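The permutation logic described above can be illustrated with a minimal Python sketch. This is not the package's code: the function names are hypothetical, only the independent-samples case is covered, and plain Pearson correlations stand in for the l1-regularized network estimates, with the sum of absolute correlations used as a simple "global strength" statistic.

```python
import math
import random
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def global_strength(rows):
    """Sum of absolute pairwise correlations between columns.

    Assumption: an unregularized correlation network replaces the
    l1-regularized network mentioned in the description.
    """
    cols = list(zip(*rows))
    return sum(abs(pearson(a, b)) for a, b in combinations(cols, 2))


def permutation_test(group1, group2, n_perm=200, seed=1):
    """Permutation p-value for the difference in global strength."""
    rng = random.Random(seed)
    observed = abs(global_strength(group1) - global_strength(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    exceed = 0
    for _ in range(n_perm):
        # Reassign observations to the two groups at random.
        rng.shuffle(pooled)
        diff = abs(global_strength(pooled[:n1]) - global_strength(pooled[n1:]))
        if diff >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


def simulate(n, rho, rng):
    """Three variables; the first two are correlated with strength rho."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        rows.append((x, y, rng.gauss(0, 1)))
    return rows


data_rng = random.Random(0)
g1 = simulate(60, 0.8, data_rng)
g2 = simulate(60, 0.0, data_rng)
observed, p_value = permutation_test(g1, g2)
print(observed, p_value)
```

For dependent samples the group labels would have to be permuted within each measured pair rather than across the pooled sample; that variant is not shown here.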

ustervbo

beadplexr:Analysis of Multiplex Cytometric Bead Assays

Reproducible and automated analysis of multiplex bead assays such as CBA (Morgan et al. 2004; <doi:10.1016/j.clim.2003.11.017>), LEGENDplex (Yu et al. 2015; <doi:10.1084/jem.20142318>), and MACSPlex (Miltenyi Biotec 2014; Application note: Data acquisition and analysis without the MACSQuant analyzer; <https://www.miltenyibiotec.com/upload/assets/IM0021608.PDF>). The package provides functions for streamlined reading of fcs files, and identification of bead clusters and analyte expression. The package eases the calculation of standard curves and the subsequent calculation of the analyte concentration.

Maintained by Ulrik Stervbo. Last updated 2 years ago.

5.07 score 39 scripts

andreamrau

HTSCluster:Clustering High-Throughput Transcriptome Sequencing (HTS) Data

A Poisson mixture model is implemented to cluster genes from high-throughput transcriptome sequencing (RNA-seq) data. Parameter estimation is performed using either the EM or CEM algorithm, and the slope heuristics are used for model selection (i.e., to choose the number of clusters).

Maintained by Andrea Rau. Last updated 2 years ago.

5.02 score 7 scripts 1 dependent

syksy

ePCR:Ensemble Penalized Cox Regression for Survival Prediction

The top-performing ensemble-based Penalized Cox Regression (ePCR) framework developed during the DREAM 9.5 mCRPC Prostate Cancer Challenge <https://www.synapse.org/ProstateCancerChallenge> presented in Guinney J, Wang T, Laajala TD, et al. (2017) <doi:10.1016/S1470-2045(16)30560-5> is provided herein, together with the corresponding follow-up work. While initially aimed at modeling the most advanced stage of prostate cancer, metastatic Castration-Resistant Prostate Cancer (mCRPC), the modeling framework has subsequently been extended to also cover the non-metastatic form of advanced prostate cancer (CRPC). Readily fitted ensemble-based model S4-objects are provided, and a simulated example dataset based on a real-life cohort from the Turku University Hospital is provided to illustrate the use of the package. Functionality of the ePCR methodology relies on constructing ensembles of strata in patient cohorts and averaging over them, with each ensemble member consisting of a highly optimized penalized/regularized Cox regression model. Various cross-validation and other modeling schema are provided for constructing novel model objects.

Maintained by Teemu Daniel Laajala. Last updated 1 year ago.

5.00 score 20 scripts

bioc

coseq:Co-Expression Analysis of Sequencing Data

Co-expression analysis for expression profiles arising from high-throughput sequencing data. Feature (e.g., gene) profiles are clustered using adapted transformations and mixture models or a K-means algorithm, and model selection criteria (to choose an appropriate number of clusters) are provided.

Maintained by Andrea Rau. Last updated 5 months ago.

geneexpression rnaseq sequencing software immunooncology

4.98 score 16 scripts

shanpengli

FastJM:Semi-Parametric Joint Modeling of Longitudinal and Survival Data

Maximum likelihood estimation for the semi-parametric joint modeling of competing risks and longitudinal data applying customized linear scan algorithms, proposed by Li and colleagues (2022) <doi:10.1155/2022/1362913>. The time-to-event data is modelled using a (cause-specific) Cox proportional hazards regression model with time-fixed covariates. The longitudinal outcome is modelled using a linear mixed effects model. The association is captured by shared random effects. The model is estimated using an Expectation Maximization algorithm.

Maintained by Shanpeng Li. Last updated 11 days ago.

cpp

5 stars 4.95 score 2 scripts 2 dependents

barbarabodinier

sharp:Stability-enHanced Approaches using Resampling Procedures

In stability selection (N Meinshausen, P Bühlmann (2010) <doi:10.1111/j.1467-9868.2010.00740.x>) and consensus clustering (S Monti et al (2003) <doi:10.1023/A:1023949509487>), resampling techniques are used to enhance the reliability of the results. In this package (B Bodinier et al (2025) <doi:10.18637/jss.v112.i05>), hyper-parameters are calibrated by maximising model stability, which is measured under the null hypothesis that all selection (or co-membership) probabilities are identical (B Bodinier et al (2023a) <doi:10.1093/jrsssc/qlad058> and B Bodinier et al (2023b) <doi:10.1093/bioinformatics/btad635>). Functions are readily implemented for the use of LASSO regression, sparse PCA, sparse (group) PLS or graphical LASSO in stability selection, and hierarchical clustering, partitioning around medoids, K means or Gaussian mixture models in consensus clustering.

Maintained by Barbara Bodinier. Last updated 8 days ago.

13 stars 4.91 score 124 scripts

bioc

Melissa:Bayesian clustering and imputation of single cell methylomes

Melissa is a Bayesian probabilistic model for jointly clustering and imputing single cell methylomes. This is done by taking into account local correlations via a Generalised Linear Model approach and global similarities using a mixture modelling approach.

Maintained by C. A. Kapourani. Last updated 5 months ago.

immunooncology dnamethylation geneexpression generegulation epigenetics genetics clustering featureextraction regression rnaseq bayesian kegg sequencing coverage singlecell

4.90 score 7 scripts

bioc

gDNAx:Diagnostics for assessing genomic DNA contamination in RNA-seq data

Provides diagnostics for assessing genomic DNA contamination in RNA-seq data, as well as plots representing these diagnostics. Moreover, the package can be used to get an insight into the strand library protocol used and, in case of strand-specific libraries, the strandedness of the data. Furthermore, it provides functionality to filter out reads of potential gDNA origin.

Maintained by Robert Castelo. Last updated 2 months ago.

transcription transcriptomics rnaseq sequencing preprocessing software geneexpression coverage differentialexpression functionalgenomics splicedalignment alignment

1 star 4.90 score 3 scripts

adrianhordyk

LBSPR:Length-Based Spawning Potential Ratio

Simulate expected equilibrium length composition, yield-per-recruit, and the spawning potential ratio (SPR) using the length-based SPR (LBSPR) model. Fit the LBSPR model to length data to estimate selectivity, relative apical fishing mortality, and the spawning potential ratio for data-limited fisheries. See Hordyk et al (2016) <doi:10.1139/cjfas-2015-0422> for more information about the LBSPR assessment method.

Maintained by Adrian Hordyk. Last updated 2 years ago.

cpp

7 stars 4.90 score 114 scripts

connor-reid-tiffany

omu:A Metabolomics Analysis Tool for Intuitive Figures and Convenient Metadata Collection

Facilitates the creation of intuitive figures to describe metabolomics data by utilizing Kyoto Encyclopedia of Genes and Genomes (KEGG) hierarchy data, and gathers functional orthology and gene data from the KEGG-REST API.

Maintained by Connor Tiffany. Last updated 1 year ago.

3 stars 4.89 score 52 scripts

bioc

PanomiR:Detection of miRNAs that regulate interacting groups of pathways

PanomiR is a package to detect miRNAs that target groups of pathways from gene expression data. This package provides functionality for generating pathway activity profiles, determining differentially activated pathways between user-specified conditions, determining clusters of pathways via the PCxN package, and generating miRNAs targeting clusters of pathways. These functions can be used separately or sequentially to analyze RNA-Seq data.

Maintained by Pourya Naderi. Last updated 5 months ago.

geneexpression genesetenrichment genetarget mirna pathways

3 stars 4.89 score 13 scripts

ellessenne

KMunicate:KMunicate-Style Kaplan–Meier Plots

Produce Kaplan–Meier plots in the style recommended following the KMunicate study by Morris et al. (2019) <doi:10.1136/bmjopen-2019-030215>. The KMunicate style consists of Kaplan-Meier curves with confidence intervals to quantify uncertainty and an extended risk table (per treatment arm) depicting the number of study subjects at risk, events, and censored observations over time. The resulting plots are built using 'ggplot2' and can be further customised to a certain extent, including themes, fonts, and colour scales.

Maintained by Alessandro Gasparini. Last updated 11 months ago.

7 stars 4.89 score 11 scripts
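To make the ingredients of such a plot concrete, here is a minimal Python sketch of the Kaplan-Meier estimate together with the per-time counts that an extended risk table reports (at risk, events, censored). It is an illustration only, not the package's 'ggplot2' code; the function name and the toy data are hypothetical, and confidence intervals are omitted.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate with risk-table counts.

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if the subject was censored
    Returns one tuple per distinct time:
    (time, number at risk, events, censored, survival probability).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    table = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_censored = 0
        # Collect all subjects sharing this time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
        table.append((t, n_at_risk, n_events, n_censored, survival))
        # Subjects with an event or censoring leave the risk set.
        n_at_risk -= n_events + n_censored
    return table


km_table = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
for row in km_table:
    print(row)
```

In a per-arm display the same function would simply be applied to each treatment arm separately, with the counts accumulated over the time intervals shown on the axis.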

infinitecuriosity

NumericEnsembles:Automatically Runs 23 Individual and 17 Ensembles of Models

Automatically runs 23 individual models and 17 ensembles on numeric data. The package automatically returns complete results on all 40 models, 25 charts, and multiple tables. The user simply provides the data and answers a few questions (for example, how many times would you like to resample the data). From there the package randomly splits the data into train, test and validation sets, builds models on the training data, makes predictions on the test and validation sets, measures root mean squared error (RMSE), removes features above a user-set level of Variance Inflation Factor, and has several optional features including scaling all numeric data and four different ways to handle strings in the data. Perhaps the most significant feature is the package's ability to make predictions using the 40 pre-trained models on totally new (untrained) data if the user selects that feature. This feature alone represents a very effective solution to the issue of reproducibility of models in data science. The package can also randomly resample the data as many times as the user sets, thus giving more accurate results than a single run. The graphs provide many results that are not typically found. For example, the package automatically calculates the Kolmogorov-Smirnov test for each of the 40 models and plots a bar chart of the results, a bias bar chart of each of the 40 models, as well as several plots for exploratory data analysis (automatic histograms of the numeric data). The package also automatically creates a summary report that can be both sorted and searched for each of the 40 models, including RMSE, bias, train RMSE, test RMSE, validation RMSE, overfitting and duration. The best results on the holdout data typically beat the best results in data science competitions and published results for the same data set.

Maintained by Russ Conte. Last updated 3 days ago.

4.88 score

cran

airGRteaching:Teaching Hydrological Modelling with the GR Rainfall-Runoff Models ('Shiny' Interface Included)

Add-on package to the 'airGR' package that simplifies its use and is aimed at teaching hydrology. The package provides 1) three functions that make it simple to complete a hydrological modelling exercise, 2) plotting functions to help students explore observed data and interpret the results of calibration and simulation of the GR ('Génie rural') models, and 3) a 'Shiny' graphical interface that displays the impact of model parameters on hydrographs and the models' internal variables.

Maintained by Olivier Delaigue. Last updated 6 days ago.

6 stars 4.86 score

drammock

phonR:Tools for Phoneticians and Phonologists

Tools for phoneticians and phonologists, including functions for normalization and plotting of vowels.

Maintained by Daniel R. McCloy. Last updated 6 years ago.

phonetics plotting vowels

31 stars 4.85 score 23 scripts

mikemeredith

wiqid:Quick and Dirty Estimates for Wildlife Populations

Provides simple, fast functions for maximum likelihood and Bayesian estimates of wildlife population parameters, suitable for use with simulated data or bootstraps. Early versions were indeed quick and dirty, but optional error-checking routines and meaningful error messages have been added. Includes single and multi-season occupancy, closed capture population estimation, survival, species richness and distance measures.

Maintained by Ngumbang Juat. Last updated 2 years ago.

2 stars 4.84 score 115 scripts 1 dependent

bioc

GRmetrics:Calculate growth-rate inhibition (GR) metrics

Functions for calculating and visualizing growth-rate inhibition (GR) metrics.

Maintained by Nicholas Clark. Last updated 5 months ago.

immunooncology cellbasedassays cellbiology software timecourse visualization

1 star 4.83 score 17 scripts

bioc

evaluomeR:Evaluation of Bioinformatics Metrics

Evaluates the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.

Maintained by José Antonio Bernabé-Díaz. Last updated 5 months ago.

clustering classification featureextraction assessment clustering-evaluation evaluome evaluomer metrics

4.82 score 33 scripts

ocbe-uio

BayesSurvive:Bayesian Survival Models for High-Dimensional Data

An implementation of Bayesian survival models with graph-structured selection priors for sparse identification of omics features predictive of survival (Madjar et al., 2021 <doi:10.1186/s12859-021-04483-z>) and its extension to use a fixed graph via a Markov Random Field (MRF) prior for capturing known structure of omics features, e.g. disease-specific pathways from the Kyoto Encyclopedia of Genes and Genomes database (Hermansen et al., 2025 <doi:10.48550/arXiv.2503.13078>).

Maintained by Zhi Zhao. Last updated 13 days ago.

bayesian-cox-models bayesian-variable-selection graph-learning high-dimensional-statistics omics-data-integration survival-analysis openblas cpp openmp

4.78 score 1 script

jvadams

LW1949:An Automated Approach to Evaluating Dose-Effect Experiments Following Litchfield and Wilcoxon (1949)

The manual approach of Litchfield and Wilcoxon (1949) <http://jpet.aspetjournals.org/content/96/2/99.abstract> for evaluating dose-effect experiments is automated so that the computer can do the work.

Maintained by Jean V. Adams. Last updated 8 years ago.

3 stars 4.78 score 40 scripts

bioc

iCARE:Individualized Coherent Absolute Risk Estimation (iCARE)

An R package to build, validate and apply absolute risk models.

Maintained by Parichoy Pal Choudhury. Last updated 5 months ago.

software statisticalmethod genomewideassociation

4.78 score 9 scripts

bioc

seqPattern:Visualising oligonucleotide patterns and motif occurrences across a set of sorted sequences

Visualising oligonucleotide patterns and sequence motif occurrences across a large set of sequences centred at a common reference point and sorted by a user-defined feature.

Maintained by Vanja Haberle. Last updated 5 months ago.

visualization sequencematching

4.77 score 12 scripts 7 dependents

peterbiber

viscomplexr:Phase Portraits of Functions in the Complex Number Plane

Functionality for creating phase portraits of functions in the complex number plane. Works with R base graphics, whose full functionality is available. Parallel processing is used for optimum performance.

Maintained by Peter Biber. Last updated 4 months ago.

cpp

4 stars 4.75 score 14 scripts
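The idea behind a phase portrait is to colour each point z of the complex plane by the argument (phase) of f(z). A minimal Python sketch under stated assumptions: the function names are hypothetical, the hue mapping (phase 0 shown as red, full saturation and brightness) is one common convention and not necessarily the package's colour scheme, and the grid is computed serially rather than in parallel.

```python
import cmath
import colorsys
import math


def phase_colour(w):
    """Map the phase of a complex number to an RGB triple via the hue wheel."""
    hue = (cmath.phase(w) % (2 * math.pi)) / (2 * math.pi)
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


def phase_portrait(f, re_range, im_range, width, height):
    """Evaluate f on a width x height grid (both > 1) and colour by phase.

    Returns a list of rows, top row first, each a list of RGB triples.
    """
    re_min, re_max = re_range
    im_min, im_max = im_range
    rows = []
    for j in range(height):
        y = im_max - (im_max - im_min) * j / (height - 1)
        row = []
        for i in range(width):
            x = re_min + (re_max - re_min) * i / (width - 1)
            row.append(phase_colour(f(complex(x, y))))
        rows.append(row)
    return rows


# Portrait of f(z) = z^2 on the square [-1, 1] x [-1, 1].
image = phase_portrait(lambda z: z * z, (-1, 1), (-1, 1), width=5, height=5)
print(image[0][0])
```

The resulting grid of RGB triples could then be handed to any raster plotting routine; zeros and poles of f show up as points where all colours meet.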

leondap

recluster:Ordination Methods for the Analysis of Beta-Diversity Indices

The analysis of different aspects of biodiversity requires specific algorithms. For example, in regionalisation analyses, the high frequency of ties and zero values in dissimilarity matrices produced by Beta-diversity turnover produces hierarchical cluster dendrograms whose topology and bootstrap supports are affected by the order of rows in the original matrix. Moreover, visualisation of biogeographical regionalisation can be facilitated by a combination of hierarchical clustering and multi-dimensional scaling. The recluster package provides robust techniques to visualise and analyse patterns of biodiversity and to improve occurrence data for cryptic taxa.

Maintained by Leonardo Dapporto. Last updated 5 months ago.

4 stars 4.69 score 41 scripts

khliland

MatrixCorrelation:Matrix Correlation Coefficients

Computation and visualization of matrix correlation coefficients. The main method is the Similarity of Matrices Index, while various related measures like r1, r2, r3, r4, Yanai's GCD, RV, RV2, adjusted RV, Rozeboom's linear correlation and Coxhead's coefficient are included for comparison and flexibility.

Maintained by Kristian Hovde Liland. Last updated 2 years ago.

openblas cpp

2 stars 4.66 score 38 scripts 2 dependents

smn74

MANOVA.RM:Resampling-Based Analysis of Multivariate Data and Repeated Measures Designs

Implemented are various tests for semi-parametric repeated measures and general MANOVA designs that assume neither multivariate normality nor covariance homogeneity, i.e., the procedures are applicable for a wide range of general multivariate factorial designs. In addition to asymptotic inference methods, novel bootstrap and permutation approaches are implemented as well. These provide more accurate results in case of small to moderate sample sizes. Furthermore, post-hoc comparisons are provided for the multivariate analyses. Friedrich, S., Konietschke, F. and Pauly, M. (2019) <doi:10.32614/RJ-2019-051>.

Maintained by Sarah Friedrich. Last updated 2 months ago.

multivariate-data permutation repeated-measures resampling

11 stars 4.63 score 39 scripts

bioc

FGNet:Functional Gene Networks derived from biological enrichment analyses

Build and visualize functional gene and term networks from clustering of enrichment analyses in multiple annotation spaces. The package includes a graphical user interface (GUI) and functions to perform the functional enrichment analysis through DAVID, GeneTerm Linker, gage (GSEA) and topGO.

Maintained by Sara Aibar. Last updated 5 months ago.

annotation go pathways genesetenrichment network visualization functionalgenomics networkenrichment clustering

4.62 score 5 scripts 1 dependent

bioc

IntramiRExploreR:Predicting Targets for Drosophila Intragenic miRNAs

Intra-miR-ExploreR, an integrative miRNA target prediction bioinformatics tool, identifies targets combining expression and biophysical interactions of a given microRNA (miR). Using the tool, we have identified targets for 92 intragenic miRs in D. melanogaster, using available microarray expression data from the Affymetrix 1 and Affymetrix 2 microarray platforms, providing a global perspective of intragenic miR targets in Drosophila. Predicted targets are grouped according to biological functions using the DAVID Gene Ontology tool and are ranked based on a biologically relevant scoring system, enabling the user to identify functionally relevant targets for a given miR.

Maintained by Surajit Bhattacharya. Last updated 5 months ago.

software microarray genetarget statisticalmethod geneexpression geneprediction

4.60 score 4 scripts

bioc

ddPCRclust:Clustering algorithm for ddPCR data

The ddPCRclust algorithm can automatically quantify the CPDs of non-orthogonal ddPCR reactions with up to four targets. In order to determine the correct droplet count for each target, it is crucial to both identify all clusters and label them correctly based on their position. For more information on what data can be analyzed and how a template needs to be formatted, please check the vignette.

Maintained by Benedikt G. Brink. Last updated 5 months ago.

ddpcr clustering biological-data-analysis

4 stars 4.60 score 4 scripts

bioc

dce:Pathway Enrichment Based on Differential Causal Effects

Compute differential causal effects (dce) on (biological) networks. Given observational samples from a control experiment and non-control (e.g., cancer) for two genes A and B, we can compute differential causal effects with a (generalized) linear regression. If the causal effect of gene A on gene B in the control samples is different from the causal effect in the non-control samples the dce will differ from zero. We regularize the dce computation by the inclusion of prior network information from pathway databases such as KEGG.

Maintained by Kim Philipp Jablonski. Last updated 4 months ago.

software statisticalmethod graphandnetwork regression geneexpression differentialexpression networkenrichment network kegg bioconductor causality

13 stars 4.59 score 4 scripts
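The two-gene case in the description can be sketched in a few lines of Python. This is a simplified illustration, not the package's implementation: the function names are hypothetical, a plain least-squares slope of B on A stands in for the (generalized) linear regression, no other genes are adjusted for, no pathway prior is used for regularization, and the sign convention (non-control minus control) is an assumption.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def differential_causal_effect(a_ctrl, b_ctrl, a_case, b_case):
    """Effect of gene A on gene B in non-control minus that in control.

    Assumption: the effect of A on B is summarized by a single
    regression slope within each condition.
    """
    return ols_slope(a_case, b_case) - ols_slope(a_ctrl, b_ctrl)


# Toy expression values: A drives B strongly in control, weakly in cancer.
a_ctrl = [0.0, 1.0, 2.0, 3.0]
b_ctrl = [2 * a + 1 for a in a_ctrl]   # slope 2 in control samples
a_case = [0.0, 1.0, 2.0, 3.0]
b_case = [0.5 * a for a in a_case]     # slope 0.5 in non-control samples

dce_ab = differential_causal_effect(a_ctrl, b_ctrl, a_case, b_case)
print(dce_ab)
```

A value of zero would indicate that the A-to-B effect is the same in both conditions; here the slopes differ, so the toy dce is nonzero.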

hanjunwei-lab

ICDS:Identification of Cancer Dysfunctional Subpathway with Omics Data

Identify Cancer Dysfunctional Sub-pathway by integrating gene expression, DNA methylation and copy number variation, and pathway topological information. 1) First, we calculate the gene risk scores by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2) Second, we perform a greedy search algorithm to identify the key dysfunctional sub-pathways within the pathways for which the discriminative scores are locally maximal. 3) Finally, a permutation test is used to calculate the statistical significance level for these key dysfunctional sub-pathways.

Maintained by Junwei Han. Last updated 8 months ago.

4.54 score 3 scripts

damianobaldan

RAC:R Package for Aqua Culture

Solves the individual bioenergetic balance for different aquaculture sea fish (Sea Bream and Sea Bass; Brigolin et al., 2014 <doi:10.3354/aei00093>) and shellfish (Mussel and Clam; Brigolin et al., 2009 <doi:10.1016/j.ecss.2009.01.029>; Solidoro et al., 2000 <doi:10.3354/meps199137>). Allows for spatialized model runs and population simulations.

Maintained by Baldan D.. Last updated 2 years ago.

4.54 score

rkbauer

oceanmap:A Plotting Toolbox for 2D Oceanographic Data

Plotting toolbox for 2D oceanographic data (satellite data, sea surface temperature, chlorophyll, ocean fronts & bathymetry). Recognized classes and formats include netcdf, Raster, '.nc' and '.gz' files.

Maintained by Robert K. Bauer. Last updated 1 year ago.

bathymetry chla ggplot mapping-tools ncdf oceanographic-data remote-sensing satellite-im spatial-data sst

4 stars 4.54 score 58 scripts 1 dependent

paulocortez4

rminer:Data Mining Classification and Regression Methods

Facilitates the use of data mining algorithms in classification and regression (including time series forecasting) tasks by presenting a short and coherent set of functions. Versions: 1.4.8 improved help, several warning and error code fixes (more stable version, all examples run correctly); 1.4.7 improved Importance function and examples, minor error fixes; 1.4.6 / 1.4.5 / 1.4.4 new automated machine learning (AutoML) and ensembles, via improved fit(), mining() and mparheuristic() functions, and new categorical preprocessing, via improved delevels() function; 1.4.3 new metrics (e.g., macro precision, explained variance), new "lssvm" model and improved mparheuristic() function; 1.4.2 new "NMAE" metric, "xgboost" and "cv.glmnet" models (16 classification and 18 regression models); 1.4.1 new tutorial and more robust version; 1.4 - new classification and regression models, with a total of 14 classification and 15 regression methods, including: Decision Trees, Neural Networks, Support Vector Machines, Random Forests, Bagging and Boosting; 1.3 and 1.3.1 - new classification and regression metrics; 1.2 - new input importance methods via improved Importance() function; 1.0 - first version.

Maintained by Paulo Cortez. Last updated 5 months ago.

3 stars 4.52 score 536 scripts

bioc

TDbasedUFEadv:Advanced package of tensor decomposition based unsupervised feature extraction

This is an advanced version of TDbasedUFE, which is a comprehensive package to perform tensor decomposition based unsupervised feature extraction. In contrast to TDbasedUFE, which performs simple feature selection and multiomics analyses, this package provides more complicated and advanced features that are less commonly required. Only users who require more specific features can make use of its functionality.

Maintained by Y-h. Taguchi. Last updated 5 months ago.

geneexpression featureextraction methylationarray singlecell software bioconductor-package bioinformatics tensor-decomposition

4.48 score 4 scripts

ocbe-uio

psbcSpeedUp:Penalized Semiparametric Bayesian Cox Models

Algorithms to speed up the Bayesian Lasso Cox model (Lee et al., Int J Biostat, 2011 <doi:10.2202/1557-4679.1301>) and the Bayesian Lasso Cox with mandatory variables (Zucknick et al. Biometrical J, 2015 <doi:10.1002/bimj.201400160>).

Maintained by Zhi Zhao. Last updated 9 months ago.

bayesian-cox-models omics-data survival-analysis openblas cpp openmp

3 stars 4.48 score

syoung9836

knfi:Analysis of Korean National Forest Inventory Database

Understanding the current status of forest resources is essential for monitoring changes in forest ecosystems and generating related statistics. In South Korea, the National Forest Inventory (NFI) surveys over 4,500 sample plots nationwide every five years and records 70 items, including forest stand, forest resource, and forest vegetation surveys. Many researchers use NFI as the primary data for research, such as biomass estimation or analyzing the importance value of each species over time and space, depending on the research purpose. However, the large volume of accumulated forest survey data from across the country can make it challenging to manage and utilize such a vast dataset. To address this issue, we developed an R package that efficiently handles large-scale NFI data across time and space. The package offers a comprehensive workflow for NFI data analysis. It starts with data processing, where the read_nfi() function reconstructs NFI data according to the researcher's needs while performing basic integrity checks for data quality. Following this, the package provides analytical tools that operate on the verified data. These include functions like summary_nfi() for summary statistics, diversity_nfi() for biodiversity analysis, iv_nfi() for calculating species importance value, and biomass_nfi() and cwd_biomass_nfi() for biomass estimation. Finally, for visualization, the tsvis_nfi() function generates graphs and maps, allowing users to visualize forest ecosystem changes across various spatial and temporal scales. This integrated approach and its specialized functions can enhance the efficiency of processing and analyzing NFI data, providing researchers with insights into forest ecosystems. The NFI Excel files (.xlsx) are not included in the R package and must be downloaded separately.
Users can access these NFI Excel files by visiting the Korea Forest Service Forestry Statistics Platform <https://kfss.forest.go.kr/stat/ptl/article/articleList.do?curMenu=11694&bbsId=microdataboard> to download the annual NFI Excel files, which are bundled in .zip archives. Please note that this website is only available in Korean, and direct download links can be found in the notes section of the read_nfi() function.

Maintained by Sinyoung Park. Last updated 4 months ago.

data-analysis-r forestry

1 star 4.48 score 2 scripts

bioc

PPInfer:Inferring functionally related proteins using protein interaction networks

Interactions between proteins occur in many, if not most, biological processes. Most proteins perform their functions in networks associated with other proteins and other biomolecules. This fact has motivated the development of a variety of experimental methods for the identification of protein interactions. This variety has in turn ushered in the development of numerous different computational approaches for modeling and predicting protein interactions. Sometimes an experiment is aimed at identifying proteins closely related to some interesting proteins. A network based statistical learning method is used to infer the putative functions of proteins from the known functions of their neighboring proteins on a PPI network. This package identifies such proteins often involved in the same or similar biological functions.

Maintained by Dongmin Jung. Last updated 5 months ago.

software statisticalmethod network graphandnetwork genesetenrichment networkenrichment pathways

4.48 score 4 scripts 1 dependent

r-forge

stops:Structure Optimized Proximity Scaling

Methods that use flexible variants of multidimensional scaling (MDS) which incorporate parametric nonlinear distance transformations and trade-off the goodness-of-fit with structure considerations to find optimal hyperparameters, also known as structure optimized proximity scaling (STOPS) (Rusch, Mair & Hornik, 2023, <doi:10.1007/s11222-022-10197-w>). The package contains various functions, wrappers, methods and classes for fitting, plotting and displaying different 1-way MDS models with ratio, interval, ordinal optimal scaling in a STOPS framework. These cover essentially the functionality of the package smacofx, including Torgerson (classical) scaling with power transformations of dissimilarities, SMACOF MDS with powers of dissimilarities, Sammon mapping with powers of dissimilarities, elastic scaling with powers of dissimilarities, spherical SMACOF with powers of dissimilarities, (ALSCAL) s-stress MDS with powers of dissimilarities, r-stress MDS, MDS with powers of dissimilarities and configuration distances, elastic scaling powers of dissimilarities and configuration distances, Sammon mapping powers of dissimilarities and configuration distances, power stress MDS (POST-MDS), approximate power stress, Box-Cox MDS, local MDS, Isomap, curvilinear component analysis (CLCA), curvilinear distance analysis (CLDA) and sparsified (power) multidimensional scaling and (power) multidimensional distance analysis (experimental models from smacofx influenced by CLCA). All of these models can also be fit by optimizing over hyperparameters based on goodness-of-fit only (i.e., no structure considerations). The package further contains functions for optimization, specifically the adaptive Luus-Jaakola algorithm and a wrapper for Bayesian optimization with treed Gaussian process with jumps to linear models, and functions for various c-structuredness indices.

Maintained by Thomas Rusch. Last updated 3 months ago.

openjdk

1 star 4.48 score 23 scripts

r-forge

cops:Cluster Optimized Proximity Scaling

Multidimensional scaling (MDS) methods that aim at pronouncing the clustered appearance of the configuration (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>). They achieve this by transforming proximities/distances with explicit power functions and penalizing the fitting criterion with a clusteredness index, the OPTICS Cordillera (Rusch, Hornik & Mair, 2018, <doi:10.1080/10618600.2017.1349664>). There are two variants: One for finding the configuration directly (COPS-C) for any Minkowski distance with given explicit power transformations and implicit ratio, interval and nonmetric optimal scaling transformations (Borg & Groenen, 2005, ISBN:978-0-387-28981-6), and one for using the augmented fitting criterion to find optimal hyperparameters for the explicit transformations (P-COPS). The package contains various functions, wrappers, methods and classes for fitting, plotting and displaying a large number of different MDS models (most of the functionality in smacofx) in the COPS framework. The package further contains a function for pattern search optimization, the "Adaptive Luus-Jaakola Algorithm" (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>) and a function to calculate the phi-distances for count data or histograms.

Maintained by Thomas Rusch. Last updated 3 months ago.

1 star 4.48 score 23 scripts

joemsong

OptCirClust:Circular, Periodic, or Framed Data Clustering: Fast, Optimal, and Reproducible

Fast, optimal, and reproducible clustering algorithms for circular, periodic, or framed data. The algorithms introduced here are based on a core algorithm for optimal framed clustering the authors have developed (Debnath & Song 2021) <doi:10.1109/TCBB.2021.3077573>. The runtime of these algorithms is O(K N log^2 N), where K is the number of clusters and N is the number of circular data points. On a desktop computer using a single processor core, millions of data points can be grouped into a few clusters within seconds. One can apply the algorithms to characterize events along circular DNA molecules, circular RNA molecules, and circular genomes of bacteria, chloroplast, and mitochondria. One can also cluster climate data along any given longitude or latitude. Periodic data clustering can be formulated as circular clustering. The algorithms offer a general high-performance solution to circular, periodic, or framed data clustering.

Maintained by Joe Song. Last updated 4 years ago.

cpp

4.42 score 22 scripts 2 dependents

bioc

PRONE:The PROteomics Normalization Evaluator

High-throughput omics data are often affected by systematic biases introduced throughout all the steps of a clinical study, from sample collection to quantification. Normalization methods aim to adjust for these biases to make the actual biological signal more prominent. However, selecting an appropriate normalization method is challenging due to the wide range of available approaches. Therefore, a comparative evaluation of unnormalized and normalized data is essential in identifying an appropriate normalization strategy for a specific data set. This R package provides different functions for preprocessing, normalizing, and evaluating different normalization approaches. Furthermore, normalization methods can be evaluated on downstream steps, such as differential expression analysis and statistical enrichment analysis. Spike-in data sets with known ground truth and real-world data sets of biological experiments acquired by either tandem mass tag (TMT) or label-free quantification (LFQ) can be analyzed.

Maintained by Lis Arend. Last updated 12 days ago.

proteomics preprocessing normalization differentialexpression visualization data-analysis evaluation

2 stars 4.41 score 9 scripts

anttonalberdi

hilldiv:Integral Analysis of Diversity Based on Hill Numbers

Tools for analysing, comparing, visualising and partitioning diversity based on Hill numbers. 'hilldiv' is an R package that provides a set of functions to assist analysis of diversity for diet reconstruction, microbial community profiling or more general ecosystem characterisation analyses based on Hill numbers, using OTU/ASV tables and associated phylogenetic trees as inputs. The package includes functions for (phylo)diversity measurement, (phylo)diversity profile plotting, (phylo)diversity comparison between samples and groups, (phylo)diversity partitioning and (dis)similarity measurement. All of these are grounded in abundance-based and incidence-based Hill numbers. The statistical framework developed around Hill numbers encompasses many of the most broadly employed diversity (e.g. richness, Shannon index, Simpson index), phylogenetic diversity (e.g. Faith's PD, Allen's H, Rao's quadratic entropy) and dissimilarity (e.g. Sorensen index, Unifrac distances) metrics. This enables the most common analyses of diversity to be performed while grounded in a single statistical framework. The methods are described in Jost et al. (2007) <DOI:10.1890/06-1736.1>, Chao et al. (2010) <DOI:10.1098/rstb.2010.0272> and Chiu et al. (2014) <DOI:10.1890/12-0960.1>; and reviewed in the framework of molecularly characterised biological systems in Alberdi & Gilbert (2019) <DOI:10.1111/1755-0998.13014>.
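
As a rough illustration of what a Hill number is, the sketch below computes the standard abundance-based Hill number of order q, (sum of p_i^q)^(1/(1-q)), with the q = 1 case taken as the exponential of Shannon entropy. It is written in plain Python and is not the 'hilldiv' API; the function name `hill_number` is hypothetical and chosen only for this example.

```python
import math

def hill_number(abundances, q):
    """Abundance-based Hill number of order q (hypothetical helper, not from hilldiv)."""
    total = sum(abundances)
    # Relative abundances; zero counts contribute nothing and are dropped.
    p = [a / total for a in abundances if a > 0]
    if math.isclose(q, 1.0):
        # The formula is undefined at q = 1; its limit is exp(Shannon entropy).
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

even = [10, 10, 10, 10]    # four equally abundant taxa
uneven = [90, 10]          # two taxa, one dominant
for q in (0, 1, 2):
    print(q, hill_number(even, q), hill_number(uneven, q))
```

For a perfectly even community every order returns the number of taxa, while for an uneven one the value falls as q grows, since higher orders give more weight to abundant taxa.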

Maintained by Antton Alberdi. Last updated 4 years ago.

11 stars 4.35 score 41 scripts

foucher-y

RISCA:Causal Inference and Prediction in Cohort-Based Analyses

Numerous functions for cohort-based analyses, either for prediction or causal inference. For causal inference, it includes Inverse Probability Weighting and G-computation for marginal estimation of an exposure effect when confounders are expected. We deal with binary outcomes, times-to-events, competing events, and multi-state data. For multi-state data, semi-Markov models with interval censoring may be considered, and we propose the possibility to consider the excess of mortality related to the disease compared to reference lifetime tables. For predictive studies, we propose a set of functions to estimate time-dependent receiver operating characteristic (ROC) curves with the possible consideration of right-censored times-to-events or the presence of confounders. Finally, several functions are available to assess time-dependent ROC curves or survival curves from aggregated data.

Maintained by Yohann Foucher. Last updated 1 month ago.

1 star 4.33 score 47 scripts

bioc

cytofQC:Labels normalized cells for CyTOF data and assigns probabilities for each label

cytofQC is a package for initial cleaning of CyTOF data. It uses a semi-supervised approach for labeling cells with their most likely data type (bead, doublet, debris, dead) and the probability that they belong to each label type. This package does not remove data from the dataset, but provides labels and information to aid the data user in cleaning their data. Our algorithm is able to distinguish between doublets and large cells.

Maintained by Jill Lundell. Last updated 5 months ago.

software singlecell annotation

2 stars 4.30 score 3 scripts

bioc

XINA:Multiplexes Isobaric Mass Tagged-based Kinetics Data for Network Analysis

The aim of XINA is to determine which proteins exhibit similar patterns within and across experimental conditions, since proteins with co-abundance patterns may have common molecular functions. XINA imports multiple datasets, tags the datasets in silico, and combines the data for subsequent subgrouping into multiple clusters. The result is a single output depicting the variation across all conditions. XINA not only extracts co-abundance profiles within and across experiments, but also incorporates protein-protein interaction databases and integrative resources such as KEGG to infer interactors and molecular functions, respectively, and produces intuitive graphical outputs.

Maintained by Lang Ho Lee. Last updated 5 months ago.

systemsbiology proteomics rnaseq network

4.30 score 3 scripts

bioc

spillR:Spillover Compensation in Mass Cytometry Data

Channel interference in mass cytometry can cause spillover and may result in miscounting of protein markers. We develop a nonparametric finite mixture model and use the mixture components to estimate the probability of spillover. We implement our method using expectation-maximization to fit the mixture model.

Maintained by Marco Guazzini. Last updated 5 months ago.

flowcytometry immunooncology massspectrometry preprocessing singlecell software statisticalmethod visualization regression

4.30 score 3 scripts

jiangyouxiang

TestAnaAPP:A 'shiny' App for Test Analysis and Visualization

This application provides exploratory and confirmatory factor analysis, classical test theory, unidimensional and multidimensional item response theory, and continuous item response model analysis, through the 'shiny' interactive interface. In addition, it offers rich functionalities for visualizing and downloading results. Users can download figures, tables, and analysis reports via the interactive interface.

Maintained by Youxiang Jiang. Last updated 4 months ago.

4 stars 4.30 score 2 scripts

bioc

MEDME:Modelling Experimental Data from MeDIP Enrichment

MEDME allows the prediction of absolute and relative methylation levels based on measures obtained by MeDIP-microarray experiments.

Maintained by Mattia Pelizzola. Last updated 5 months ago.

microarray cpgisland dnamethylation

4.30 score 2 scripts

bioc

rqt:rqt: utilities for gene-level meta-analysis

Despite recent advances in modern GWAS methods, calculating an effect size and corresponding p-value for a whole gene rather than for a single variant remains an important problem. The R package rqt offers gene-level GWAS meta-analysis. For more information, see: "Gene-set association tests for next-generation sequencing data" by Lee et al (2016), Bioinformatics, 32(17), i611-i619, <doi:10.1093/bioinformatics/btw429>.

Maintained by Ilya Zhbannikov. Last updated 5 months ago.

genomewideassociation regression survival principalcomponent statisticalmethod sequencing

2 stars 4.30 score 4 scripts

bioc

SpatialOmicsOverlay:Spatial Overlay for Omic Data from Nanostring GeoMx Data

Tools for NanoString Technologies GeoMx Technology. Package to easily graph on top of an OME-TIFF image. Plotting annotations can range from tissue segment to gene expression.

Maintained by Maddy Griswold. Last updated 5 months ago.

geneexpression transcription cellbasedassays dataimport transcriptomics proteomics proprietaryplatforms rnaseq spatial datarepresentation visualization openjdk

4.30 score 8 scripts

alvesks

ec50estimator:An Automated Way to Estimate EC50 for Stratified Datasets

An implementation for estimating the effective concentration for 50% growth inhibition (EC50) for multi-isolate and stratified datasets. It wraps functions from the drc package so that the output is a tidy data.frame. Info about the drc package is available in Ritz C, Baty F, Streibig JC, Gerhard D (2015) <doi:10.1371/journal.pone.0146021>.

Maintained by Kaique dos S. Alves. Last updated 3 years ago.

4.29 score 39 scripts

quirinms

mrf:Multiresolution Forecasting

Forecasting of univariate time series using feature extraction with variable prediction methods is provided. Feature extraction is done with a redundant Haar wavelet transform with filter h = (0.5, 0.5). The advantage of the approach compared to typical Fourier based methods is a dynamic adaptation to varying seasonalities. Currently implemented prediction methods based on the selected wavelet levels and scales are a regression and a multi-layer perceptron. Forecasts can be computed for horizon 1 or higher. Model selection is performed with an evolutionary optimization. Selection criteria are currently the AIC criterion, the Mean Absolute Error or the Mean Root Error. The data is split into three parts for model selection: training, test, and evaluation dataset. The training data is for computing the weights of a parameter set. The test data is for choosing the best parameter set. The evaluation data is for assessing the forecast performance of the best parameter set on new data unknown to the model. This work is published in Stier, Q.; Gehlert, T.; Thrun, M.C. Multiresolution Forecasting for Industrial Applications. Processes 2021, 9, 1697. <https://doi.org/10.3390/pr9101697>.
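
To make the feature-extraction step more concrete, here is a minimal sketch of a causal redundant ("à trous") Haar decomposition with filter h = (0.5, 0.5), written in plain Python. It is not the 'mrf' API: the function name `redundant_haar` is hypothetical, and the boundary rule (reusing the first observation when the lagged index would fall before the start of the series) is an assumption made only for this example.

```python
def redundant_haar(series, levels):
    """Causal redundant Haar decomposition (illustrative sketch, not the mrf implementation).

    Returns (details, smooth): one list of wavelet coefficients per level
    plus the final smooth approximation, all the same length as the input.
    """
    smooth = list(series)
    details = []
    for j in range(levels):
        lag = 2 ** j  # the filter taps move 2**j samples apart at level j
        # Filter h = (0.5, 0.5): average the current value with the lagged one.
        # Assumption: indices before the start are clamped to the first sample.
        nxt = [0.5 * (smooth[t] + smooth[max(t - lag, 0)]) for t in range(len(smooth))]
        # Wavelet coefficients are what the smoothing step removed.
        details.append([s - n for s, n in zip(smooth, nxt)])
        smooth = nxt
    return details, smooth

x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
details, smooth = redundant_haar(x, levels=2)
# Adding the final smooth to all detail levels recovers the original series.
recon = [smooth[t] + sum(d[t] for d in details) for t in range(len(x))]
print(recon)
```

Because no downsampling takes place, every level keeps one coefficient per time point, so the coefficients at the most recent time step can be used directly as predictors for a regression or neural network.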

Maintained by Quirin Stier. Last updated 4 years ago.

2 stars 4.28 score 19 scripts

mauricio1986

gmnl:Multinomial Logit Models with Random Parameters

An implementation of the maximum simulated likelihood method for the estimation of multinomial logit models with random coefficients as presented by Sarrias and Daziano (2017) <doi:10.18637/jss.v079.i02>. Specifically, it allows estimating models with continuous heterogeneity such as the mixed multinomial logit and the generalized multinomial logit. It also allows estimating models with discrete heterogeneity such as the latent class and the mixed-mixed multinomial logit model.

Maintained by Mauricio Sarrias. Last updated 3 years ago.

4 stars 4.27 score 51 scripts

cran

relsurv:Relative Survival

Contains functions for analysing relative survival data, including nonparametric estimators of net (marginal relative) survival, relative survival ratio, crude mortality, methods for fitting and checking additive and multiplicative regression models, transformation approach, methods for dealing with population mortality tables. Work has been described in Pohar Perme, Pavlic (2018) <doi:10.18637/jss.v087.i08>.

Maintained by Damjan Manevski. Last updated 2 months ago.

openblas cpp

3 stars 4.25 score 4 dependents

bioc

HERON:Hierarchical Epitope pROtein biNding

HERON is a software package for analyzing peptide binding array data. In addition to identifying significant binding probes, HERON also provides functions for finding epitopes (strings of consecutive peptides within a protein). HERON also calculates significance on the probe, epitope, and protein level by employing meta p-value methods. HERON is designed for obtaining calls on the sample level and calculates fractions of hits for different conditions.

Maintained by Sean McIlwain. Last updated 5 months ago.

microarray software

1 star 4.18 score 6 scripts

bioc

NADfinder:Call wide peaks for sequencing data

The nucleolus is an important structure inside the nucleus of eukaryotic cells. It is the site for transcribing rDNA into rRNA and for assembling ribosomes, also known as ribosome biogenesis. In addition, nucleoli are dynamic hubs through which numerous proteins shuttle and contact specific non-rDNA genomic loci. Deep sequencing analyses of DNA associated with isolated nucleoli (NAD-seq) have shown that specific loci, termed nucleolus-associated domains (NADs), form frequent three-dimensional associations with nucleoli. NAD-seq has been used to study the biological functions of NADs and the dynamics of NAD distribution during embryonic stem cell (ESC) differentiation. Here, we developed a Bioconductor package NADfinder for bioinformatic analysis of the NAD-seq data, including baseline correction, smoothing, normalization, peak calling, and annotation.

Maintained by Jianhong Ou. Last updated 3 months ago.

sequencing dnaseq generegulation peakdetection

4.18 score 1 script

moondog1969

streamDAG:Analytical Methods for Stream DAGs

Provides indices and tools for directed acyclic graphs (DAGs), particularly DAG representations of intermittent streams. A detailed introduction to the package can be found in the publication: "Non-perennial stream networks as directed acyclic graphs: The R-package streamDAG" (Aho et al., 2023) <doi:10.1016/j.envsoft.2023.105775>, and in the introductory package vignette.

Maintained by Ken Aho. Last updated 6 months ago.

1 star 4.18 score 4 scripts

vmoprojs

GeoModels:Procedures for Gaussian and Non Gaussian Geostatistical (Large) Data Analysis

Functions for Gaussian and non-Gaussian (bivariate) spatial and spatio-temporal data analysis are provided for a) (fast) simulation of random fields, b) inference for random fields using standard likelihood and a likelihood approximation method called weighted composite likelihood based on pairs and c) prediction using (local) best linear unbiased prediction. Weighted composite likelihood can be very efficient for estimation on massive datasets. Both regression and spatial (temporal) dependence analysis can be jointly performed. Flexible covariance models for spatial and spatio-temporal data on Euclidean domains and spheres are provided. There are also many useful functions for plotting and performing diagnostic analysis. Different non-Gaussian random fields can be considered in the analysis, among them random fields with marginal distributions such as Skew-Gaussian, Student-t, Tukey-h, Sin-Arcsin, Two-piece, Weibull, Gamma, Log-Gaussian, Binomial, Negative Binomial and Poisson. See the URL for the papers associated with this package, as for instance, Bevilacqua and Gaetan (2015) <doi:10.1007/s11222-014-9460-6>, Bevilacqua et al. (2016) <doi:10.1007/s13253-016-0256-3>, Vallejos et al. (2020) <doi:10.1007/978-3-030-56681-4>, Bevilacqua et al. (2020) <doi:10.1002/env.2632>, Bevilacqua et al. (2021) <doi:10.1111/sjos.12447>, Bevilacqua et al. (2022) <doi:10.1016/j.jmva.2022.104949>, Morales-Navarrete et al. (2023) <doi:10.1080/01621459.2022.2140053>, and a large class of examples and tutorials.

Maintained by Moreno Bevilacqua. Last updated 2 months ago.

fortran openblas glibc

3 stars 4.17 score 83 scripts

myaseen208

StroupGLMM:R Codes and Datasets for Generalized Linear Mixed Models: Modern Concepts, Methods and Applications by Walter W. Stroup

R Codes and Datasets for Stroup, W. W. (2012). Generalized Linear Mixed Models: Modern Concepts, Methods and Applications, CRC Press.

Maintained by Muhammad Yaseen. Last updated 6 months ago.

glm glmm lm lmm

13 stars 4.11 score 2 scripts

gaurbans

ecm:Build Error Correction Models

Functions for easy building of error correction models (ECM) for time series regression.

Maintained by Gaurav Bansal. Last updated 1 year ago.

3 stars 4.11 score 62 scripts

onofriandreapg

drcte:Statistical Approaches for Time-to-Event Data in Agriculture

A specific and comprehensive framework for the analyses of time-to-event data in agriculture. Fit non-parametric and parametric time-to-event models. Compare time-to-event curves for different experimental groups. Plots and other displays. It is particularly tailored to the analyses of data from germination and emergence assays. The methods are described in Onofri et al. (2022) "A unified framework for the analysis of germination, emergence, and other time-to-event data in weed science", Weed Science, 70, 259-271 <doi:10.1017/wsc.2022.8>.

Maintained by Andrea Onofri. Last updated 9 days ago.

non-linear-regression seed-germination time-to-event

4.07 score 39 scripts 2 dependents

pbourkey

polymapR:Linkage Analysis in Outcrossing Polyploids

Creation of linkage maps in polyploid species from marker dosage scores of an F1 cross from two heterozygous parents. Currently works for outcrossing diploid, autotriploid, autotetraploid and autohexaploid species, as well as segmental allotetraploids. Methods are described in a manuscript of Bourke et al. (2018) <doi:10.1093/bioinformatics/bty371>. Since version 1.1.0, both discrete and probabilistic genotypes are acceptable input; for more details on the latter see Liao et al. (2021) <doi:10.1007/s00122-021-03834-x>.

Maintained by Peter Bourke. Last updated 10 months ago.

1 star 4.03 score 54 scripts

bioc

profileplyr:Visualization and annotation of read signal over genomic ranges with profileplyr

Quick and straightforward visualization of read signal over genomic intervals is key for generating hypotheses from sequencing data sets (e.g. ChIP-seq, ATAC-seq, bisulfite/methyl-seq). Many tools both inside and outside of R and Bioconductor are available to explore these types of data, and they typically start with a bigWig or BAM file and end with some representation of the signal (e.g. heatmap). profileplyr leverages many Bioconductor tools to allow for both flexibility and additional functionality in workflows that end with visualization of the read signal.

Maintained by Tom Carroll. Last updated 5 months ago.

chipseq dataimport sequencing chiponchip coverage

4.03 score 54 scripts

bioc

GMRP:GWAS-based Mendelian Randomization and Path Analyses

Perform Mendelian randomization analysis of multiple SNPs to determine risk factors causing the disease under study and to exclude confounding variables, and perform path analysis to construct the path of risk factors to the disease.

Maintained by Yuan-De Tan. Last updated 5 months ago.

sequencing regression snp

4.00 score 3 scripts

bioc

fCCAC:functional Canonical Correlation Analysis to evaluate Covariance between nucleic acid sequencing datasets

Computational evaluation of variability across DNA or RNA sequencing datasets is a crucial step in genomics, as it allows both to evaluate reproducibility of replicates, and to compare different datasets to identify potential correlations. fCCAC applies functional Canonical Correlation Analysis to allow the assessment of: (i) reproducibility of biological or technical replicates, analyzing their shared covariance in higher order components; and (ii) the associations between different datasets. fCCAC represents a more sophisticated approach that complements Pearson correlation of genomic coverage.

Maintained by Pedro Madrigal. Last updated 5 months ago.

epigenetics transcription sequencing coverage chipseq functionalgenomics rnaseq atacseq mnaseseq

4.00 score 1 script

bioimaginggroup

cmR:Analysis of Cardiac Magnetic Resonance Images

Computes maximum response from Cardiac Magnetic Resonance Images using a spatial and voxel-wise spline-based Bayesian model. This is an implementation of the methods described in Schmid (2011) <doi:10.1109/TMI.2011.2109733> "Voxel-Based Adaptive Spatio-Temporal Modelling of Perfusion Cardiovascular MRI". IEEE TMI 30(7) p. 1305 - 1313.

Maintained by Volker Schmid. Last updated 2 years ago.

2 stars 4.00 score 9 scripts

bioc

SCANVIS:SCANVIS - a tool for SCoring, ANnotating and VISualizing splice junctions

SCANVIS is a set of annotation-dependent tools for analyzing splice junctions and their read support as predetermined by an alignment tool of choice (for example, STAR aligner). SCANVIS assesses each junction's relative read support (RRS) by relating to the context of local split reads aligning to annotated transcripts. SCANVIS also annotates each splice junction by indicating whether the junction is supported by annotation or not, and if not, what type of junction it is (e.g. exon skipping, alternative 5' or 3' events, Novel Exons). Unannotated junctions are also further annotated by indicating whether they induce a frame shift or not. SCANVIS includes a visualization function to generate static sashimi-style plots depicting relative read support and number of split reads using arc thickness and arc heights, making it easy for users to spot well-supported junctions. These plots also clearly delineate unannotated junctions from annotated ones using designated color schemes, and users can also highlight splice junctions of choice. Variants and/or a read profile are also incorporated into the plot if the user supplies variants in bed format and/or the BAM file. One further feature of the visualization function is that users can submit multiple samples of a certain disease or cohort to generate a single plot - this occurs via a "merge" function wherein junction details over multiple samples are merged to generate a single sashimi plot, which is useful when contrasting cohorts (e.g. disease vs control).

Maintained by Phaedra Agius. Last updated 5 months ago.

software researchfield transcriptomics workflowstep annotation visualization

4.00 score 2 scripts

bioc

gsean:Gene Set Enrichment Analysis with Networks

Biological molecules in a living organism seldom work individually. They usually interact with each other in a cooperative way. Biological processes are too complicated to understand without considering such interactions. Thus, network-based procedures can be seen as powerful methods for studying complex processes. However, many methods are devised for analyzing individual genes. It is said that techniques based on biological networks such as gene co-expression are more precise ways to represent information than those using lists of genes only. This package aims to integrate gene expression and biological networks. A biological network is constructed from gene expression data and it is used for Gene Set Enrichment Analysis.

Maintained by Dongmin Jung. Last updated 5 months ago.

software statisticalmethod network graphandnetwork genesetenrichment geneexpression networkenrichment pathways differentialexpression

4.00 score 1 script

bioc

CexoR:An R package to uncover high-resolution protein-DNA interactions in ChIP-exo replicates

Strand specific peak-pair calling in ChIP-exo replicates. The cumulative Skellam distribution function is used to detect significant normalised count differences of opposed sign at each DNA strand (peak-pairs). Then, irreproducible discovery rate for overlapping peak-pairs across biological replicates is computed.

Maintained by Pedro Madrigal. Last updated 5 months ago.

functionalgenomics sequencing coverage chipseq peakdetection

4.00 score 1 script

bioc

scTensor:Detection of cell-cell interaction from single-cell RNA-seq dataset by tensor decomposition

The algorithm is based on the non-negative Tucker decomposition (NTD2) of nnTensor.

Maintained by Koki Tsuyuzaki. Last updated 5 months ago.

dimensionreduction singlecell software geneexpression

4.00 score 2 scripts

bioc

SARC:Statistical Analysis of Regions with CNVs

Imports a cov/coverage file (normalised read coverages from BAM files) and a cnv file (list of CNVs - similar to a BED file) from WES/WGS CNV (copy number variation) detection pipelines and utilises several metrics to weigh the likelihood of a sample containing a detected CNV being a true CNV or a false positive. Highly useful for diagnostic testing to filter out false positives to provide clinicians with fewer variants to interpret. SARC uniquely uses only cov and csv (similar to BED file) files, which are the common CNV pipeline calling filetypes, and can be used to supplement the Integrative Genomics Viewer (IGV) to generate many figures automatically, which can be especially helpful in large cohorts with 100s-1000s of patients.

Maintained by Krutik Patel. Last updated 5 months ago.

software copynumbervariation visualization dnaseq sequencing

4.00 score 2 scripts

bioc

Doscheda:A DownStream Chemo-Proteomics Analysis Pipeline

Doscheda focuses on quantitative chemoproteomics used to determine protein interaction profiles of small molecules from whole cell or tissue lysates using Mass Spectrometry data. The package provides a shiny application to run the pipeline, several visualisations and a downloadable report of an experiment.

Maintained by Bruno Contrino. Last updated 5 months ago.

proteomics normalization preprocessing massspectrometry qualitycontrol dataimport regression

4.00 score 2 scripts

bioc

seqArchRplus:Downstream analyses of promoter sequence architectures and HTML report generation

seqArchRplus facilitates downstream analyses of promoter sequence architectures/clusters identified by seqArchR (or any other tool/method). With additional available information such as the TPM values and interquantile widths (IQWs) of the CAGE tag clusters, seqArchRplus can order the input promoter clusters by their shape (IQWs), and write the cluster information as browser/IGV track files. Provided visualizations are of two kinds: per sample/stage and per cluster visualizations. Those of the first kind include: plot panels for each sample showing per cluster shape, TPM and other score distributions, sequence logos, and peak annotations. Those of the second kind include per cluster chromosome-wise and strand distributions, motif occurrence heatmaps and GO term enrichments. Additionally, seqArchRplus can also generate HTML reports for easy viewing and comparison of promoter architectures between samples/stages.

Maintained by Sarvesh Nikumbh. Last updated 5 months ago.

annotation visualization reportwriting go motifannotation clustering

1 star 4.00 score 2 scripts

mauricio1986

Rchoice:Discrete Choice (Binary, Poisson and Ordered) Models with Random Parameters

An implementation of the simulated maximum likelihood method for the estimation of Binary (Probit and Logit), Ordered (Probit and Logit) and Poisson models with random parameters for cross-sectional and longitudinal data as presented in Sarrias (2016) <doi:10.18637/jss.v074.i10>.

Maintained by Mauricio Sarrias. Last updated 2 years ago.

4 stars 3.98 score 42 scripts

onofriandreapg

drcSeedGerm:Utilities for Data Analyses in Seed Germination/Emergence Assays

Utility functions to be used to analyse datasets obtained from seed germination/emergence assays. Fits several types of seed germination/emergence models, including those reported in Onofri et al. (2018) "Hydrothermal-time-to-event models for seed germination", European Journal of Agronomy, 101, 129-139 <doi:10.1016/j.eja.2018.08.011>. Contains several datasets for practicing.

Maintained by Andrea Onofri. Last updated 3 months ago.

nonlinear-regression seed-germination-assays time-to-event

5 stars 3.97 score 37 scripts

frederic-santos

AnthropMMD:An R Package for the Mean Measure of Divergence (MMD)

Offers a graphical user interface for the calculation of the mean measure of divergence, with facilities for trait selection and graphical representations <doi:10.1002/ajpa.23336>.

Maintained by Frédéric Santos. Last updated 1 year ago.

3.90 score 16 scripts

bioc

CONFESS:Cell OrderiNg by FluorEScence Signal

Single Cell Fluidigm Spot Detector.

Maintained by Diana LOW. Last updated 5 months ago.

immunooncology geneexpression dataimport cellbiology clustering rnaseq qualitycontrol visualization timecourse regression classification

3.90 score 2 scripts

bioc

CPSM:CPSM: Cancer patient survival model

The CPSM package provides a comprehensive computational pipeline for predicting the survival probability of cancer patients. It offers a series of steps including data processing, splitting data into training and test subsets, and normalization of data. The package enables the selection of significant features based on univariate survival analysis and generates a LASSO prognostic index score. It supports the development of predictive models for survival probability using various features and provides visualization tools to draw survival curves based on predicted survival probabilities. Additionally, CPSM includes functionalities for generating bar plots that depict the predicted mean and median survival times of patients, making it a versatile tool for survival analysis in cancer research.

Maintained by Harpreet Kaur. Last updated 22 days ago.

geneexpression normalization survival

3.90 score

lassehjort

cuRe:Parametric Cure Model Estimation

Contains functions for estimating generalized parametric mixture and non-mixture cure models, loss of lifetime, mean residual lifetime, and crude event probabilities.

Maintained by Lasse Hjort Jakobsen. Last updated 2 years ago.

9 stars 3.90 score 22 scripts

r-forge

smacofx:Flexible Multidimensional Scaling and 'smacof' Extensions

Flexible multidimensional scaling (MDS) methods and extensions to the package 'smacof'. This package contains various functions, wrappers, methods and classes for fitting, plotting and displaying a large number of different flexible MDS models. These are: Torgerson scaling (Torgerson, 1958, ISBN:978-0471879459) with powers, Sammon mapping (Sammon, 1969, <doi:10.1109/T-C.1969.222678>) with ratio and interval optimal scaling, Multiscale MDS (Ramsay, 1977, <doi:10.1007/BF02294052>) with ratio and interval optimal scaling, s-stress MDS (ALSCAL; Takane, Young & De Leeuw, 1977, <doi:10.1007/BF02293745>) with ratio and interval optimal scaling, elastic scaling (McGee, 1966, <doi:10.1111/j.2044-8317.1966.tb00367.x>) with ratio and interval optimal scaling, r-stress MDS (De Leeuw, Groenen & Mair, 2016, <https://rpubs.com/deleeuw/142619>) with ratio, interval, splines and nonmetric optimal scaling, power-stress MDS (POST-MDS; Buja & Swayne, 2002 <doi:10.1007/s00357-001-0031-0>) with ratio and interval optimal scaling, restricted power-stress (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>) with ratio and interval optimal scaling, approximate power-stress with ratio optimal scaling (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>), Box-Cox MDS (Chen & Buja, 2013, <https://jmlr.org/papers/v14/chen13a.html>), local MDS (Chen & Buja, 2009, <doi:10.1198/jasa.2009.0111>), curvilinear component analysis (Demartines & Herault, 1997, <doi:10.1109/72.554199>), curvilinear distance analysis (Lee, Lendasse & Verleysen, 2004, <doi:10.1016/j.neucom.2004.01.007>), nonlinear MDS with optimal dissimilarity powers functions (De Leeuw, 2024, <https://github.com/deleeuw/smacofManual/blob/main/smacofPO/smacofPO.pdf>), sparsified (power) MDS and sparsified multidimensional (power) distance analysis (Rusch, 2024, <doi:10.57938/355bf835-ddb7-42f4-8b85-129799fc240e>). 
Some functions are suitably flexible to allow any other sensible combination of explicit power transformations for weights, distances and input proximities with implicit ratio, interval, splines or nonmetric optimal scaling of the input proximities. Most functions use a Majorization-Minimization algorithm. Currently the methods are only available for one-mode data (symmetric dissimilarity matrices).

Maintained by Thomas Rusch. Last updated 3 months ago.

1 star 3.89 score 2 dependents

nzilbb

nzilbb.vowels:Vowel Covariation Tools

Tools to support research on vowel covariation. Methods are provided to support Principal Component Analysis workflows (as in Brand et al. (2021) <doi:10.1016/j.wocn.2021.101096> and Wilson Black et al. (2023) <doi:10.1515/lingvan-2022-0086>).

Maintained by Joshua Wilson Black. Last updated 4 months ago.

3.88 score 15 scripts

swfsc

CruzPlot:Plot Shipboard DAS Data

A utility program for creating maps, plotting data, and producing basic summaries of DAS data files. These files are typically, but do not have to be, DAS <https://swfsc-publications.fisheries.noaa.gov/publications/TM/SWFSC/NOAA-TM-NMFS-SWFSC-305.PDF> data produced by the Southwest Fisheries Science Center (SWFSC) program 'WinCruz'.

Maintained by Sam Woodman. Last updated 6 months ago.

jags cpp

2 stars 3.85 score 2 scripts

aqlt

rjdqa:Quality Assessment for Seasonal Adjustment

Add-in to the 'RJDemetra' package on seasonal adjustments. It produces dashboards to summarise models and quickly check the quality of the seasonal adjustment.

Maintained by Alain Quartier-la-Tente. Last updated 5 months ago.

jdemetra quality-assessment openjdk

2 stars 3.85 score 8 scripts

kwb-r

kwb.wtaq:Interface to WTAQ Drawdown Model (http://water.usgs.gov/ogw/wtaq/)

Functions enabling the writing of WTAQ input files, running of WTAQ and reading of WTAQ output files.

Maintained by Hauke Sonnenberg. Last updated 3 years ago.

drawdown-model groundwater-modelling modelling project-optiwells2 shiny-app usgs wtaq fortran

2 stars 3.78 score 9 scripts 1 dependent

berriez

asymmetry:Multidimensional Scaling of Asymmetric Proximities

Multidimensional scaling models and methods for the visualization and analysis of asymmetric proximity data <doi:10.1111/j.2044-8317.1996.tb01078.x>. An asymmetric data matrix has the same number of rows and columns, and these rows and columns refer to the same set of objects. At least some elements in the upper triangle are different from the corresponding elements in the lower triangle. An example of an asymmetric matrix is a student migration table, where the rows correspond to the countries of origin of the students and the columns to the destination countries. This package provides algorithms for three multidimensional scaling models. These are the slide-vector model <doi:10.1007/BF02294474>, a scaling model with unique dimensions and the asymscal model for asymmetric multidimensional scaling. Furthermore, a heat map for skew-symmetric data and the decomposition of asymmetry are provided for the exploratory analysis of asymmetric tables.

Maintained by Berrie Zielman. Last updated 10 months ago.

3.78 score 12 scripts
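
To illustrate the "decomposition of asymmetry" mentioned in the description above, here is a minimal Python sketch of the standard linear-algebra split of a square table into a symmetric part and a skew-symmetric part. It is not the package's API: the function name `decompose_asymmetry` and the small migration-style table are hypothetical and serve only as an example.

```python
def decompose_asymmetry(matrix):
    """Split a square matrix M into S + K, where S is symmetric and K is skew-symmetric.

    S[i][j] = (M[i][j] + M[j][i]) / 2
    K[i][j] = (M[i][j] - M[j][i]) / 2
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    sym = [[(matrix[i][j] + matrix[j][i]) / 2 for j in range(n)] for i in range(n)]
    skew = [[(matrix[i][j] - matrix[j][i]) / 2 for j in range(n)] for i in range(n)]
    return sym, skew


# Hypothetical migration table: rows are origins, columns are destinations.
flows = [
    [0, 12, 4],
    [6, 0, 10],
    [8, 2, 0],
]

sym, skew = decompose_asymmetry(flows)
```

The symmetric part captures the average exchange between two objects, while the skew-symmetric part captures the net direction of the flow; a heat map of the latter is one way to explore the asymmetry.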

xinkaidupsy

IVPP:Invariance Partial Pruning Test

An implementation of the Invariance Partial Pruning (IVPP) approach described in Du, X., Johnson, S. U., Epskamp, S. (2025) The Invariance Partial Pruning Approach to The Network Comparison in Longitudinal Data. IVPP is a two-step method that first tests for global network structural differences with an invariance test and then inspects specific edge differences with partial pruning.

Maintained by Xinkai Du. Last updated 2 days ago.

3.78 score 7 scripts

magosil86

getspres:SPRE Statistics for Exploring Heterogeneity in Meta-Analysis

An implementation of SPRE (standardised predicted random-effects) statistics in R to explore heterogeneity in genetic association meta-analyses, as described by Magosi et al. (2019) <doi:10.1093/bioinformatics/btz590>. SPRE statistics are precision weighted residuals that indicate the direction and extent to which individual study-effects in a meta-analysis deviate from the average genetic effect. Overly influential positive outliers have the potential to inflate average genetic effects in a meta-analysis whilst negative outliers might lower or change the direction of effect. See the 'getspres' website for documentation and examples <https://magosil86.github.io/getspres/>.

Maintained by Lerato E Magosi. Last updated 4 years ago.

3.70 score 9 scripts

yaziciceyda

cmaRs:Implementation of the Conic Multivariate Adaptive Regression Splines in R

An implementation of 'Conic Multivariate Adaptive Regression Splines (CMARS)' in R. See Weber et al. (2011) CMARS: a new contribution to nonparametric regression with multivariate adaptive regression splines supported by continuous optimization, <DOI:10.1080/17415977.2011.624770>. It constructs models by using the terms obtained from the forward step of MARS and then estimates parameters by using 'Tikhonov' regularization and conic quadratic optimization. It is possible to construct models for prediction and binary classification. It provides performance measures for the model developed. The package needs the optimisation software 'MOSEK' <https://www.mosek.com/> to construct the models. Please follow the instructions in 'Rmosek' for the installation.

Maintained by Ceyda Yazici. Last updated 2 years ago.

3.70 score 2 scripts

chabert-liddell

robber:Using Block Model to Estimate the Robustness of Ecological Network

Implementation of a variety of methods to compute the robustness of ecological interaction networks with binary interactions as described in <doi:10.1002/env.2709>. In particular, using the Stochastic Block Model and its bipartite counterpart, the Latent Block Model, to put a parametric model on the network allows the comparison of the robustness of networks differing in species richness and number of interactions. It also deals with networks that are partially sampled and/or with missing values.

Maintained by Saint-Clair Chabert-Liddell. Last updated 1 year ago.

ecological-network robber robustness

1 star 3.70 score 4 scripts

grunwaldlab

ezec:Easy Interface to Effective Concentration Calculations

Because fungicide resistance is an important phenotypic trait for fungi and oomycetes, it is necessary to have a standardized method of statistically analyzing the Effective Concentration (EC) values. This package is designed for those who are not terribly familiar with R to be able to analyze and plot an entire set of isolates using the 'drc' package.

Maintained by Zhian N. Kamvar. Last updated 9 years ago.

1 star 3.70 score 6 scripts

nicavan

dexisensitivity:'DEXi' Decision Tree Analysis and Visualization

Provides a versatile toolkit for analyzing and visualizing 'DEXi' (Decision EXpert for education) decision trees, facilitating multi-criteria decision analysis directly within R. Users can read .dxi files, manipulate decision trees, and evaluate various scenarios. It supports sensitivity analysis through Monte Carlo simulations, one-at-a-time approaches, and variance-based methods, helping to discern the impact of input variations. Additionally, it includes functionalities for generating sampling plans and an array of visualization options for decision trees and analysis results. A distinctive feature is the synoptic table plot, aiding in the efficient comparison of scenarios. Whether for in-depth decision modeling or sensitivity analysis, this package stands as a comprehensive solution. Definitions of the sensitivity analyses are available in Carpani, Bergez and Monod (2012) <doi:10.1016/j.envsoft.2011.10.002>, and a detailed description of the package will soon be available in Alaphilippe, Allart, Carpani, Cavan, Monod and Bergez (submitted to Software Impacts).

Maintained by Nicolas Cavan. Last updated 5 months ago.

3.70 score

benst099

circlesplot:Visualize Proportions with Circles in a Plot

Method for visualizing proportions between objects of different sizes. The proportions are drawn as circles with different diameters, which makes them ideal for visualizing proportions between planets.

Maintained by BenSt099. Last updated 1 year ago.

data-science data-visualization proportions visualization

3.70 score 2 scripts

jfrench

smacpod:Statistical Methods for the Analysis of Case-Control Point Data

Statistical methods for analyzing case-control point data. Methods include the ratio of kernel densities, the difference in K Functions, the spatial scan statistic, and q nearest neighbors of cases.

Maintained by Joshua French. Last updated 5 months ago.

3.69 score 49 scripts

shanpengli

JMH:Joint Model of Heterogeneous Repeated Measures and Survival Data

Maximum likelihood estimation for the semi-parametric joint modeling of competing risks and longitudinal data in the presence of heterogeneous within-subject variability, proposed by Li and colleagues (2023) <arXiv:2301.06584>. The proposed method models the within-subject variability of the biomarker and associates it with the risk of the competing risks event. The time-to-event data is modeled using a (cause-specific) Cox proportional hazards regression model with time-fixed covariates. The longitudinal outcome is modeled using a mixed-effects location and scale model. The association is captured by shared random effects. The model is estimated using an Expectation Maximization algorithm.

Maintained by Shanpeng Li. Last updated 2 months ago.

cpp

3 stars 3.65 score 4 scripts

mihaiconstantin

powerly:Sample Size Analysis for Psychological Networks and More

An implementation of the sample size computation method for network models proposed by Constantin et al. (2021) <doi:10.31234/osf.io/j5v7u>. The implementation takes the form of a three-step recursive algorithm designed to find an optimal sample size given a model specification and a performance measure of interest. It starts with a Monte Carlo simulation step for computing the performance measure and a statistic at various sample sizes selected from an initial sample size range. It continues with a monotone curve-fitting step for interpolating the statistic across the entire sample size range. The final step employs stratified bootstrapping to quantify the uncertainty around the fitted curve.

Maintained by Mihai Constantin. Last updated 2 years ago.

network-models power-analysis psychology sample-size-calculation

9 stars 3.65 score 3 scripts

ritoban1

EHRmuse:Multi-Cohort Selection Bias Correction using IPW and AIPW Methods

Comprehensive toolkit for addressing selection bias in binary disease models across diverse non-probability samples, each with unique selection mechanisms. It utilizes Inverse Probability Weighting (IPW) and Augmented Inverse Probability Weighting (AIPW) methods to reduce selection bias effectively in multiple non-probability cohorts by integrating data from either individual-level or summary-level external sources. The package also provides a variety of variance estimation techniques. Please refer to Kundu et al. <doi:10.48550/arXiv.2412.00228>.

Maintained by Michael Kleinsasser. Last updated 2 months ago.

3.65 score
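
As a rough illustration of the Inverse Probability Weighting idea named in the description above, the Python sketch below computes a weighted (Hajek-type) prevalence estimate in which each sampled unit is weighted by the inverse of its selection probability. It assumes the selection probabilities are already known, whereas the package description says they are informed by external data. The function name `ipw_mean` and the toy data are hypothetical and do not reflect the package's interface.

```python
def ipw_mean(outcomes, selection_probs):
    """Weighted mean of outcomes with weights 1 / selection probability."""
    if len(outcomes) != len(selection_probs):
        raise ValueError("outcomes and selection_probs must have equal length")
    if any(p <= 0 or p > 1 for p in selection_probs):
        raise ValueError("selection probabilities must lie in (0, 1]")
    weights = [1.0 / p for p in selection_probs]
    weighted_sum = sum(w * y for w, y in zip(weights, outcomes))
    return weighted_sum / sum(weights)


# Toy sample (made up): diseased units (1) were far more likely to be selected.
outcomes = [1, 1, 0, 0, 1]
selection_probs = [0.8, 0.8, 0.2, 0.2, 0.8]

naive = sum(outcomes) / len(outcomes)            # ignores the selection mechanism
corrected = ipw_mean(outcomes, selection_probs)  # up-weights under-sampled units
```

In this toy sample the naive prevalence overstates the disease rate because cases were over-selected; the weighted estimate pulls it back down by giving the rarely selected non-cases more weight.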

bioc

segmenter:Perform Chromatin Segmentation Analysis in R by Calling ChromHMM

Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. The latter represents the observed states in a multivariate Markov model to predict the chromatin's underlying states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de novo. The goal of this package is to call chromHMM from within R, capture the output files in an S4 object and interface to other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.

Maintained by Mahmoud Ahmed. Last updated 5 months ago.

software histonemodification bioconductor chromhmm segmentation-an

4 stars 3.60 score 9 scripts

rkbauer

RchivalTag:Analyzing and Interactive Visualization of Archival Tagging Data

A set of functions to generate, access and analyze standard data products from archival tagging data.

Maintained by Robert K. Bauer. Last updated 2 months ago.

data-visuali depth depth-temperature-profiles dygraphs ggpot leaflet minipat pelagic plotly satellite sensor spatial star-oddi temperature time-series tracks wildlife-computers

1 star 3.59 score 26 scripts

cogdisreslab

PAVER:PAVER: Pathway Analysis Visualization with Embedding Representations

Summary visualization using embedding representations to reveal underlying themes within sets of pathway terms.

Maintained by William G Ryan V. Last updated 8 months ago.

3.48 score 6 scripts

eleanorcaves

AcuityView:A Package for Displaying Visual Scenes as They May Appear to an Animal with Lower Acuity

This code provides a simple method for representing a visual scene as it may be seen by an animal with less acute vision. When using (or for more information), please cite the original publication.

Maintained by Eleanor Caves. Last updated 8 years ago.

3 stars 3.48 score 1 script

cnrakt

haplotypes:Manipulating DNA Sequences and Estimating Unambiguous Haplotype Network with Statistical Parsimony

Provides S4 classes and methods for reading and manipulating aligned DNA sequences, supporting indel coding methods (only the simple indel coding method is available in the current version), showing base substitutions and indels, calculating absolute pairwise distances between DNA sequences, and collapsing identical DNA sequences into haplotypes or inferring haplotypes using a user-provided absolute pairwise character difference matrix. This package also includes S4 classes and methods for estimating genealogical relationships among haplotypes using statistical parsimony and plotting parsimony networks.

Maintained by Caner Aktas. Last updated 2 years ago.

1 star 3.43 score 54 scripts

jackmwolf

tehtuner:Fit and Tune Models to Detect Treatment Effect Heterogeneity

Implements methods to fit Virtual Twins models (Foster et al. (2011) <doi:10.1002/sim.4322>) for identifying subgroups with differential effects in the context of clinical trials while controlling the probability of falsely detecting a differential effect when the conditional average treatment effect is uniform across the study population using parameter selection methods proposed in Wolf et al. (2022) <doi:10.1177/17407745221095855>.

Maintained by Jack Wolf. Last updated 2 years ago.

clinical-trials heterogeneity-of-treatment-effect subgroup-identification

5 stars 3.40 score 6 scripts

cran

randomizeR:Randomization for Clinical Trials

This tool enables the user to choose a randomization procedure based on sound scientific criteria. It comprises the generation of randomization sequences as well as the assessment of randomization procedures based on carefully selected criteria. Furthermore, 'randomizeR' provides a function for the comparison of randomization procedures.

Maintained by Ralf-Dieter Hilgers. Last updated 2 years ago.

2 stars 3.38 score 1 dependent

penncil

xmeta:A Toolbox for Multivariate Meta-Analysis

A toolbox for meta-analysis. This package includes: (1) a robust multivariate meta-analysis of continuous or binary outcomes; (2) a bivariate Egger's test for detecting small study effects; (3) Galaxy Plot, a new visualization tool for bivariate meta-analysis studies; (4) a bivariate T&F method accounting for publication bias in bivariate meta-analysis, based on symmetry of the galaxy plot (Hong C. et al. (2020) <doi:10.1093/aje/kwz286>, Chongliang L. et al. (2020) <doi:10.1101/2020.07.27.20161562>); (5) a method for Composite Likelihood Network Meta-Analysis without knowledge of within-study variance and accounting for small sample effect sizes.

Maintained by Jiajie Chen. Last updated 10 months ago.

3 stars 3.38 score 9 scripts

statleila

priorityelasticnet:Comprehensive Analysis of Multi-Omics Data Using an Offset-Based Method

Priority-ElasticNet extends the Priority-LASSO method (Klau et al. (2018) <doi:10.1186/s12859-018-2344-6>) by incorporating the ElasticNet penalty, allowing for both L1 and L2 regularization. This approach fits successive ElasticNet models for several blocks of (omics) data with different priorities, using the predicted values from each block as an offset for the subsequent block. It also offers robust options to handle block-wise missingness in multi-omics data, improving the flexibility and applicability of the model in the presence of incomplete datasets.

Maintained by Laila Qadir Musib. Last updated 2 months ago.

3.36 score

lefeup

BoSSA:A Bunch of Structure and Sequence Analysis

Reads and plots phylogenetic placements.

Maintained by Pierre Lefeuvre. Last updated 4 years ago.

3.35 score 15 scripts

kaiaragaki

ezmtt:Easy MTT Assay Tidying and Plotting

This package automates the analysis and plotting of standard MTT workflows.

Maintained by Kai Aragaki. Last updated 5 months ago.

3.30 score 3 scripts