Showing 16 of total 16 results (show query)
bioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces MPSE class, this make it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under the unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
186 stars 9.70 score 126 scripts 1 dependentsadibender
pammtools:Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis
The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi: 10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated data as well as competing risks, recurrent events and multi-state settings. pammtools provides tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing as well as visualization.
Maintained by Andreas Bender. Last updated 8 days ago.
additive-modelspammpammtoolspiece-wise-exponentialsurvival-analysis
48 stars 9.32 score 310 scripts 8 dependentsalexpkeil1
qgcomp:Quantile G-Computation
G-computation for a set of time-fixed exposures with quantile-based basis functions, possibly under linearity and homogeneity assumptions. This approach estimates a regression line corresponding to the expected change in the outcome (on the link basis) given a simultaneous increase in the quantile-based category for all exposures. Works with continuous, binary, and right-censored time-to-event outcomes. Reference: Alexander P. Keil, Jessie P. Buckley, Katie M. OBrien, Kelly K. Ferguson, Shanshan Zhao, and Alexandra J. White (2019) A quantile-based g-computation approach to addressing the effects of exposure mixtures; <doi:10.1289/EHP5838>.
Maintained by Alexander Keil. Last updated 5 days ago.
exposureexposure-mixtureexposure-mixturesquantile-gcomputationsurvival
37 stars 8.72 score 70 scripts 2 dependentsnliulab
AutoScore:An Interpretable Machine Learning-Based Automatic Clinical Score Generator
A novel interpretable machine learning-based framework to automate the development of a clinical scoring model for predefined outcomes. Our novel framework consists of six modules: variable ranking with machine learning, variable transformation, score derivation, model selection, domain knowledge-based score fine-tuning, and performance evaluation.The details are described in our research paper<doi:10.2196/21798>. Users or clinicians could seamlessly generate parsimonious sparse-score risk models (i.e., risk scores), which can be easily implemented and validated in clinical practice. We hope to see its application in various medical case studies.
Maintained by Feng Xie. Last updated 30 days ago.
32 stars 7.58 score 30 scriptsbioc
structToolbox:Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Maintained by Gavin Rhys Lloyd. Last updated 1 months ago.
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
10 stars 6.26 score 12 scriptsjtimonen
lgpr:Longitudinal Gaussian Process Regression
Interpretable nonparametric modeling of longitudinal data using additive Gaussian process regression. Contains functionality for inferring covariate effects and assessing covariate relevances. Models are specified using a convenient formula syntax, and can include shared, group-specific, non-stationary, heterogeneous and temporally uncertain effects. Bayesian inference for model parameters is performed using 'Stan'. The modeling approach and methods are described in detail in Timonen et al. (2021) <doi:10.1093/bioinformatics/btab021>.
Maintained by Juho Timonen. Last updated 7 months ago.
bayesian-inferencegaussian-processeslongitudinal-datastancpp
25 stars 5.94 score 69 scriptsmpiktas
midasr:Mixed Data Sampling Regression
Methods and tools for mixed frequency time series data analysis. Allows estimation, model selection and forecasting for MIDAS regressions.
Maintained by Vaidotas Zemlys-Balevičius. Last updated 3 years ago.
77 stars 5.76 score 150 scriptslonghaisk
HTLR:Bayesian Logistic Regression with Heavy-Tailed Priors
Efficient Bayesian multinomial logistic regression based on heavy-tailed (hyper-LASSO, non-convex) priors. The posterior of coefficients and hyper-parameters is sampled with restricted Gibbs sampling for leveraging the high-dimensionality and Hamiltonian Monte Carlo for handling the high-correlation among coefficients. A detailed description of the method: Li and Yao (2018), Journal of Statistical Computation and Simulation, 88:14, 2827-2851, <arXiv:1405.3319>.
Maintained by Longhai Li. Last updated 5 months ago.
bayesianclassificationhigh-dimensional-datamachine-learningmcmcopenblascppopenmp
10 stars 5.18 score 7 scriptscaranathunge
promor:Proteomics Data Analysis and Modeling Tools
A comprehensive, user-friendly package for label-free proteomics data analysis and machine learning-based modeling. Data generated from 'MaxQuant' can be easily used to conduct differential expression analysis, build predictive models with top protein candidates, and assess model performance. promor includes a suite of tools for quality control, visualization, missing data imputation (Lazar et. al. (2016) <doi:10.1021/acs.jproteome.5b00981>), differential expression analysis (Ritchie et. al. (2015) <doi:10.1093/nar/gkv007>), and machine learning-based modeling (Kuhn (2008) <doi:10.18637/jss.v028.i05>).
Maintained by Chathurani Ranathunge. Last updated 2 years ago.
biomarkersdifferential-expressionlfqmachine-learningmass-spectrometrymodelingproteomics
15 stars 5.02 score 14 scriptsdzhakparov
GeneSelectR:Comprehensive Feature Selection Worfkflow for Bulk RNAseq Datasets
GeneSelectR is a versatile R package designed for efficient RNA sequencing data analysis. Its key innovation lies in the seamless integration of the Python sklearn machine learning framework with R-based bioinformatics tools. This integration enables GeneSelectR to perform robust ML-driven feature selection while simultaneously leveraging the power of Gene Ontology (GO) enrichment and semantic similarity analyses. By combining these diverse methodologies, GeneSelectR offers a comprehensive workflow that optimizes both the computational aspects of ML and the biological insights afforded by advanced bioinformatics analyses. Ideal for researchers in bioinformatics, GeneSelectR stands out as a unique tool for analyzing complex RNAseq datasets with enhanced precision and relevance.
Maintained by Damir Zhakparov. Last updated 10 months ago.
19 stars 4.98 score 7 scriptscoursekata
coursekata:Packages and Functions for 'CourseKata' Courses
Easily install and load all packages and functions used in 'CourseKata' courses. Aid teaching with helper functions and augment generic functions to provide cohesion between the network of packages. Learn more about 'CourseKata' at <https://coursekata.org>.
Maintained by Adam Blake. Last updated 4 months ago.
4 stars 4.68 score 60 scriptsrpkgs
rtrend:Trend Estimating Tools
The traditional linear regression trend, Modified Mann-Kendall (MK) non-parameter trend and bootstrap trend are included in this package. Linear regression trend is rewritten by '.lm.fit'. MK trend is rewritten by 'Rcpp'. Finally, those functions are about 10 times faster than previous version in R. Reference: Hamed, K. H., & Rao, A. R. (1998). A modified Mann-Kendall trend test for autocorrelated data. Journal of hydrology, 204(1-4), 182-196. <doi:10.1016/S0022-1694(97)00125-X>.
Maintained by Dongdong Kong. Last updated 1 years ago.
3 stars 4.56 score 24 scriptsawamaeva
trajmsm:Marginal Structural Models with Latent Class Growth Analysis of Treatment Trajectories
Implements marginal structural models combined with a latent class growth analysis framework for assessing the causal effect of treatment trajectories. Based on the approach described in "Marginal Structural Models with Latent Class Growth Analysis of Treatment Trajectories" Diop, A., Sirois, C., Guertin, J.R., Schnitzer, M.E., Candas, B., Cossette, B., Poirier, P., Brophy, J., Mésidor, M., Blais, C. and Hamel, D., (2023) <doi:10.1177/09622802231202384>.
Maintained by Awa Diop. Last updated 1 years ago.
g-computationinverse-probability-weightsmarginal-structural-modelstmletrajectory-analysis
5 stars 3.40 scorejamesliley
OptHoldoutSize:Estimation of Optimal Size for a Holdout Set for Updating a Predictive Score
Predictive scores must be updated with care, because actions taken on the basis of existing risk scores causes bias in risk estimates from the updated score. A holdout set is a straightforward way to manage this problem: a proportion of the population is 'held-out' from computation of the previous risk score. This package provides tools to estimate a size for this holdout set and associated errors. Comprehensive vignettes are included. Please see: Haidar-Wehbe S, Emerson SR, Aslett LJM, Liley J (2022) <arXiv:2202.06374> for details of methods.
Maintained by James Liley. Last updated 3 years ago.
3.18 score 10 scriptsjackdunnnz
iai:Interface to 'Interpretable AI' Modules
An interface to the algorithms of 'Interpretable AI' <https://www.interpretable.ai> from the R programming language. 'Interpretable AI' provides various modules, including 'Optimal Trees' for classification, regression, prescription and survival analysis, 'Optimal Imputation' for missing data imputation and outlier detection, and 'Optimal Feature Selection' for exact sparse regression. The 'iai' package is an open-source project. The 'Interpretable AI' software modules are proprietary products, but free academic and evaluation licenses are available.
Maintained by Jack Dunn. Last updated 6 months ago.
1 stars 2.00 score 7 scripts