Showing 200 of 991 total results
ropensci
targets: Dynamic Function-Oriented 'Make'-Like Declarative Pipelines
Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).
Maintained by William Michael Landau. Last updated 14 hours ago.
data-sciencehigh-performance-computingmakepeer-reviewedpipeliner-targetopiareproducibilityreproducible-researchtargetsworkflow
205.4 match 973 stars 15.20 score 4.6k scripts 22 dependents
ropensci
drake: A Pipeline Toolkit for Reproducible Computation at Scale
A general-purpose computational engine for data analysis, drake rebuilds intermediate data objects when their dependencies change, and it skips work when the results are already up to date. Not every execution starts from scratch, there is native support for parallel and distributed computing, and completed projects have tangible evidence that they are reproducible. Extensive documentation, from beginner-friendly tutorials to practical examples and more, is available at the reference website <https://docs.ropensci.org/drake/> and the online manual <https://books.ropensci.org/drake/>.
Maintained by William Michael Landau. Last updated 3 months ago.
data-sciencedrakehigh-performance-computingmakefilepeer-reviewedpipelinereproducibilityreproducible-researchropensciworkflow
59.2 match 1.3k stars 11.49 score 1.7k scripts 1 dependents
rstudio
gt: Easily Create Presentation-Ready Display Tables
Build display tables from tabular data with an easy-to-use set of functions. With its progressive approach, we can construct display tables with a cohesive set of table parts. Table values can be formatted using any of the included formatting functions. Footnotes and cell styles can be precisely added through a location targeting system. The way in which 'gt' handles things for you means that you don't often have to worry about the fine details.
Maintained by Richard Iannone. Last updated 10 days ago.
docxeasy-to-usehtmllatexrtfsummary-tables
32.3 match 2.1k stars 18.36 score 20k scripts 112 dependents
ropensci
tarchetypes: Archetypes for Targets
Function-oriented Make-like declarative pipelines for statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'drake' R package by Will Landau (2018) <doi:10.21105/joss.00550>.
Maintained by William Michael Landau. Last updated 19 days ago.
data-sciencehigh-performance-computingpeer-reviewedpipeliner-targetopiareproducibilitytargetsworkflow
50.6 match 141 stars 11.43 score 1.7k scripts 10 dependents
bioc
target: Predict Combined Function of Transcription Factors
Implement the BETA algorithm for inferring direct target genes from DNA-binding and perturbation expression data (Wang et al. (2013) <doi:10.1038/nprot.2013.150>). Extend the algorithm to predict the combined function of two DNA-binding elements from comparable binding and expression data.
Maintained by Mahmoud Ahmed. Last updated 5 months ago.
softwarestatisticalmethodtranscriptionalgorithmchip-seqdna-bindinggene-regulationtranscription-factors
65.6 match 4 stars 7.79 score 1.3k scripts
kkholst
targeted: Targeted Inference
Various methods for targeted and semiparametric inference including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), and assumption-lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>).
Maintained by Klaus K. Holst. Last updated 1 month ago.
causal-inferencedouble-robustestimationsemiparametric-estimationstatisticsopenblascppopenmp
59.2 match 11 stars 7.20 score 30 scripts 1 dependents
bioc
crisprDesign: Comprehensive design of CRISPR gRNAs for nucleases and base editors
Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.
Maintained by Jean-Philippe Fortin. Last updated 10 days ago.
crisprfunctionalgenomicsgenetargetbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomics-analysisgrnagrna-sequencegrna-sequencessgrnasgrna-design
45.8 match 22 stars 8.28 score 80 scripts 3 dependents
rolkra
explore: Simplifies Exploratory Data Analysis
Interactive data exploration with one line of code, automated reporting, or an easy-to-remember set of tidy functions for low-code exploratory data analysis.
Maintained by Roland Krasser. Last updated 3 months ago.
data-explorationdata-visualisationdecision-treesedarmarkdownshinytidy
31.7 match 228 stars 11.43 score 221 scripts 1 dependents
interstellar-consultation-services
covid19dbcand: Selected 'Drugbank' Drugs for COVID-19 Treatment Related Data in R Format
Provides different datasets parsed from the 'Drugbank' <https://www.drugbank.ca/covid-19> database using the 'dbparser' package. It is a smaller version of the 'dbdataset' package and contains only information about possible COVID-19 treatments.
Maintained by Mohammed Ali. Last updated 11 months ago.
datasetdbparserdrugbankdrugbank-database
78.0 match 3 stars 4.48 score 6 scripts
bioc
crisprScore: On-Target and Off-Target Scoring Algorithms for CRISPR gRNAs
Provides R wrappers of several on-target and off-target scoring methods for CRISPR guide RNAs (gRNAs). The following nucleases are supported: SpCas9, AsCas12a, enAsCas12a, and RfxCas13d (CasRx). The available on-target cutting efficiency scoring methods are RuleSet1, Azimuth, DeepHF, DeepCpf1, enPAM+GB, and CRISPRscan. Both the CFD and MIT scoring methods are available for off-target specificity prediction. The package also provides a Lindel-derived score to predict the probability of a gRNA to produce indels inducing a frameshift for the Cas9 nuclease. Note that DeepHF, DeepCpf1 and enPAM+GB are not available on Windows machines.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsfunctionalpredictionbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomicsgrnagrna-sequencegrna-sequencesscoring-algorithmsgrnasgrna-design
43.4 match 16 stars 7.52 score 19 scripts 4 dependents
tvganesh
QCSimulator: 5 Qubit Quantum Computing Simulator
This package simulates a 5 qubit Quantum Computer.
Maintained by Tinniam V Ganesh. Last updated 9 years ago.
67.5 match 5 stars 4.20 score 64 scripts
bioc
systemPipeR: systemPipeR: Workflow Environment for Data Analysis and Report Generation
systemPipeR is a multipurpose data analysis workflow environment that unifies R with command-line tools. It enables scientists to analyze many types of large- or small-scale data on local or distributed computer systems with a high level of reproducibility, scalability and portability. At its core is a command-line interface (CLI) that adopts the Common Workflow Language (CWL). This design allows users to choose for each analysis step the optimal R or command-line software. It supports both end-to-end and partial execution of workflows with built-in restart functionalities. Efficient management of complex analysis tasks is accomplished by a flexible workflow control container class. Handling of large numbers of input samples and experimental designs is facilitated by consistent sample annotation mechanisms. As a multi-purpose workflow toolkit, systemPipeR enables users to run existing workflows, customize them or design entirely new ones while taking advantage of widely adopted data structures within the Bioconductor ecosystem. Another important core functionality is the generation of reproducible scientific analysis and technical reports. For result interpretation, systemPipeR offers a wide range of plotting functionality, while an associated Shiny App offers many useful functionalities for interactive result exploration. The vignettes linked from this page include (1) a general introduction, (2) a description of technical details, and (3) a collection of workflow templates.
Maintained by Thomas Girke. Last updated 5 months ago.
geneticsinfrastructuredataimportsequencingrnaseqriboseqchipseqmethylseqsnpgeneexpressioncoveragegenesetenrichmentalignmentqualitycontrolimmunooncologyreportwritingworkflowstepworkflowmanagement
24.1 match 53 stars 11.56 score 344 scripts 3 dependents
bioc
bioassayR: Cross-target analysis of small molecule bioactivity
bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.
Maintained by Thomas Girke. Last updated 5 months ago.
immunooncologymicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportbioinformaticsproteomicsmetabolomics
34.1 match 5 stars 6.70 score 46 scripts
ropensci
geotargets: 'Targets' Extensions for Geographic Spatial Formats
Provides extensions for various geographic spatial file formats, such as shape files and rasters. Currently provides support for the 'terra' geographic spatial formats. See the vignettes for worked examples, demonstrations, and explanations of how to use the various package extensions.
Maintained by Nicholas Tierney. Last updated 2 days ago.
geospatialpipeliner-targetopiarasterreproducibilityreproducible-researchtargetsvectorworkflow
31.1 match 72 stars 6.78 score
bioc
CRISPRseek: Design of guide RNAs in CRISPR genome-editing systems
The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.
Maintained by Lihua Julie Zhu. Last updated 5 days ago.
immunooncologygeneregulationsequencematchingcrispr
27.4 match 7.18 score 51 scripts 2 dependents
prioritizr
prioritizr: Systematic Conservation Prioritization in R
Systematic conservation prioritization using mixed integer linear programming (MILP). It provides a flexible interface for building and solving conservation planning problems. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. By using exact algorithm solvers, solutions can be generated that are guaranteed to be optimal (or within a pre-specified optimality gap). Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management actions or zones, meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. To solve large-scale or complex conservation planning problems, users should install the Gurobi optimization software (available from <https://www.gurobi.com/>) and the 'gurobi' R package (see Gurobi Installation Guide vignette for details). Users can also install the IBM CPLEX software (<https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer>) and the 'cplexAPI' R package (available at <https://github.com/cran/cplexAPI>). Additionally, the 'rcbc' R package (available at <https://github.com/dirkschumacher/rcbc>) can be used to generate solutions using the CBC optimization software (<https://github.com/coin-or/Cbc>). For further details, see Hanson et al. (2025) <doi:10.1111/cobi.14376>.
Maintained by Richard Schuster. Last updated 10 days ago.
biodiversityconservationconservation-planneroptimizationprioritizationsolverspatialcpp
16.6 match 124 stars 11.82 score 584 scripts 2 dependents
bioc
RCy3: Functions to Access and Control Cytoscape
Visualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.
Maintained by Alex Pico. Last updated 5 months ago.
visualizationgraphandnetworkthirdpartyclientnetwork
14.1 match 52 stars 13.39 score 628 scripts 15 dependents
rstudio
keras3: R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 3 days ago.
13.8 match 845 stars 13.57 score 264 scripts 2 dependents
bioc
miRLAB: Dry lab for exploring miRNA-mRNA relationships
Provides tools for exploring miRNA-mRNA relationships, including popular miRNA target prediction methods, ensemble methods that integrate individual methods, functions to get data from online resources, functions to validate the results, and functions to conduct enrichment analyses.
Maintained by Thuc Duy Le. Last updated 5 months ago.
mirnageneexpressionnetworkinferencenetwork
39.5 match 4.72 score 11 scripts
usaid-oha-si
tameDP: Import targets and PLHIV data from COP Target Setting Tool (formerly Data Pack)
Import PSNUxIM targets and PLHIV data from COP Data Pack. The purpose is to make the data tidy and more usable than their current structure in the Excel data packs.
Maintained by Aaron Chafetz. Last updated 1 year ago.
34.4 match 1 stars 4.92 score 46 scripts
owp-spatial
reference.fabric: Hydrological Reference Fabric Tools
Development tools and `targets` pipeline for generating a national hydrological geospatial reference fabric.
Maintained by Justin Singh-Mohudpur. Last updated 2 months ago.
56.0 match 1 stars 3.00 score
winvector
vtreat: A Statistically Sound 'data.frame' Processor/Conditioner
A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). Reference: "'vtreat': a data.frame Processor for Predictive Modeling", Zumel, Mount, 2016, <DOI:10.5281/zenodo.1173313>.
Maintained by John Mount. Last updated 2 months ago.
categorical-variablesmachine-learning-algorithmsnested-modelsprepare-data
14.8 match 285 stars 11.19 score 328 scripts 1 dependents
mlr-org
mlr3pipelines: Preprocessing Operators and Pipelines for 'mlr3'
Dataflow programming toolkit that enriches 'mlr3' with a diverse set of pipelining operators ('PipeOps') that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can therefore be resampled, benchmarked, and tuned.
Maintained by Martin Binder. Last updated 7 days ago.
baggingdata-sciencedataflow-programmingensemble-learningmachine-learningmlr3pipelinespreprocessingstacking
13.0 match 141 stars 12.36 score 448 scripts 7 dependents
t-kalinowski
keras: R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
14.1 match 10.82 score 10k scripts 54 dependents
flr
FLasher: Projection and Forecasting of Fish Populations, Stocks and Fleets
Projection of future population and fishery dynamics is carried out for a given set of management targets. A system of equations is solved, using Automatic Differentiation (AD), for the levels of effort by fishery (fleet) that will result in the required abundances, catches or fishing mortalities.
Maintained by Iago Mosqueira. Last updated 8 days ago.
22.0 match 2 stars 6.86 score 254 scripts 6 dependents
bioc
OmnipathR: OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Maintained by Denes Turei. Last updated 17 days ago.
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteinsquarto
14.4 match 126 stars 9.90 score 226 scripts 2 dependents
choonghyunryu
dlookr: Tools for Data Diagnosis, Exploration, Transformation
A collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information and visualization of the descriptive statistics of univariate variables, normality tests and outliers, correlation of two variables, and the relationship between the target variable and predictor. Data transformation supports binning for categorizing continuous variables, imputes missing values and outliers, and resolves skewness. And it creates automated reports that support these three tasks.
Maintained by Choonghyun Ryu. Last updated 9 months ago.
12.4 match 212 stars 11.05 score 748 scripts 2 dependents
bioc
crisprViz: Visualization Functions for CRISPR gRNAs
Provides functionalities to visualize and contextualize CRISPR guide RNAs (gRNAs) on genomic tracks across nucleases and applications. Works in conjunction with the crisprBase and crisprDesign Bioconductor packages. Plots are produced using the Gviz framework.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsgenetargetbioconductorbioconductor-packagecrispr-analysiscrispr-designgrnagrna-sequencegrna-sequencessgrnasgrna-designvisualization
20.0 match 7 stars 6.23 score 6 scripts 2 dependents
pik-piam
mrremind: MadRat REMIND Input Data Package
The mrremind package contains data preprocessing for the REMIND model.
Maintained by Lavinia Baumstark. Last updated 2 days ago.
19.8 match 4 stars 6.25 score 15 scripts 1 dependents
jeffreyhanson
raptr: Representative and Adequate Prioritization Toolkit in R
Biodiversity is in crisis. The overarching aim of conservation is to preserve biodiversity patterns and processes. To this end, protected areas are established to buffer species and preserve biodiversity processes. But resources are limited and so protected areas must be cost-effective. This package contains tools to generate plans for protected areas (prioritizations), using spatially explicit targets for biodiversity patterns and processes. To obtain solutions in a feasible amount of time, this package uses the commercial 'Gurobi' software (obtained from <https://www.gurobi.com/>). For more information on using this package, see Hanson et al. (2018) <doi:10.1111/2041-210X.12862>.
Maintained by Jeffrey O Hanson. Last updated 1 year ago.
22.0 match 8 stars 5.52 score 83 scripts
bioc
genomation: Summary, annotation and visualization of genomic data
A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.
Maintained by Altuna Akalin. Last updated 5 months ago.
annotationsequencingvisualizationcpgislandcpp
10.9 match 75 stars 11.09 score 738 scripts 5 dependents
ropensci
stantargets: Targets for Stan Workflows
Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the 'stantargets' R package leverages 'targets' and 'cmdstanr' to ease these burdens. 'stantargets' makes it super easy to set up scalable Stan pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than 'targets' alone. 'stantargets' can access all of 'cmdstanr''s major algorithms (MCMC, variational Bayes, and optimization) and it supports both single-fit workflows and multi-rep simulation studies. For the statistical methodology, please refer to 'Stan' documentation (Stan Development Team 2020) <https://mc-stan.org/>.
Maintained by William Michael Landau. Last updated 1 months ago.
bayesianhigh-performance-computingmaker-targetopiareproducibilitystanstatisticstargets
17.5 match 49 stars 6.85 score 180 scripts
bioc
IntramiRExploreR: Predicting Targets for Drosophila Intragenic miRNAs
Intra-miR-ExploreR, an integrative miRNA target prediction bioinformatics tool, identifies targets combining expression and biophysical interactions of a given microRNA (miR). Using the tool, we have identified targets for 92 intragenic miRs in D. melanogaster, using available microarray expression data from the Affymetrix 1 and Affymetrix 2 microarray platforms, providing a global perspective of intragenic miR targets in Drosophila. Predicted targets are grouped according to biological functions using the DAVID Gene Ontology tool and are ranked based on a biologically relevant scoring system, enabling the user to identify functionally relevant targets for a given miR.
Maintained by Surajit Bhattacharya. Last updated 5 months ago.
softwaremicroarraygenetargetstatisticalmethodgeneexpressiongeneprediction
25.0 match 4.60 score 4 scripts
bioc
crisprBase: Base functions and classes for CRISPR gRNA design
Provides S4 classes for general nucleases, CRISPR nucleases, CRISPR nickases, and base editors. Several CRISPR-specific genome arithmetic functions are implemented to help extract genomic coordinates of spacer and protospacer sequences. Commonly-used CRISPR nuclease objects are provided that can be readily used in other packages. Both DNA- and RNA-targeting nucleases are supported.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequences
15.9 match 5 stars 7.15 score 52 scripts 6 dependents
ropensci
jagstargets: Targets for JAGS Pipelines
Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the 'jagstargets' R package leverages 'targets' and 'R2jags' to ease this burden. 'jagstargets' makes it super easy to set up scalable JAGS pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than 'targets' alone. For the underlying methodology, please refer to the documentation of 'targets' <doi:10.21105/joss.02959> and 'JAGS' (Plummer 2003) <https://www.r-project.org/conferences/DSC-2003/Proceedings/Plummer.pdf>.
Maintained by William Michael Landau. Last updated 3 months ago.
bayesianhigh-performance-computingjagsmaker-targetopiareproducibilityrjagsstatisticstargetscpp
16.1 match 10 stars 7.01 score 32 scripts
tlverse
tmle3: The Extensible TMLE Framework
A general framework supporting the implementation of targeted maximum likelihood estimators (TMLEs) of a diverse range of statistical target parameters through a unified interface. The goal is that the exposed framework be as general as the mathematical framework upon which it draws.
Maintained by Jeremy Coyle. Last updated 4 months ago.
causal-inferencemachine-learningtargeted-learningvariable-importance
13.8 match 38 stars 7.91 score 286 scripts 5 dependents
tlverse
tmle3shift: Targeted Learning of the Causal Effects of Stochastic Interventions
Targeted maximum likelihood estimation (TMLE) of population-level causal effects under stochastic treatment regimes and related nonparametric variable importance analyses. Tools are provided for TML estimation of the counterfactual mean under a stochastic intervention characterized as a modified treatment policy, such as treatment policies that shift the natural value of the exposure. The causal parameter and estimation were described in Díaz and van der Laan (2013) <doi:10.1111/j.1541-0420.2011.01685.x> and an improved estimation approach was given by Díaz and van der Laan (2018) <doi:10.1007/978-3-319-65304-4_14>.
Maintained by Nima Hejazi. Last updated 6 months ago.
causal-inferencemachine-learningmarginal-structural-modelsstochastic-interventionstargeted-learningtreatment-effectsvariable-importance
20.4 match 17 stars 5.33 score 42 scripts 1 dependents
ropensci
gittargets: Data Version Control for the Targets Package
In computationally demanding data analysis pipelines, the 'targets' R package (2021, <doi:10.21105/joss.02959>) maintains an up-to-date set of results while skipping tasks that do not need to rerun. This process increases speed and increases trust in the final end product. However, it also overwrites old output with new output, and past results disappear by default. To preserve historical output, the 'gittargets' package captures version-controlled snapshots of the data store, and each snapshot links to the underlying commit of the source code. That way, when the user rolls back the code to a previous branch or commit, 'gittargets' can recover the data contemporaneous with that commit so that all targets remain up to date.
Maintained by William Michael Landau. Last updated 8 months ago.
data-sciencedata-version-controldata-versioningreproducibilityreproducible-researchtargetsworkflow
17.4 match 88 stars 5.99 score 11 scripts
bioc
TEQC: Quality control for target capture experiments
Target capture experiments combine hybridization-based (in solution or on microarrays) capture and enrichment of genomic regions of interest (e.g. the exome) with high throughput sequencing of the captured DNA fragments. This package provides functionalities for assessing and visualizing the quality of the target enrichment process, like specificity and sensitivity of the capture, per-target read coverage and so on.
Maintained by Sarah Bonnin. Last updated 5 months ago.
qualitycontrolmicroarraysequencinggenetics
24.0 match 4.30 score 8 scripts
daranzolin
sqltargets: 'Targets' Extension for 'SQL' Queries
Provides an extension for 'SQL' queries as separate files within 'targets' pipelines. The shorthand creates two targets, the query file and the query result.
Maintained by David Ranzolin. Last updated 5 months ago.
18.0 match 39 stars 5.72 score 18 scripts
tbep-tech
tbeptools: Data and Indicators for the Tampa Bay Estuary Program
Several functions are provided for working with Tampa Bay Estuary Program data and indicators, including the water quality report card, tidal creek assessments, Tampa Bay Nekton Index, Tampa Bay Benthic Index, seagrass transect data, habitat report card, and fecal indicator bacteria. Additional functions are provided for miscellaneous tasks, such as reference library curation.
Maintained by Marcus Beck. Last updated 8 days ago.
data-analysistampa-baytbepwater-quality
13.0 match 10 stars 7.86 score 133 scripts
bioc
limma: Linear Models for Microarray and Omics Data
Data analysis, linear models and differential expression for omics data.
Maintained by Gordon Smyth. Last updated 4 days ago.
exonarraygeneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicinggenesetenrichmentdataimportbayesianclusteringregressiontimecoursemicroarraymicrornaarraymrnamicroarrayonechannelproprietaryplatformstwochannelsequencingrnaseqbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrolbiomedicalinformaticscellbiologycheminformaticsepigeneticsfunctionalgenomicsgeneticsimmunooncologymetabolomicsproteomicssystemsbiologytranscriptomics
7.4 match 13.81 score 16k scripts 585 dependents
statnet
ergm: Fit, Simulate and Diagnose Exponential-Family Models for Networks
An integrated set of tools to analyze and simulate networks based on exponential-family random graph models (ERGMs). 'ergm' is a part of the Statnet suite of packages for network analysis. See Hunter, Handcock, Butts, Goodreau, and Morris (2008) <doi:10.18637/jss.v024.i03> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>.
Maintained by Pavel N. Krivitsky. Last updated 5 days ago.
6.4 match 100 stars 15.36 score 1.4k scripts 36 dependents
bioc
crisprShiny: Exploring curated CRISPR gRNAs via Shiny
Provides means to interactively visualize guide RNAs (gRNAs) in GuideSet objects via a Shiny application. This GUI can be self-contained or used as a module within a larger Shiny app. The content of the app reflects the annotations present in the passed GuideSet object, and includes intuitive tools to examine, filter, and export gRNAs, thereby making gRNA design more user-friendly.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsgenetargetguicrispr-analysiscrispr-designshiny
20.9 match 2 stars 4.48 score 8 scriptsiohprofiler
IOHanalyzer:Data Analysis Part of 'IOHprofiler'
The data analysis module for the Iterative Optimization Heuristics Profiler ('IOHprofiler'). This module provides statistical analysis methods for the benchmark data generated by optimization heuristics, which can be visualized through a web-based interface. The benchmark data is usually generated by the experimentation module, called 'IOHexperimenter'. 'IOHanalyzer' also supports the widely used 'COCO' (Comparing Continuous Optimisers) data format for benchmarking.
Maintained by Diederick Vermetten. Last updated 10 months ago.
18.1 match 24 stars 5.10 score 13 scriptsazure
azuremlsdk:Interface to the 'Azure Machine Learning' 'SDK'
Interface to the 'Azure Machine Learning' Software Development Kit ('SDK'). Data scientists can use the 'SDK' to train, deploy, automate, and manage machine learning models on the 'Azure Machine Learning' service. To learn more about 'Azure Machine Learning' visit the website: <https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml>.
Maintained by Diondra Peck. Last updated 3 years ago.
amlcomputeazureazure-machine-learningazuremldsimachine-learningrstudiosdk-r
10.3 match 106 stars 8.91 score 221 scriptsdimitri-justeau
rflsgen:Neutral Landscape Generator with Targets on Landscape Indices
Interface to the 'flsgen' neutral landscape generator <https://github.com/dimitri-justeau/flsgen>. It allows users to generate fractal terrain, generate landscape structures satisfying user targets over landscape indices, and generate landscape rasters from landscape structures.
Maintained by Dimitri Justeau-Allaire. Last updated 10 months ago.
16.7 match 12 stars 5.48 score 6 scriptsdovinij
GxEprs:Genotype-by-Environment Interaction in Polygenic Score Models
A novel PRS model is introduced to enhance the prediction accuracy by utilising GxE effects. This package performs Genome Wide Association Studies (GWAS) and Genome Wide Environment Interaction Studies (GWEIS) using a discovery dataset. The package has the ability to obtain polygenic risk scores (PRSs) for a target sample. Finally it predicts the risk values of each individual in the target sample. Users have the choice of using existing models (Li et al., 2015) <doi:10.1093/annonc/mdu565>, (Pandis et al., 2013) <doi:10.1093/ejo/cjt054>, (Peyrot et al., 2018) <doi:10.1016/j.biopsych.2017.09.009> and (Song et al., 2022) <doi:10.1038/s41467-022-32407-9>, as well as newly proposed models for genomic risk prediction (refer to the URL for more details).
Maintained by Dovini Jayasinghe. Last updated 10 months ago.
27.5 match 2 stars 3.30 scorerickhelmus
patRoon:Workflows for Mass-Spectrometry Based Non-Target Analysis
Provides an easy-to-use interface to a mass spectrometry based non-target analysis workflow. Various (open-source) tools are combined which provide algorithms for extraction and grouping of features, extraction of MS and MS/MS data, automatic formula and compound annotation and grouping related features to components. In addition, various tools are provided for e.g. data preparation and cleanup, plotting results and automatic reporting.
Maintained by Rick Helmus. Last updated 9 days ago.
mass-spectrometrynon-targetcppopenjdk
14.6 match 65 stars 6.22 score 43 scriptssdctools
sdcMicro:Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation
Data from statistical agencies and other institutions are mostly confidential. This package, introduced in Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v067.i04>, can be used for the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files. The theoretical basis for the methods implemented can be found in Templ (2017) <doi:10.1007/978-3-319-50272-4>. Various risk estimation and anonymization methods are included. Note that the package includes a graphical user interface, published in Meindl and Templ (2019) <doi:10.3390/a12090191>, that gives access to various methods of this package.
Maintained by Matthias Templ. Last updated 25 days ago.
9.1 match 83 stars 9.89 score 258 scriptsyufree
pmd:Paired Mass Distance Analysis for GC/LC-MS Based Non-Targeted Analysis and Reactomics Analysis
Paired mass distance (PMD) analysis proposed in Yu, Olkowicz and Pawliszyn (2018) <doi:10.1016/j.aca.2018.10.062> and PMD based reactomics analysis proposed in Yu and Petrick (2020) <doi:10.1038/s42004-020-00403-z> for gas/liquid chromatography–mass spectrometry (GC/LC-MS) based non-targeted analysis. PMD analysis includes the GlobalStd algorithm and structure/reaction directed analysis. The GlobalStd algorithm can find independent peaks in m/z-retention time profiles based on retention time hierarchical cluster analysis and frequency analysis of paired mass distances within retention time groups. Structure directed analysis can be used to find potential relationships among those independent peaks in different retention time groups based on the frequency of paired mass distances. Reactomics analysis can also be performed to build a PMD network, assign sources and discover biomarker reactions. GUIs for PMD analysis are also included as 'shiny' applications.
Maintained by Miao YU. Last updated 2 months ago.
mass-spectrometrymetabolomicsnon-target
13.4 match 10 stars 6.68 score 40 scriptscivisanalytics
civis:R Client for the 'Civis Platform API'
A convenient interface for making requests directly to the 'Civis Platform API' <https://www.civisanalytics.com/platform/>. Full documentation available 'here' <https://civisanalytics.github.io/civis-r/>.
Maintained by Peter Cooman. Last updated 2 months ago.
11.4 match 16 stars 7.84 score 144 scriptssmbc-nzp
MigConnectivity:Estimate Migratory Connectivity for Migratory Animals
Allows the user to estimate transition probabilities for migratory animals between any two phases of the annual cycle, using a variety of different data types. Also quantifies the strength of migratory connectivity (MC), a standardized metric to quantify the extent to which populations co-occur between two phases of the annual cycle. Includes functions to estimate MC and the more traditional metric of migratory connectivity strength (Mantel correlation) incorporating uncertainty from multiple sources of sampling error. For cross-species comparisons, methods are provided to estimate differences in migratory connectivity strength, incorporating uncertainty. See Cohen et al. (2018) <doi:10.1111/2041-210X.12916>, Cohen et al. (2019) <doi:10.1111/ecog.03974>, and Roberts et al. (2023) <doi:10.1002/eap.2788> for details on some of these methods.
Maintained by Jeffrey A. Hostetler. Last updated 12 months ago.
13.1 match 8 stars 6.77 score 41 scriptsbioc
DECIPHER:Tools for curating, analyzing, and manipulating biological sequences
A toolset for deciphering and managing biological sequences.
Maintained by Erik Wright. Last updated 4 days ago.
clusteringgeneticssequencingdataimportvisualizationmicroarrayqualitycontrolqpcralignmentwholegenomemicrobiomeimmunooncologygenepredictionopenmp
10.3 match 8.40 score 1.1k scripts 14 dependentsprodriguezsosa
conText:'a la Carte' on Text (ConText) Embedding Regression
A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.
Maintained by Pedro L. Rodriguez. Last updated 11 months ago.
9.2 match 104 stars 9.40 score 1.7k scriptsdpc10ster
RJafroc:Artificial Intelligence Systems and Observer Performance
Analyzing the performance of artificial intelligence (AI) systems/algorithms characterized by a 'search-and-report' strategy. Historically observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where lesion localization information is used. A book using the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: <https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840>. Online updates to this book, which use the software, are at <https://dpc10ster.github.io/RJafrocQuickStart/>, <https://dpc10ster.github.io/RJafrocRocBook/> and at <https://dpc10ster.github.io/RJafrocFrocBook/>. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data consists of a rating and a location of the most suspicious region, for every image.
Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict 'proper' ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, <https://github.com/dpc10ster/WindowsJafroc>. Package functions are organized as follows. Data file related function names are preceded by 'Df', curve fitting functions by 'Fit', included data sets by 'dataset', plotting functions by 'Plot', significance testing functions by 'St', sample size related functions by 'Ss', data simulation functions by 'Simulate' and utility functions by 'Util'. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical or fitted operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via either Dorfman-Berbaum-Metz or the Obuchowski-Rockette methods. Also implemented is single modality analysis, which allows comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists interpreting the same cases. 
Crossed-modality analysis is implemented wherein there are two crossed modality factors and the aim is to determine performance in each modality factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification.
Maintained by Dev Chakraborty. Last updated 5 months ago.
ai-optimizationartificial-intelligence-algorithmscomputer-aided-diagnosisfroc-analysisroc-analysistarget-classificationtarget-localizationcpp
15.0 match 19 stars 5.69 score 65 scriptsbioc
multiMiR:Integration of multiple microRNA-target databases with their disease and drug associations
A collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).
Maintained by Spencer Mahaffey. Last updated 5 months ago.
mirnadatahomo_sapiens_datamus_musculus_datarattus_norvegicus_dataorganismdatamicrorna-sequencesql
10.0 match 20 stars 8.45 score 141 scriptsbioc
XNAString:Efficient Manipulation of Modified Oligonucleotide Sequences
The XNAString package allows for description of base sequences and associated chemical modifications in a single object. XNAString is able to capture single stranded, as well as double stranded molecules. Chemical modifications are represented as independent strings associated with different features of the molecules (base sequence, sugar sequence, backbone sequence, modifications) and can be read from or written to HELM notation. It also enables secondary structure prediction using RNAfold from ViennaRNA. XNAString is designed to be an efficient representation of nucleic-acid based therapeutics; it therefore stores information about target sequences and provides an interface to matching and alignment functions from the Biostrings and pwalign packages.
Maintained by Marianna Plucinska. Last updated 5 months ago.
sequencematchingalignmentsequencinggeneticscpp
20.0 match 4.18 score 4 scriptsbentaylor1
lgcp:Log-Gaussian Cox Process
Spatial and spatio-temporal modelling of point patterns using the log-Gaussian Cox process. Bayesian inference for spatial, spatiotemporal, multivariate and aggregated point processes using Markov chain Monte Carlo. See Benjamin M. Taylor, Tilman M. Davies, Barry S. Rowlingson, Peter J. Diggle (2015) <doi:10.18637/jss.v063.i07>.
Maintained by Benjamin M. Taylor. Last updated 1 year ago.
23.3 match 3.59 score 27 scriptsbioc
scMultiSim:Simulation of Multi-Modality Single Cell Data Guided By Gene Regulatory Networks and Cell-Cell Interactions
scMultiSim simulates paired single cell RNA-seq, single cell ATAC-seq and RNA velocity data, while incorporating mechanisms of gene regulatory networks, chromatin accessibility and cell-cell interactions. It allows users to tune various parameters controlling the amount of each biological factor, variation of gene-expression levels, the influence of chromatin accessibility on RNA sequence data, and so on. It can be used to benchmark various computational methods for single cell multi-omics data, and to assist in experimental design of wet-lab experiments.
Maintained by Hechen Li. Last updated 5 months ago.
singlecelltranscriptomicsgeneexpressionsequencingexperimentaldesign
11.4 match 23 stars 7.15 score 11 scriptsbioc
TAPseq:Targeted scRNA-seq primer design for TAP-seq
Design primers for targeted single-cell RNA-seq used by TAP-seq. Create sequence templates for target gene panels and design gene-specific primers using Primer3. Potential off-targets can be estimated with BLAST. Requires working installations of Primer3 and BLASTn.
Maintained by Andreas R. Gschwind. Last updated 5 months ago.
singlecellsequencingtechnologycrisprpooledscreens
15.0 match 4 stars 5.38 score 9 scriptsbioc
peakPantheR:Peak Picking and Annotation of High Resolution Experiments
An automated pipeline for the detection, integration and reporting of predefined features across a large number of mass spectrometry data files. It enables the real time annotation of multiple compounds in a single file, or the parallel annotation of multiple compounds in multiple files. A graphical user interface as well as command line functions will assist in assessing the quality of annotation and update fitting parameters until a satisfactory result is obtained.
Maintained by Arnaud Wolfer. Last updated 5 months ago.
massspectrometrymetabolomicspeakdetectionfeature-detectionmass-spectrometry
11.8 match 12 stars 6.82 score 23 scriptsbioc
dittoSeq:User Friendly Single-Cell and Bulk RNA Sequencing Visualization
A universal, user friendly, single-cell and bulk RNA sequencing visualization toolkit that allows highly customizable creation of color blindness friendly, publication-quality figures. dittoSeq accepts both SingleCellExperiment (SCE) and Seurat objects, as well as the import and usage, via conversion to an SCE, of SummarizedExperiment or DGEList bulk data. Visualizations include dimensionality reduction plots, heatmaps, scatterplots, percent composition or expression across groups, and more. Customizations range from size and title adjustments to automatic generation of annotations for heatmaps, overlay of trajectory analysis onto any dimensionality reduction plot, hidden data overlay upon cursor hovering via ggplotly conversion, and many more. All with simple, discrete inputs. Color blindness friendliness is powered by legend adjustments (enlarged keys), and by allowing the use of shapes or letter-overlay in addition to the carefully selected dittoColors().
Maintained by Daniel Bunis. Last updated 5 months ago.
softwarevisualizationrnaseqsinglecellgeneexpressiontranscriptomicsdataimport
10.5 match 7.56 score 760 scripts 2 dependentspbiecek
bgmm:Gaussian Mixture Modeling Algorithms and the Belief-Based Mixture Modeling
Two partially supervised mixture modeling methods, soft-label and belief-based modeling, are implemented. For completeness, the package also provides unsupervised, semi-supervised and fully supervised mixture modeling. The package can also be used to select the best-fitting model from a set of models with different component numbers or constraints on their structures. For a detailed introduction, see: Przemyslaw Biecek, Ewa Szczurek, Martin Vingron, Jerzy Tiuryn (2012), The R Package bgmm: Mixture Modeling with Uncertain Knowledge, Journal of Statistical Software <doi:10.18637/jss.v047.i03>.
Maintained by Przemyslaw Biecek. Last updated 2 years ago.
18.7 match 2 stars 4.22 score 55 scripts 1 dependentsbioc
MethReg:Assessing the regulatory potential of DNA methylation regions or sites on gene transcription
Epigenome-wide association studies (EWAS) detect a large number of DNA methylation differences, often hundreds of differentially methylated regions and thousands of CpGs, that are significantly associated with a disease; many are located in non-coding regions. Therefore, there is a critical need to better understand the functional impact of these CpG methylations and to further prioritize the significant changes. MethReg is an R package for integrative modeling of DNA methylation, target gene expression and transcription factor binding sites data, to systematically identify and rank functional CpG methylations. MethReg evaluates, prioritizes and annotates CpG sites with high regulatory potential using matched methylation and gene expression data, along with external TF-target interaction databases based on manual curation, ChIP-seq experiments or gene regulatory network analysis.
Maintained by Tiago Silva. Last updated 5 months ago.
methylationarrayregressiongeneexpressionepigeneticsgenetargettranscription
14.4 match 5 stars 5.45 score 19 scriptsbioc
multicrispr:Multi-locus multi-purpose Crispr/Cas design
This package is for designing Crispr/Cas9 and Prime Editing experiments. It contains functions to (1) define and transform genomic targets, (2) find spacers, (4) count offtarget (mis)matches, and (5) compute Doench2016/2014 targeting efficiency. Care has been taken for multicrispr to scale well towards large target sets, enabling the design of large Crispr/Cas9 libraries.
Maintained by Aditya Bhagwat. Last updated 4 months ago.
13.8 match 5.65 score 2 scriptsropensci
redland:RDF Library Bindings in R
Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at <https://www.w3.org/TR/rdf-primer/>. This package supports RDF by implementing an R interface to the Redland RDF C library, described at <https://librdf.org/docs/api/index.html>. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.
Maintained by Matthew B. Jones. Last updated 1 year ago.
9.8 match 17 stars 7.85 score 98 scripts 13 dependentsbioc
signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis
This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.
Maintained by Brendan Gongol. Last updated 5 months ago.
softwaregeneexpressiongokeggnetworkenrichmentsequencingcoveragedifferentialexpressioncpp
10.6 match 17 stars 7.18 score 74 scripts 1 dependentscfwp
rags2ridges:Ridge Estimation of Precision Matrices from High-Dimensional Data
Proper L2-penalized maximum likelihood estimators for precision matrices and supporting functions to employ these estimators in a graphical modeling setting. For details, see Peeters, Bilgrau, & van Wieringen (2022) <doi:10.18637/jss.v102.i04> and associated publications.
Maintained by Carel F.W. Peeters. Last updated 1 year ago.
c-plus-plusgraphical-modelsmachine-learningnetworksciencestatisticsopenblascpp
13.2 match 8 stars 5.60 score 46 scriptsbioc
SIMAT:GC-SIM-MS data processing and analysis tool
This package provides a pipeline for analysis of GC-MS data acquired in selected ion monitoring (SIM) mode. The tool also provides guidance in choosing appropriate fragments for the targets of interest by using an optimization algorithm. This is done by considering overlapping peaks from a user-provided library.
Maintained by M. R. Nezami Ranjbar. Last updated 5 months ago.
immunooncologysoftwaremetabolomicsmassspectrometry
17.3 match 4.26 score 1 scriptsbioc
fCI:f-divergence Cutoff Index for Differential Expression Analysis in Transcriptomics and Proteomics
fCI (f-divergence Cutoff Index) finds DEGs in transcriptomic and proteomic data, identifying them by computing the difference between the distribution of fold-changes for the control-control and remaining (non-differential) case-control gene expression ratio data. fCI provides several advantages compared to existing methods.
Maintained by Shaojun Tang. Last updated 5 months ago.
22.0 match 3.30 score 5 scriptsbioc
miRSM:Inferring miRNA sponge modules in heterogeneous data
The package aims to identify miRNA sponge or ceRNA modules in heterogeneous data. It provides several functions to study miRNA sponge modules at single-sample and multi-sample levels, including popular methods for inferring gene modules (candidate miRNA sponge or ceRNA modules), and two functions to identify miRNA sponge modules at single-sample and multi-sample levels, as well as several functions to conduct modular analysis of miRNA sponge modules.
Maintained by Junpeng Zhang. Last updated 5 months ago.
geneexpressionbiomedicalinformaticsclusteringgenesetenrichmentmicroarraysoftwaregeneregulationgenetargetcernamirnamirna-spongemirna-targetsmodulesopenjdk
12.8 match 4 stars 5.68 score 5 scriptsnhs-r-community
NHSRwaitinglist:R-package to implement a waiting list management approach
R package implementing the waiting list management approach described in the paper by Fong et al. (2022).
Maintained by Tom Smith. Last updated 4 months ago.
11.6 match 16 stars 6.06 score 17 scriptsgreat-northern-diver
loon:Interactive Statistical Data Visualization
An extendable toolkit for interactive data visualization and exploration.
Maintained by R. Wayne Oldford. Last updated 2 years ago.
data-analysisdata-sciencedata-visualizationexploratory-analysisexploratory-data-analysishigh-dimensional-datainteractive-graphicsinteractive-visualizationsloonpythonstatistical-analysisstatistical-graphicsstatisticstcl-extensiontk
7.5 match 48 stars 9.00 score 93 scripts 5 dependentsbioc
biotmle:Targeted Learning with Moderated Statistics for Biomarker Discovery
Tools for differential expression biomarker discovery based on microarray and next-generation sequencing data that leverage efficient semiparametric estimators of the average treatment effect for variable importance analysis. Estimation and inference of the (marginal) average treatment effects of potential biomarkers are computed by targeted minimum loss-based estimation, with joint, stable inference constructed across all biomarkers using a generalization of moderated statistics for use with the estimated efficient influence function. The procedure accommodates the use of ensemble machine learning for the estimation of nuisance functions.
Maintained by Nima Hejazi. Last updated 5 months ago.
regressiongeneexpressiondifferentialexpressionsequencingmicroarrayrnaseqimmunooncologybioconductorbioconductor-packagebioconductor-packagesbioinformaticsbiomarker-discoverybiostatisticscausal-inferencecomputational-biologymachine-learningstatisticstargeted-learning
12.6 match 5 stars 5.30 score 5 scriptsdaya6489
SmartEDA:Summarize and Explore the Data
Exploratory analysis on any input data describing the structure and the relationships present in the data. The package automatically selects the variables and computes the related descriptive statistics. Information value, weight of evidence, custom tables, summary statistics and graphical techniques are provided for both numeric and categorical predictors.
Maintained by Dayanand Ubrangala. Last updated 1 year ago.
analysisexploratory-data-analysis
9.1 match 42 stars 7.25 score 214 scriptsbioc
preprocessCore:A collection of pre-processing functions
A library of core preprocessing routines.
Maintained by Ben Bolstad. Last updated 5 months ago.
5.5 match 19 stars 12.03 score 1.8k scripts 204 dependentspablo14
funModeling:Exploratory Data Analysis and Data Preparation Tool-Box
Around 10% of almost any predictive modeling project is spent in predictive modeling. 'funModeling' and the book Data Science Live Book (<https://livebook.datascienceheroes.com/>) are intended to cover the remaining 90%: data preparation, profiling, selecting best variables 'dataViz', assessing model performance and other functions.
Maintained by Pablo Casas. Last updated 2 years ago.
7.6 match 100 stars 8.57 score 654 scriptsirinagain
iglu:Interpreting Glucose Data from Continuous Glucose Monitors
Implements a wide range of metrics for measuring glucose control and glucose variability based on continuous glucose monitoring data. The list of implemented metrics is summarized in Rodbard (2009) <doi:10.1089/dia.2009.0015>. Additional visualization tools include time-series plots, lasagna plots and ambulatory glucose profile report.
Maintained by Irina Gaynanova. Last updated 9 days ago.
7.2 match 26 stars 9.00 score 39 scriptsbioc
decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data
Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.
Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.
differentialexpressionfunctionalgenomicsgeneexpressiongeneregulationnetworksoftwarestatisticalmethodtranscription
5.7 match 230 stars 11.27 score 316 scripts 3 dependentsbioc
TFutils:TFutils
This package helps users to work with TF metadata from various sources. Significant catalogs of TFs and classifications thereof are made available. Tools for working with motif scans are also provided.
Maintained by Vincent Carey. Last updated 4 months ago.
13.1 match 4.80 score 21 scriptsbioc
FindIT2:find influential TF and Target based on multi-omics data
This package implements functions to find influential TFs and targets based on different input types. It has five modules: multi-peak multi-gene annotation (mmPeakAnno module), calculating regulation potential (calcRP module), finding influential targets based on ChIP-Seq and RNA-Seq data (Find influential Target module), finding influential TFs based on different inputs (Find influential TF module), and calculating peak-gene or peak-peak correlation (peakGeneCor module). There are also other useful functions, for example to integrate information from different sources or to calculate Jaccard similarity for your TF.
Maintained by Guandong Shang. Last updated 5 months ago.
softwareannotationchipseqatacseqgeneregulationmultiplecomparisongenetarget
12.0 match 6 stars 5.26 score 7 scriptsbioc
DeepTarget:Deep characterization of cancer drugs
This package predicts a drug's primary target(s) or secondary target(s) by integrating large-scale genetic and drug screens from the Cancer Dependency Map project run by the Broad Institute. It further investigates whether the drug specifically targets the wild-type or mutated target forms. To show how to use this package in practice, sample data are provided along with a step-by-step example.
Maintained by Trinh Nguyen. Last updated 5 months ago.
genetargetgenepredictionpathwaysgeneexpressionrnaseqimmunooncologydifferentialexpressiongenesetenrichmentreportwritingcrispr
13.8 match 4.54 score 1 scriptsrichardgeveritt
ggsmc:Visualising Output from Sequential Monte Carlo and Ensemble-Based Methods
Functions for plotting, and animating, the output of importance samplers, sequential Monte Carlo samplers (SMC) and ensemble-based methods. The package can be used to plot and animate histograms, densities, scatter plots and time series, and to plot the genealogy of an SMC or ensemble-based algorithm. These functions all rely on algorithm output to be supplied in tidy format. A function is provided to transform algorithm output from matrix format (one Monte Carlo point per row) to the tidy format required by the plotting and animating functions.
Maintained by Richard G Everitt. Last updated 2 months ago.
13.6 match 4.48 score 6 scriptsadrianhordyk
DLMtool:Data-Limited Methods Toolkit
A collection of data-limited management procedures that can be evaluated with management strategy evaluation with the 'MSEtool' package, or applied to fishery data to provide management recommendations.
Maintained by Adrian Hordyk. Last updated 3 years ago.
16.5 match 1 stars 3.67 score 229 scripts 1 dependentsnyuglobalties
blueprintr:Automagically Document and Test Datasets Using Targets Or Drake
Documents and tests datasets in a reproducible manner so that data lineage is easier to comprehend for small to medium tabular data. Originally designed to aid data cleaning tasks for humanitarian research groups, specifically large-scale longitudinal studies.
Maintained by Patrick Anker. Last updated 8 months ago.
17.7 match 1 stars 3.40 score 7 scriptsnt-williams
lmtp:Non-Parametric Causal Effects of Feasible Interventions Based on Modified Treatment Policies
Non-parametric estimators for causal effects based on longitudinal modified treatment policies as described in Diaz, Williams, Hoffman, and Schenck <doi:10.1080/01621459.2021.1955691>, traditional point treatment, and traditional longitudinal effects. Continuous, binary, categorical treatments, and multivariate treatments are allowed, as well as censored outcomes. The treatment mechanism is estimated via a density ratio classification procedure irrespective of treatment variable type. For both continuous and binary outcomes, additive treatment effects can be calculated and relative risks and odds ratios may be calculated for binary outcomes. Supports survival outcomes with competing risks (Diaz, Hoffman, and Hejazi; <doi:10.1007/s10985-023-09606-7>).
Maintained by Nicholas Williams. Last updated 8 days ago.
causal-inferencecensored-datalongitudinal-datamachine-learningmodified-treatment-policynonparametric-statisticsprecision-medicinerobust-statisticsstatisticsstochastic-interventionssurvival-analysistargeted-learning
9.3 match 64 stars 6.37 score 91 scriptstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
7.2 match 3 stars 8.20 score 7.8k scripts 11 dependentsbioc
drugTargetInteractions:Drug-Target Interactions
Provides utilities for identifying drug-target interactions for sets of small molecule or gene/protein identifiers. The required drug-target interaction information is obtained from a local SQLite instance of the ChEMBL database. ChEMBL has been chosen for this purpose because it provides one of the most comprehensive and best annotated knowledge resources for drug-target information available in the public domain.
Maintained by Thomas Girke. Last updated 5 months ago.
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsproteomicsmetabolomics
13.5 match 1 stars 4.34 score 11 scriptstdhock
penaltyLearning:Penalty Learning
Implementations of algorithms from Learning Sparse Penalties for Change-point Detection using Max Margin Interval Regression, by Hocking, Rigaill, Vert, Bach <http://proceedings.mlr.press/v28/hocking13.html> published in proceedings of ICML2013.
Maintained by Toby Dylan Hocking. Last updated 5 months ago.
9.5 match 16 stars 6.13 score 129 scripts 2 dependentscjvanlissa
worcs:Workflow for Open Reproducible Code in Science
Create reproducible and transparent research projects in 'R'. This package is based on the Workflow for Open Reproducible Code in Science (WORCS), a step-by-step procedure based on best practices for Open Science. It includes an 'RStudio' project template, several convenience functions, and all dependencies required to make your project reproducible and transparent. WORCS is explained in the tutorial paper by Van Lissa, Brandmaier, Brinkman, Lamprecht, Struiksma, & Vreede (2021). <doi:10.3233/DS-210031>.
Maintained by Caspar J. Van Lissa. Last updated 10 days ago.
6.2 match 83 stars 9.26 score 59 scriptsjeffreyevans
yaImpute:Nearest Neighbor Observation Imputation and Evaluation Tools
Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.
Maintained by Jeffrey S. Evans. Last updated 6 months ago.
7.7 match 3 stars 7.40 score 94 scripts 12 dependentsbluefoxr
COINr:Composite Indicator Construction and Analysis
A comprehensive high-level package, for composite indicator construction and analysis. It is a "development environment" for composite indicators and scoreboards, which includes utilities for construction (indicator selection, denomination, imputation, data treatment, normalisation, weighting and aggregation) and analysis (multivariate analysis, correlation plotting, short cuts for principal component analysis, global sensitivity analysis, and more). A composite indicator is completely encapsulated inside a single hierarchical list called a "coin". This allows a fast and efficient work flow, as well as making quick copies, testing methodological variations and making comparisons. It also includes many plotting options, both statistical (scatter plots, distribution plots) as well as for presenting results.
Maintained by William Becker. Last updated 2 months ago.
6.3 match 26 stars 9.07 score 73 scripts 1 dependentscausal-lda
TrialEmulation:Causal Analysis of Observational Time-to-Event Data
Implements target trial emulation methods to apply randomized clinical trial design and analysis in an observational setting. Using marginal structural models, it can estimate intention-to-treat and per-protocol effects in emulated trials using electronic health records. A description and application of the method can be found in Danaei et al (2013) <doi:10.1177/0962280211403603>.
Maintained by Isaac Gravestock. Last updated 22 days ago.
causal-inferencelongitudinal-datasurvival-analysiscpp
7.3 match 25 stars 7.72 score 29 scriptsbioc
POMA:Tools for Omics Data Analysis
The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny. Paper: Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.
Maintained by Pol Castellano-Escuder. Last updated 4 months ago.
batcheffectclassificationclusteringdecisiontreedimensionreductionmultidimensionalscalingnormalizationpreprocessingprincipalcomponentregressionrnaseqsoftwarestatisticalmethodvisualizationbioconductorbioinformaticsdata-visualizationdimension-reductionexploratory-data-analysismachine-learningomics-data-integrationpipelinepre-processingstatistical-analysisuser-friendlyworkflow
6.8 match 11 stars 8.23 score 20 scripts 1 dependentsc-rutter
imabc:Incremental Mixture Approximate Bayesian Computation (IMABC)
Provides functionality to perform a likelihood-free method for estimating the parameters of complex models that results in a simulated sample from the posterior distribution of model parameters given targets. The method begins with an accept/reject approximate Bayesian computation (ABC) step applied to a sample of points from the prior distribution of model parameters. Accepted points result in model predictions that are within the initially specified tolerance intervals around the target points. The sample is iteratively updated by drawing additional points from a mixture of multivariate normal distributions, accepting points within tolerance intervals. As the algorithm proceeds, the acceptance intervals are narrowed. The algorithm returns a set of points and sampling weights that account for the adaptive sampling scheme. For more details see Rutter, Ozik, DeYoreo, and Collier (2018) <arXiv:1804.02090>.
Maintained by "Christopher, E. Maerzluft". Last updated 2 years ago.
10.3 match 8 stars 5.38 score 7 scriptsbioc
BulkSignalR:Infer Ligand-Receptor Interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics
Inference of ligand-receptor (LR) interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics. BulkSignalR bases its inferences on the LRdb database included in our other package, SingleCellSignalR, available from Bioconductor. It relies on a statistical model that is specific to bulk data sets. Different visualization and data summary functions are provided to help navigate prediction results.
Maintained by Jean-Philippe Villemin. Last updated 3 months ago.
networkrnaseqsoftwareproteomicstranscriptomicsnetworkinferencespatial
10.6 match 5.22 score 15 scriptsr-forge
pcalg:Methods for Graphical Models and Causal Inference
Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.
Maintained by Markus Kalisch. Last updated 6 months ago.
7.5 match 7.32 score 700 scripts 19 dependentsbioc
EnrichedHeatmap:Making Enriched Heatmaps
Enriched heatmap is a special type of heatmap which visualizes the enrichment of genomic signals on specific target regions. Here we implement enriched heatmap with the ComplexHeatmap package. Since this type of heatmap is just a normal heatmap but with some special settings, with the functionality of ComplexHeatmap, it is much easier to customize the heatmap as well as to concatenate it to a list of heatmaps to show correspondence between different data sources.
Maintained by Zuguang Gu. Last updated 5 months ago.
softwarevisualizationsequencinggenomeannotationcoveragecpp
5.1 match 190 stars 10.87 score 330 scripts 1 dependentsbioc
RAREsim:Simulation of Rare Variant Genetic Data
Haplotype simulations of rare variant genetic data that emulates real data can be performed with RAREsim. RAREsim uses the expected number of variants in MAC bins - either as provided by default parameters or estimated from target data - and an abundance of rare variants as simulated HAPGEN2 to probabilistically prune variants. RAREsim produces haplotypes that emulate real sequencing data with respect to the total number of variants, allele frequency spectrum, haplotype structure, and variant annotation.
Maintained by Ryan Barnard. Last updated 5 months ago.
geneticssoftwarevariantannotationsequencing
11.8 match 4 stars 4.60 score 4 scriptsbioc
gCrisprTools:Suite of Functions for Pooled Crispr Screen QC and Analysis
Set of tools for evaluating pooled high-throughput screening experiments, typically employing CRISPR/Cas9 or shRNA expression cassettes. Contains methods for interrogating library and cassette behavior within an experiment, identifying differentially abundant cassettes, aggregating signals to identify candidate targets for empirical validation, hypothesis testing, and comprehensive reporting. Version 2.0 extends these applications to include a variety of tools for contextualizing and integrating signals across many experiments, incorporates extended signal enrichment methodologies via the "sparrow" package, and streamlines many formal requirements to aid in interpretability.
Maintained by Russell Bainer. Last updated 5 months ago.
immunooncologycrisprpooledscreensexperimentaldesignbiomedicalinformaticscellbiologyfunctionalgenomicspharmacogenomicspharmacogeneticssystemsbiologydifferentialexpressiongenesetenrichmentgeneticsmultiplecomparisonnormalizationpreprocessingqualitycontrolrnaseqregressionsoftwarevisualization
11.3 match 4.78 score 8 scriptsflr
mse:Tools for Running Management Strategy Evaluations using FLR
A set of functions and methods to enable the development and running of Management Strategy Evaluation (MSE) analyses, using the FLR packages and classes and the a4a methods and algorithms.
Maintained by Iago Mosqueira. Last updated 20 days ago.
7.6 match 4 stars 7.04 score 137 scripts 3 dependentsjoshuaschwab
ltmle:Longitudinal Targeted Maximum Likelihood Estimation
Targeted Maximum Likelihood Estimation ('TMLE') of treatment/censoring specific mean outcome or marginal structural model for point-treatment and longitudinal data. Petersen et al. (2014) <doi:10.1515/jci-2013-0007>
Maintained by Joshua Schwab. Last updated 2 years ago.
8.7 match 23 stars 6.15 score 207 scriptsjiefei-wang
aws.ecx:Communicating with AWS EC2 and ECS using AWS REST APIs
Providing the functions for communicating with Amazon Web Services(AWS) Elastic Compute Cloud(EC2) and Elastic Container Service(ECS). The functions will have the prefix 'ecs_' or 'ec2_' depending on the class of the API. The request will be sent via the REST API and the parameters are given by the function argument. The credentials can be set via 'aws_set_credentials'. The EC2 documentation can be found at <https://docs.aws.amazon.com/AWSEC2/latest/APIReference/Welcome.html> and ECS can be found at <https://docs.aws.amazon.com/AmazonECS/latest/APIReference/Welcome.html>.
Maintained by Jiefei Wang. Last updated 3 years ago.
12.7 match 1 stars 4.18 score 2 scriptsreichlab
zoltr:Interface to the 'Zoltar' Forecast Repository API
'Zoltar' <https://www.zoltardata.com/> is a website that provides a repository of model forecast results in a standardized format and a central location. It supports storing, retrieving, comparing, and analyzing time series forecasts for prediction challenges of interest to the modeling community. This package provides functions for working with the 'Zoltar' API, including connecting and authenticating, getting meta information (projects, models, and forecasts, and truth), and uploading, downloading, and deleting forecast and truth data.
Maintained by Matthew Cornell. Last updated 9 days ago.
7.0 match 2 stars 7.58 score 175 scripts 3 dependentsbioc
CrispRVariants:Tools for counting and visualising mutations in a target location
CrispRVariants provides tools for analysing the results of a CRISPR-Cas9 mutagenesis sequencing experiment, or other sequencing experiments where variants within a given region are of interest. These tools allow users to localize variant allele combinations with respect to any genomic location (e.g. the Cas9 cut site), plot allele combinations and calculate mutation rates with flexible filtering of unrelated variants.
Maintained by Helen Lindsay. Last updated 5 months ago.
immunooncologycrisprgenomicvariationvariantdetectiongeneticvariabilitydatarepresentationvisualizationsequencing
9.5 match 5.51 score 32 scriptsjohnmackintosh
cusumcharter:Easier CUSUM Control Charts
Create CUSUM (cumulative sum) statistics from a vector or dataframe. Also create single or faceted CUSUM control charts, with or without control limits. Accepts vector, dataframe, tibble or data.table inputs.
Maintained by John MacKintosh. Last updated 4 months ago.
cusumggplot2health-informaticshealthcarequality-improvementrdatatablestatistical-process-control
10.1 match 27 stars 5.13 score 9 scriptsnhejazi
txshift:Efficient Estimation of the Causal Effects of Stochastic Interventions
Efficient estimation of the population-level causal effects of stochastic interventions on a continuous-valued exposure. Both one-step and targeted minimum loss estimators are implemented for the counterfactual mean value of an outcome of interest under an additive modified treatment policy, a stochastic intervention that may depend on the natural value of the exposure. To accommodate settings with outcome-dependent two-phase sampling, procedures incorporating inverse probability of censoring weighting are provided to facilitate the construction of inefficient and efficient one-step and targeted minimum loss estimators. The causal parameter and its estimation were first described by Díaz and van der Laan (2013) <doi:10.1111/j.1541-0420.2011.01685.x>, while the multiply robust estimation procedure and its application to data from two-phase sampling designs is detailed in NS Hejazi, MJ van der Laan, HE Janes, PB Gilbert, and DC Benkeser (2020) <doi:10.1111/biom.13375>. The software package implementation is described in NS Hejazi and DC Benkeser (2020) <doi:10.21105/joss.02447>. Estimation of nuisance parameters may be enhanced through the Super Learner ensemble model in 'sl3', available for download from GitHub using 'remotes::install_github("tlverse/sl3")'.
Maintained by Nima Hejazi. Last updated 6 months ago.
causal-effectscausal-inferencecensored-datamachine-learningrobust-statisticsstatisticsstochastic-interventionsstochastic-treatment-regimestargeted-learningtreatment-effectsvariable-importance
9.9 match 14 stars 5.12 score 19 scriptsbioc
rtracklayer:R interface to genome annotation files and the UCSC genome browser
Extensible framework for interacting with multiple genome browsers (currently UCSC built-in) and manipulating annotation tracks in various formats (currently GFF, BED, bedGraph, BED15, WIG, BigWig and 2bit built-in). The user may export/import tracks to/from the supported browsers, as well as query and modify the browser state, such as the current viewport.
Maintained by Michael Lawrence. Last updated 7 days ago.
annotationvisualizationdataimportzlibopensslcurl
4.0 match 12.66 score 6.7k scripts 481 dependentstjmahr
notestar:Notebooks Using 'Targets' and 'Bookdown'
'Targets' is an R package for dependency and build management in data analysis projects. This package provides a set of targets and project infrastructure to create 'bookdown'-based notebooks using 'targets'.
Maintained by Tristan Mahr. Last updated 2 months ago.
bookdownknitrpandocrmarkdowntargets
15.9 match 30 stars 3.18 score 7 scriptsssnn-airr
shazam:Immunoglobulin Somatic Hypermutation Analysis
Provides a computational framework for analyzing mutations in immunoglobulin (Ig) sequences. Includes methods for Bayesian estimation of antigen-driven selection pressure, mutational load quantification, building of somatic hypermutation (SHM) models, and model-dependent distance calculations. Also includes empirically derived models of SHM for both mice and humans. Citations: Gupta and Vander Heiden, et al (2015) <doi:10.1093/bioinformatics/btv359>, Yaari, et al (2012) <doi:10.1093/nar/gks457>, Yaari, et al (2013) <doi:10.3389/fimmu.2013.00358>, Cui, et al (2016) <doi:10.4049/jimmunol.1502263>.
Maintained by Susanna Marquez. Last updated 2 months ago.
6.8 match 7.43 score 222 scripts 2 dependentsbioc
circRNAprofiler:circRNAprofiler: An R-Based Computational Framework for the Downstream Analysis of Circular RNAs
R-based computational framework for a comprehensive in silico analysis of circRNAs. This computational framework allows users to combine and analyze circRNAs previously detected by multiple publicly available annotation-based circRNA detection tools. It covers different aspects of circRNA analysis, from differential expression analysis, evolutionary conservation, and biogenesis to functional analysis.
Maintained by Simona Aufiero. Last updated 5 months ago.
annotationstructuralpredictionfunctionalpredictiongenepredictiongenomeassemblydifferentialexpression
8.7 match 10 stars 5.78 score 5 scriptsbioc
PureCN:Copy number calling and SNV classification using targeted short read sequencing
This package estimates tumor purity, copy number, and loss of heterozygosity (LOH), and classifies single nucleotide variants (SNVs) by somatic status and clonality. PureCN is designed for targeted short read sequencing data, integrates well with standard somatic variant detection and copy number pipelines, and has support for tumor samples without matching normal samples.
Maintained by Markus Riester. Last updated 2 months ago.
copynumbervariationsoftwaresequencingvariantannotationvariantdetectioncoverageimmunooncologybioconductor-packagecell-free-dnacopy-numberlohtumor-heterogeneitytumor-mutational-burdentumor-purity
5.1 match 132 stars 9.72 score 40 scriptspsychelzh
tarflow.iquizoo:Setup "targets" Workflows for "iquizoo" Data Processing
For "iquizoo" data processing, there is already a package called "preproc.iquizoo", but using it ultimately relies on a workflow. This package is used to build such workflows based on tools provided by the "targets" package, which mimics the logic of "make" to automate the build process.
Maintained by Liang Zhang. Last updated 5 months ago.
13.9 match 10 stars 3.60 score 1 scriptsbioc
crisprBowtie:Bowtie-based alignment of CRISPR gRNA spacer sequences
Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bowtie. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Both DNA- and RNA-targeting nucleases are supported.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsalignmentalignerbioconductorbioconductor-packagebowtiecrispr-analysiscrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequencessgrnasgrna-design
8.4 match 3 stars 5.86 score 7 scripts 4 dependentsropensci
git2r:Provides Access to Git Repositories
Interface to the 'libgit2' library, which is a pure C implementation of the 'Git' core methods. Provides access to 'Git' repositories to extract data and run some basic 'Git' commands.
Maintained by Stefan Widgren. Last updated 10 days ago.
gitgit-clientlibgit2libgit2-library
3.5 match 218 stars 13.86 score 836 scripts 49 dependentsdarwin-eu
IncidencePrevalence:Estimate Incidence and Prevalence using the OMOP Common Data Model
Calculate incidence and prevalence using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. Incidence and prevalence can be estimated for the total population in a database or for a stratification cohort.
Maintained by Edward Burn. Last updated 5 days ago.
6.1 match 9 stars 7.96 score 102 scripts 1 dependentsopenbiox
UCSCXenaShiny:Interactive Analysis of UCSC Xena Data
Provides functions and a Shiny application for downloading, analyzing and visualizing datasets from UCSC Xena (<http://xena.ucsc.edu/>), which is a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others.
Maintained by Shixiang Wang. Last updated 4 months ago.
cancer-datasetshiny-appsucsc-xena
5.7 match 96 stars 8.54 score 35 scriptsjarretrt
tci:Target Controlled Infusion (TCI)
Implementation of target-controlled infusion algorithms for compartmental pharmacokinetic and pharmacokinetic-pharmacodynamic models. Jacobs (1990) <doi:10.1109/10.43622>; Marsh et al. (1991) <doi:10.1093/bja/67.1.41>; Shafer and Gregg (1993) <doi:10.1007/BF01070999>; Schnider et al. (1998) <doi:10.1097/00000542-199805000-00006>; Abuhelwa, Foster, and Upton (2015) <doi:10.1016/j.vascn.2015.03.004>; Eleveld et al. (2018) <doi:10.1016/j.bja.2018.01.018>.
Maintained by Ryan Jarrett. Last updated 2 years ago.
13.7 match 6 stars 3.48 score 8 scriptsbhklab
mRMRe:Parallelized Minimum Redundancy, Maximum Relevance (mRMR)
Computes mutual information matrices from continuous, categorical and survival variables, as well as feature selection with minimum redundancy, maximum relevance (mRMR) and a new ensemble mRMR technique. Published in De Jay et al. (2013) <doi:10.1093/bioinformatics/btt383>.
Maintained by Benjamin Haibe-Kains. Last updated 4 years ago.
5.3 match 19 stars 8.95 score 105 scripts 2 dependentsjucheng1992
ctmle:Collaborative Targeted Maximum Likelihood Estimation
Implements the general template for collaborative targeted maximum likelihood estimation. It also provides several commonly used C-TMLE instantiations, like the vanilla/scalable variable-selection C-TMLE (Ju et al. (2017) <doi:10.1177/0962280217729845>) and the glmnet-C-TMLE algorithm (Ju et al. (2017) <arXiv:1706.10029>).
Maintained by Cheng Ju. Last updated 5 years ago.
causal-inferencemachine-learningstatisticstmle
9.8 match 5 stars 4.83 score 27 scriptsmommy003
MSML:Model Selection Based on Machine Learning (ML)
Model evaluation based on a modified version of the recursive feature elimination algorithm. This package is designed to determine the optimal model(s) by leveraging all available features.
Maintained by Moksedul Momin. Last updated 9 months ago.
13.6 match 3.48 score 1 scriptstlverse
tmle3mopttx:Targeted Maximum Likelihood Estimation of the Mean under Optimal Individualized Treatment
This package estimates the optimal individualized treatment rule for the categorical treatment using Super Learner (sl3). In order to avoid nested cross-validation, it uses split-specific estimates of Q and g to estimate the rule as described by Coyle et al. In addition, it provides the Targeted Maximum Likelihood estimates of the mean performance using CV-TMLE under such estimated rules. This is an adapter package for use with the tmle3 framework and the tlverse software ecosystem for Targeted Learning.
Maintained by Ivana Malenica. Last updated 3 years ago.
categorical-treatmentcausal-inferenceheterogeneous-effectsmachine-learningoptimal-individualized-treatmenttargeted-learningvariable-importance
11.1 match 12 stars 4.25 score 49 scripts 1 dependentsropensci
tidyqpcr:Quantitative PCR Analysis with the Tidyverse
For reproducible quantitative PCR (qPCR) analysis building on packages from the 'tidyverse', notably 'dplyr' and 'ggplot2'. It normalizes (by ddCq), summarizes, and plots pre-calculated Cq data, and plots raw amplification and melt curves from Roche Lightcycler (tm) machines. It does NOT (yet) calculate Cq data from amplification curves.
Maintained by Edward Wallace. Last updated 11 months ago.
miqeqpcrqpcr-analysistidyverse
8.3 match 54 stars 5.64 score 20 scriptsgeco-bern
rsofun:The P-Model and BiomeE Modelling Framework
Implements the Simulating Optimal FUNctioning framework for site-scale simulations of ecosystem processes, including model calibration. It contains 'Fortran 90' modules for the P-model (Stocker et al. (2020) <doi:10.5194/gmd-13-1545-2020>), SPLASH (Davis et al. (2017) <doi:10.5194/gmd-10-689-2017>) and BiomeE (Weng et al. (2015) <doi:10.5194/bg-12-2655-2015>).
Maintained by Benjamin Stocker. Last updated 12 days ago.
dgvmgrowthmodelingp-modelsimulationvegetation-dynamicsfortran
5.3 match 26 stars 8.77 score 119 scriptshanjunwei-lab
DTSEA:Drug Target Set Enrichment Analysis
It is a novel tool used to identify the candidate drugs against a particular disease based on the drug target set enrichment analysis. It assumes the most effective drugs are those with a closer affinity in the protein-protein interaction network to the specified disease. (See Gómez-Carballa et al. (2022) <doi: 10.1016/j.envres.2022.112890> and Feng et al. (2022) <doi: 10.7150/ijms.67815> for disease expression profiles; see Wishart et al. (2018) <doi: 10.1093/nar/gkx1037> and Gaulton et al. (2017) <doi: 10.1093/nar/gkw1074> for drug target information; see Kanehisa et al. (2021) <doi: 10.1093/nar/gkaa970> for the details of KEGG database.)
Maintained by Junwei Han. Last updated 2 years ago.
10.7 match 4.32 score 42 scriptsbioc
memes:motif matching, comparison, and de novo discovery using the MEME Suite
A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.
Maintained by Spencer Nystrom. Last updated 5 months ago.
dataimportfunctionalgenomicsgeneregulationmotifannotationmotifdiscoverysequencematchingsoftware
5.3 match 49 stars 8.68 score 117 scripts 1 dependentsstevenmmortimer
rdfp:An Implementation of the 'DoubleClick for Publishers' API
Functions to interact with the 'Google DoubleClick for Publishers (DFP)' API <https://developers.google.com/ad-manager/api/start> (recently renamed to 'Google Ad Manager'). This package is automatically compiled from the API WSDL (Web Service Description Language) files to dictate how the API is structured. Theoretically, all API actions are possible using this package; however, care must be taken to format the inputs correctly and parse the outputs correctly. Please see the 'Google Ad Manager' API reference <https://developers.google.com/ad-manager/api/rel_notes> and this package's website <https://stevenmmortimer.github.io/rdfp/> for more information, documentation, and examples.
Maintained by Steven M. Mortimer. Last updated 6 years ago.
api-clientapi-wrapperdfpdfp-apidoubleclickdoubleclick-for-publishersgoogle-dfp
6.5 match 16 stars 6.93 score 214 scriptsuscbiostats
partition:Agglomerative Partitioning Framework for Dimension Reduction
A fast and flexible framework for agglomerative partitioning. 'partition' uses an approach called Direct-Measure-Reduce to create new variables that maintain the user-specified minimum level of information. Each reduced variable is also interpretable: the original variables map to one and only one variable in the reduced data set. 'partition' is flexible, as well: how variables are selected to reduce, how information loss is measured, and the way data is reduced can all be customized. 'partition' is based on the Partition framework discussed in Millstein et al. (2020) <doi:10.1093/bioinformatics/btz661>.
Maintained by Malcolm Barrett. Last updated 4 months ago.
data-reductiondimensionality-reductionpartitional-clusteringopenblascpp
5.8 match 36 stars 7.72 score 27 scripts 1 dependentsmartin3141
spant:MR Spectroscopy Analysis Tools
Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.
Maintained by Martin Wilson. Last updated 29 days ago.
brainmrimrsmrshubspectroscopyfortran
5.2 match 24 stars 8.55 score 81 scriptsbioc
miRNAtap:miRNAtap: microRNA Targets - Aggregated Predictions
The package facilitates implementation of workflows requiring miRNA predictions. It allows users to integrate ranked miRNA target predictions from multiple sources available online and aggregate them with various methods, which improves the quality of predictions above that of any single source. Currently predictions are available for Homo sapiens, Mus musculus and Rattus norvegicus (the last one through homology translation).
Maintained by T. Ian Simpson. Last updated 5 months ago.
softwareclassificationmicroarraysequencingmirna
8.8 match 4.94 score 44 scriptsyingjie4science
SDGdetector:Detect SDGs and Targets in Text
Identify 17 Sustainable Development Goals and associated 169 targets in text.
Maintained by Yingjie Li. Last updated 6 months ago.
sdgsdgssustainabilitysustainable-development-goalstext-mining
10.5 match 14 stars 4.15 score 10 scriptsbioc
maftools:Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large-scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.
Maintained by Anand Mayakonda. Last updated 5 months ago.
datarepresentationdnaseqvisualizationdrivermutationvariantannotationfeatureextractionclassificationsomaticmutationsequencingfunctionalgenomicssurvivalbioinformaticscancer-genome-atlascancer-genomicsgenomicsmaf-filestcgacurlbzip2xz-utilszlib
3.0 match 459 stars 14.63 score 948 scripts 18 dependentschris-prener
areal:Areal Weighted Interpolation
A pipeable, transparent implementation of areal weighted interpolation with support for interpolating multiple variables in a single function call. These tools provide a full-featured workflow for validation and estimation that fits into both modern data management (e.g. tidyverse) and spatial data (e.g. sf) frameworks.
Maintained by Christopher Prener. Last updated 3 years ago.
4.9 match 93 stars 8.88 score 106 scripts 4 dependentsepiverse-trace
epidemics:Composable Epidemic Scenario Modelling
A library of compartmental epidemic models taken from the published literature, and classes to represent affected populations, public health response measures including non-pharmaceutical interventions on social contacts, non-pharmaceutical and pharmaceutical interventions that affect disease transmissibility, vaccination regimes, and disease seasonality, which can be combined to compose epidemic scenario models.
Maintained by Rosalind Eggo. Last updated 9 months ago.
decision-supportepidemic-modellingepidemic-simulationsepidemiologyepiverseinfectious-disease-dynamicsmodel-librarynon-pharmaceutical-interventionsrcpprcppeigenscenario-analysisvaccinationcpp
5.7 match 9 stars 7.48 score 59 scriptsusepa
tcpl:ToxCast Data Analysis Pipeline
The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.
Maintained by Jason Brown. Last updated 1 day ago.
4.5 match 36 stars 9.41 score 90 scriptsprioriactions
prioriactions:Multi-Action Conservation Planning
Uses a mixed integer mathematical programming (MIP) approach for building and solving multi-action planning problems, where the goal is to find an optimal combination of management actions that abate threats efficiently while accounting for spatial aspects, thereby optimizing the connectivity and conservation effectiveness of the prioritized units and of the deployed actions. The package can handle different commercial (gurobi, CPLEX) and non-commercial (symphony, CBC) MIP solvers. The Gurobi optimization solver can be installed using the comprehensive instructions in the 'gurobi' installation vignette of the prioritizr package (available at <https://prioritizr.net/articles/gurobi_installation_guide.html>). The 'CPLEX' optimization solver can be obtained from the IBM CPLEX web page (available at <https://www.ibm.com/es-es/products/ilog-cplex-optimization-studio>). Additionally, the 'rcbc' R package (available at <https://github.com/dirkschumacher/rcbc>) can be used to obtain solutions using the CBC optimization software (<https://github.com/coin-or/Cbc>). The methods used in the package refer to Salgado-Rojas et al. (2020) <doi:10.1016/j.ecolmodel.2019.108901>, Beyer et al. (2016) <doi:10.1016/j.ecolmodel.2016.02.005>, Cattarino et al. (2015) <doi:10.1371/journal.pone.0128027> and Watts et al. (2009) <doi:10.1016/j.envsoft.2009.06.005>. See the prioriactions website for more information, documentation and examples.
Maintained by Jose Salgado-Rojas. Last updated 2 years ago.
conservationconservation-planoptimizationprioritizationthreatscpp
7.8 match 10 stars 5.40 score 6 scriptsscmethods
scregclust:Reconstructing the Regulatory Programs of Target Genes in scRNA-Seq Data
Implementation of the scregclust algorithm described in Larsson, Held, et al. (2024) <doi:10.1038/s41467-024-53954-3> which reconstructs regulatory programs of target genes in scRNA-seq data. Target genes are clustered into modules and each module is associated with a linear model describing the regulatory program.
Maintained by Felix Held. Last updated 2 months ago.
clusteringregulatory-programsscrna-seq-analysiscppopenmp
6.5 match 9 stars 6.45 score 21 scriptsjohnaponte
repana:Repeatable Analysis in R
A set of utilities to facilitate the reproduction of analyses in R. It provides make_structure() and clean_structure(), and runs and logs programs in a predefined order so that secondary files, analyses and reports are constructed in an ordered and reproducible form.
Maintained by John J. Aponte. Last updated 21 days ago.
7.0 match 5 stars 5.98 score 19 scriptsbioc
GSCA:GSCA: Gene Set Context Analysis
GSCA takes as input several lists of activated and repressed genes. GSCA then searches through a compendium of publicly available gene expression profiles for biological contexts that are enriched with a specified pattern of gene expression. GSCA provides both traditional R functions and an interactive, user-friendly interface.
Maintained by Zhicheng Ji. Last updated 5 months ago.
geneexpressionvisualizationgui
8.2 match 5.00 score 5 scriptscibiobcg
EthSEQ:Ethnicity Annotation from Whole-Exome and Targeted Sequencing Data
Reliable and rapid ethnicity annotation from whole exome and targeted sequencing data.
Maintained by Alessandro Romanel. Last updated 2 years ago.
ethnicity-analysisexome-sequencinggenotype-datacpp
7.9 match 15 stars 5.18 score 10 scriptscran
sna:Tools for Social Network Analysis
A range of tools for social network analysis, including node and graph-level indices, structural distance and covariance methods, structural equivalence detection, network regression, random graph generation, and 2D/3D network visualization.
Maintained by Carter T. Butts. Last updated 6 months ago.
6.0 match 8 stars 6.78 score 94 dependentsbioc
TargetDecoy:Diagnostic Plots to Evaluate the Target Decoy Approach
A first step in the data analysis of Mass Spectrometry (MS) based proteomics data is to identify peptides and proteins. In this respect, the huge number of experimental mass spectra typically has to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each of the observed spectra to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is also referred to as the peptide to spectrum match (PSM). It is of course crucial for the downstream analysis to evaluate the quality of these matches. Therefore, False Discovery Rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA method matches the spectra to a database of real (targets) and nonsense peptides (decoys). A popular approach to generate these decoys is to reverse the target database. Hence, all the PSMs that match to a decoy are known to be bad hits and the distribution of their scores is used to estimate the distribution of the bad scoring target PSMs. A crucial assumption of the TDA is that the decoy PSM hits have similar properties to bad target hits so that the decoy PSM scores are a good simulation of the target PSM scores. Users, however, typically do not evaluate these assumptions. To this end we developed TargetDecoy to generate diagnostic plots to evaluate the quality of the target decoy method.
Maintained by Elke Debrie. Last updated 5 months ago.
massspectrometryproteomicsqualitycontrolsoftwarevisualizationbioconductormass-spectrometry
8.8 match 1 stars 4.60 score 9 scriptsbioc
MSstatsPTM:Statistical Characterization of Post-translational Modifications
MSstatsPTM provides general statistical methods for quantitative characterization of post-translational modifications (PTMs). Supports DDA, DIA, SRM, and tandem mass tag (TMT) labeling. Typically, the analysis involves the quantification of PTM sites (i.e., modified residues) and their corresponding proteins, as well as the integration of the quantification results. MSstatsPTM provides functions for summarization, estimation of PTM site abundance, and detection of changes in PTMs across experimental conditions.
Maintained by Devon Kohler. Last updated 4 months ago.
immunooncologymassspectrometryproteomicssoftwaredifferentialexpressiononechanneltwochannelnormalizationqualitycontrolpost-translational-modificationcpp
5.0 match 10 stars 7.98 score 36 scripts 2 dependentssinhrks
ggfortify:Data Visualization Tools for Statistical Analysis Results
Unified plotting tools for statistics commonly used, such as GLM, time series, PCA families, clustering and survival analysis. The package offers a single plotting interface for these analysis results and plots in a unified style using 'ggplot2'.
Maintained by Yuan Tang. Last updated 9 months ago.
2.8 match 529 stars 14.49 score 9.1k scripts 22 dependentsbioc
TargetScore:TargetScore: Infer microRNA targets using microRNA-overexpression data and sequence information
Infer the posterior distributions of microRNA targets by probabilistically modelling the likelihood of microRNA-overexpression fold-changes and sequence-based scores. A Variational Bayesian Gaussian mixture model (VB-GMM) is applied to log fold-changes and sequence scores to obtain the posteriors of the latent variable being the miRNA targets. The final targetScore is computed as the sigmoid-transformed fold-change weighted by the averaged posteriors of target components over all of the features.
Maintained by Yue Li. Last updated 5 months ago.
9.8 match 4.00 score 9 scriptsalarm-redist
redist:Simulation Methods for Legislative Redistricting
Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.
Maintained by Christopher T. Kenny. Last updated 2 months ago.
geospatialgerrymanderingredistrictingsamplingopenblascppopenmp
4.3 match 68 stars 9.17 score 259 scriptsohdsi
OhdsiReportGenerator:Observational Health Data Sciences and Informatics Report Generator
Extract results into R from the Observational Health Data Sciences and Informatics result database (see <https://ohdsi.github.io/Strategus/results-schema/index.html>) and generate reports/presentations via 'quarto' that summarize results in HTML format. Learn more about 'OhdsiReportGenerator' at <https://ohdsi.github.io/OhdsiReportGenerator/>.
Maintained by Jenna Reps. Last updated 17 days ago.
8.6 match 4.54 score 2 scriptsaphalo
ggspectra:Extensions to 'ggplot2' for Radiation Spectra
Additional annotations, stats, geoms and scales for plotting "light" spectra with 'ggplot2', together with specializations of ggplot() and autoplot() methods for spectral data and waveband definitions stored in objects of classes defined in package 'photobiology'. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.
Maintained by Pedro J. Aphalo. Last updated 15 hours ago.
datavizggplot2-autoplotggplot2-enhancementesggplot2-geomsggplot2-scalesggplot2-statslightr4photobiology-suiteradiationspectra
4.8 match 5 stars 8.09 score 390 scripts 1 dependentskwstat
agridat:Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 26 days ago.
3.5 match 125 stars 11.02 score 1.7k scripts 2 dependentsbioc
crisprVerse:Easily install and load the crisprVerse ecosystem for CRISPR gRNA design
The crisprVerse is a modular ecosystem of R packages developed for the design and manipulation of CRISPR guide RNAs (gRNAs). All packages share a common language and design principles. This package is designed to make it easy to install and load the crisprVerse packages in a single step. To learn more about the crisprVerse, visit <https://www.github.com/crisprVerse>.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsgenetargetcrispr-analysiscrispr-designcrispr-targetgrnagrna-sequencegrna-sequences
7.5 match 13 stars 5.11 score 8 scriptscenterforstatistics-ugent
xnet:Two-Step Kernel Ridge Regression for Network Predictions
Fit a two-step kernel ridge regression model for predicting edges in networks, and carry out cross-validation using shortcuts for swift and accurate performance assessment (Stock et al, 2018 <doi:10.1093/bib/bby095> ).
Maintained by Joris Meys. Last updated 4 years ago.
7.2 match 11 stars 5.30 score 12 scriptsnbarrowman
vtree:Display Information About Nested Subsets of a Data Frame
A tool for calculating and drawing "variable trees". Variable trees display information about nested subsets of a data frame.
Maintained by Nick Barrowman. Last updated 1 year ago.
data-sciencedata-visualizationexploratory-data-analysisstatistics
5.4 match 76 stars 7.09 score 65 scriptsosofr
simcausal:Simulating Longitudinal Data with Causal Inference Applications
A flexible tool for simulating complex longitudinal data using structural equations, with emphasis on problems in causal inference. Specify interventions and simulate from intervened data generating distributions. Define and evaluate treatment-specific means, the average treatment effects and coefficients from working marginal structural models. User interface designed to facilitate the conduct of transparent and reproducible simulation studies, and allows concise expression of complex functional dependencies for a large number of time-varying nodes. See the package vignette for more information, documentation and examples.
Maintained by Oleg Sofrygin. Last updated 8 months ago.
counterfactual-datasemsimulated-networksimulating-datastructural-equations
6.3 match 67 stars 6.06 score 170 scriptsdkyleward
ipfr:List Balancing for Reweighting and Population Synthesis
Performs iterative proportional updating given a seed table and an arbitrary number of marginal distributions. This is commonly used in population synthesis, survey raking, matrix rebalancing, and other applications. For example, a household survey may be weighted to match the known distribution of households by size from the census. An origin/destination trip matrix might be balanced to match traffic counts. The approach used by this package is based on a paper from Arizona State University (Ye, Xin, et al. (2009) <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.537.723&rep=rep1&type=pdf>). Some enhancements have been made to their work including primary and secondary target balance/importance, general marginal agreement, and weight restriction.
Maintained by Kyle Ward. Last updated 5 years ago.
7.5 match 5 stars 5.06 score 23 scriptsrstudio
pointblank:Data Validation and Organization of Metadata for Local and Remote Tables
Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
Maintained by Richard Iannone. Last updated 8 days ago.
data-assertionsdata-checkerdata-dictionariesdata-framesdata-inferencedata-managementdata-profilerdata-qualitydata-validationdata-verificationdatabase-tableseasy-to-understandreporting-toolschema-validationtesting-toolsyaml-configuration
3.5 match 932 stars 10.59 score 284 scriptsbioc
MIRit:Integrate microRNA and gene expression to decipher pathway complexity
MIRit is an R package that provides several methods for investigating the relationships between miRNAs and genes in different biological conditions. In particular, MIRit allows users to explore the functions of dysregulated miRNAs, and makes it possible to identify miRNA-gene regulatory axes that control biological pathways, thus enabling users to unveil the complexity of miRNA biology. MIRit is an all-in-one framework that aims to help researchers in all the central aspects of an integrative miRNA-mRNA analysis, from differential expression analysis to network characterization.
Maintained by Jacopo Ronchi. Last updated 18 hours ago.
softwaregeneregulationnetworkenrichmentnetworkinferenceepigeneticsfunctionalgenomicssystemsbiologynetworkpathwaysgeneexpressiondifferentialexpressionmirnamirna-mrna-interactionmirna-seqmirnaseq-analysiscpp
9.3 match 4.00 score 2 scriptsusaid-oha-si
grabr:OHA/SI APIs Package
Provides a series of base functions useful to the GH OHA SI team. These functions extend the utility functions in glamr, focusing primarily on API utility functions.
Maintained by Aaron Chafetz. Last updated 6 months ago.
7.2 match 1 stars 5.14 score 69 scriptsr-spatial
RSAGA:SAGA Geoprocessing and Terrain Analysis
Provides access to geocomputing and terrain analysis functions of the geographical information system (GIS) 'SAGA' (System for Automated Geoscientific Analyses) from within R by running the command line version of SAGA. This package furthermore provides several R functions for handling ASCII grids, including a flexible framework for applying local functions (including predict methods of fitted models) and focal functions to multiple grids. SAGA GIS is available under GPL-2 / LGPL-2 licences from <https://sourceforge.net/projects/saga-gis/>.
Maintained by Alexander Brenning. Last updated 1 month ago.
4.1 match 23 stars 8.72 score 275 scriptsbioc
DNAfusion:Identification of gene fusions using paired-end sequencing
DNAfusion can identify gene fusions such as EML4-ALK based on paired-end sequencing results. This package was developed using position deduplicated BAM files generated with the AVENIO Oncology Analysis Software. These files are made using the AVENIO ctDNA surveillance kit and Illumina Nextseq 500 sequencing. This is a targeted hybridization NGS approach and includes ALK-specific but not EML4-specific probes.
Maintained by Christoffer Trier Maansson. Last updated 5 months ago.
targetedresequencinggeneticsgenefusiondetectionsequencingbioconductor-packagecirculating-tumor-dnagene-fusionliquid-biopsynext-generation-sequencingtargeted-sequencingvariant-calling
8.0 match 3 stars 4.48 score 10 scriptsbioc
oppti:Outlier Protein and Phosphosite Target Identifier
The aim of oppti is to analyze protein (and phosphosite) expressions to find outlying markers for each sample in the given cohort(s) for the discovery of personalized actionable targets.
Maintained by Abdulkadir Elmas. Last updated 5 months ago.
proteomicsregressiondifferentialexpressionbiomedicalinformaticsgenetargetgeneexpressionnetwork
8.3 match 2 stars 4.30 score 2 scriptsvirgile-baudrot
morse:Modelling Reproduction and Survival Data in Ecotoxicology
Advanced methods for a valuable quantitative environmental risk assessment using Bayesian inference of survival and reproduction Data. Among others, it facilitates Bayesian inference of the general unified threshold model of survival (GUTS). See our companion paper Baudrot and Charles (2021) <doi:10.21105/joss.03200>, as well as complementary details in Baudrot et al. (2018) <doi:10.1021/acs.est.7b05464> and Delignette-Muller et al. (2017) <doi:10.1021/acs.est.6b05326>.
Maintained by Virgile Baudrot. Last updated 6 months ago.
11.0 match 3.26 score 60 scriptsmunterfi
eRTG3D:Empirically Informed Random Trajectory Generation in 3-D
Creates realistic random trajectories in a 3-D space between two given fix points, so-called conditional empirical random walks (CERWs). The trajectory generation is based on empirical distribution functions extracted from observed trajectories (training data) and thus reflects the geometrical movement characteristics of the mover. A digital elevation model (DEM), representing the Earth's surface, and a background layer of probabilities (e.g. food sources, uplift potential, waterbodies, etc.) can be used to influence the trajectories. Unterfinger M (2018). "3-D Trajectory Simulation in Movement Ecology: Conditional Empirical Random Walk". Master's thesis, University of Zurich. <https://www.geo.uzh.ch/dam/jcr:6194e41e-055c-4635-9807-53c5a54a3be7/MasterThesis_Unterfinger_2018.pdf>. Technitis G, Weibel R, Kranstauber B, Safi K (2016). "An algorithm for empirically informed random trajectory generation between two endpoints". GIScience 2016: Ninth International Conference on Geographic Information Science, 9, online. <doi:10.5167/uzh-130652>.
Maintained by Merlin Unterfinger. Last updated 3 years ago.
3dbirdsconditional-empirical-random-walkgliding-and-soaringmachine-learningmovement-ecologyrandom-trajectory-generatorrandom-walksimulationtrajectory-generation
6.3 match 6 stars 5.71 score 19 scriptsnhejazi
medoutcon:Efficient Natural and Interventional Causal Mediation Analysis
Efficient estimators of interventional (in)direct effects in the presence of mediator-outcome confounding affected by exposure. The effects estimated allow for the impact of the exposure on the outcome through a direct path to be disentangled from that through mediators, even in the presence of intermediate confounders that complicate such a relationship. Currently supported are non-parametric efficient one-step and targeted minimum loss estimators based on the formulation of Díaz, Hejazi, Rudolph, and van der Laan (2020) <doi:10.1093/biomet/asaa085>. Support for efficient estimation of the natural (in)direct effects is also provided, appropriate for settings in which intermediate confounders are absent. The package also supports estimation of these effects when the mediators are measured using outcome-dependent two-phase sampling designs (e.g., case-cohort).
Maintained by Nima Hejazi. Last updated 1 year ago.
causal-inferencecausal-machine-learninginverse-probability-weightsmachine-learningmediation-analysisstochastic-interventionstargeted-learningtreatment-effects
8.0 match 13 stars 4.46 score 22 scriptsbioc
crisprBwa:BWA-based alignment of CRISPR gRNA spacer sequences
Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bwa. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Currently not supported on Windows machines.
Maintained by Jean-Philippe Fortin. Last updated 5 months ago.
crisprfunctionalgenomicsalignmentalignerbioconductorbioconductor-packagebwacrispr-analysiscrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequencessgrnasgrna-design
8.3 match 1 stars 4.30 score 6 scriptsngreifer
optweight:Targeted Stable Balancing Weights Using Optimization
Use optimization to estimate weights that balance covariates for binary, multinomial, and continuous treatments in the spirit of Zubizarreta (2015) <doi:10.1080/01621459.2015.1023805>. The degree of balance can be specified for each covariate. In addition, sampling weights can be estimated that allow a sample to generalize to a population specified with given target moments of covariates.
Maintained by Noah Greifer. Last updated 2 years ago.
causal-inferenceinverse-probability-weightsobservational-studyoptimizationpropensity-scores
9.4 match 8 stars 3.78 score 15 scriptsbioc
spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions
The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.
Maintained by Jianhai Zhang. Last updated 4 months ago.
spatialvisualizationmicroarraysequencinggeneexpressiondatarepresentationnetworkclusteringgraphandnetworkcellbasedassaysatacseqdnaseqtissuemicroarraysinglecellcellbiologygenetarget
5.6 match 5 stars 6.26 score 12 scriptsbioc
PanomiR:Detection of miRNAs that regulate interacting groups of pathways
PanomiR is a package to detect miRNAs that target groups of pathways from gene expression data. This package provides functionality for generating pathway activity profiles, determining differentially activated pathways between user-specified conditions, determining clusters of pathways via the PCxN package, and generating miRNAs targeting clusters of pathways. These functions can be used separately or sequentially to analyze RNA-Seq data.
Maintained by Pourya Naderi. Last updated 5 months ago.
geneexpressiongenesetenrichmentgenetargetmirnapathways
7.2 match 3 stars 4.89 score 13 scriptsbioc
GeomxTools:NanoString GeoMx Tools
Tools for NanoString Technologies GeoMx Technology. Package provides functions for reading in DCC and PKC files based on an ExpressionSet derived object. Normalization and QC functions are also included.
Maintained by Maddy Griswold. Last updated 5 months ago.
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsmrnamicroarrayproprietaryplatformsrnaseqsequencingexperimentaldesignnormalizationspatial
4.9 match 7.11 score 239 scripts 3 dependentsnlmixr2
nlmixr2targets:Targets for 'nlmixr2' Pipelines
'nlmixr2' often has long runtimes. A pipeline toolkit tailored to 'nlmixr2' workflows leverages 'targets' and 'nlmixr2' to ease reproducible workflows. 'nlmixr2targets' ensures minimal rework in model development with 'nlmixr2' and 'targets' by simplifying and standardizing models and datasets.
Maintained by Bill Denney. Last updated 20 days ago.
10.9 match 3.18 score 6 scriptsmlopez-ibanez
irace:Iterated Racing for Automatic Algorithm Configuration
Iterated race is an extension of the Iterated F-race method for the automatic configuration of optimization algorithms, that is, (offline) tuning their parameters by finding the most appropriate settings given a set of instances of an optimization problem. M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, T. Stützle, and M. Birattari (2016) <doi:10.1016/j.orp.2016.09.002>.
Maintained by Manuel López-Ibáñez. Last updated 29 days ago.
algorithm-configurationhyperparameter-tuningiraceoptimization-algorithms
3.4 match 63 stars 10.28 score 103 scripts 1 dependentskgoldfeld
simstudy:Simulation of Study Data
Simulates data sets in order to explore modeling techniques or better understand data generating processes. The user specifies a set of relationships between covariates, and generates data based on these specifications. The final data sets can represent data from randomized control trials, repeated measure (longitudinal) designs, and cluster randomized trials. Missingness can be generated using various mechanisms (MCAR, MAR, NMAR).
Maintained by Keith Goldfeld. Last updated 8 months ago.
data-generationdata-simulationsimulationstatistical-modelscpp
3.1 match 82 stars 11.00 score 972 scripts 1 dependentsnatverse
nat:NeuroAnatomy Toolbox for Analysis of 3D Image Data
NeuroAnatomy Toolbox (nat) enables analysis and visualisation of 3D biological image data, especially traced neurons. Reads and writes 3D images in NRRD and 'Amira' AmiraMesh formats and reads surfaces in 'Amira' hxsurf format. Traced neurons can be imported from and written to SWC and 'Amira' LineSet and SkeletonGraph formats. These data can then be visualised in 3D via 'rgl', manipulated including applying calculated registrations, e.g. using the 'CMTK' registration suite, and analysed. There is also a simple representation for neurons that have been subjected to 3D skeletonisation but not formally traced; this allows morphological comparison between neurons including searches and clustering (via the 'nat.nblast' extension package).
Maintained by Gregory Jefferis. Last updated 5 months ago.
3dconnectomicsimage-analysisneuroanatomyneuroanatomy-toolboxneuronneuron-morphologyneurosciencevisualisation
3.4 match 67 stars 9.94 score 436 scripts 2 dependentsbioc
NTW:Predict gene network using an Ordinary Differential Equation (ODE) based method
This package predicts the gene-gene interaction network and identifies the direct transcriptional targets of the perturbation using an ODE (Ordinary Differential Equation) based method.
Maintained by Yuanhua Liu. Last updated 5 months ago.
9.0 match 3.78 score 1 scriptsbioc
sSeq:Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size
The purpose of this package is to discover the genes that are differentially expressed between two conditions in RNA-seq experiments. Gene expression is measured in counts of transcripts and modeled with the Negative Binomial (NB) distribution using a shrinkage approach for dispersion estimation. The method of moments (MM) estimates for dispersion are shrunk towards an estimated target, which minimizes the average squared difference between the shrinkage estimates and the initial estimates. The exact per-gene probability under the NB model is calculated, and used to test the hypothesis that the expected expression of a gene in two conditions identically follows an NB distribution.
Maintained by Danni Yu. Last updated 5 months ago.
6.8 match 4.98 score 4 scripts 2 dependentsinsightsengineering
teal:Exploratory Web Apps for Analyzing Clinical Trials Data
A 'shiny' based interactive exploration framework for analyzing clinical trials data. 'teal' currently provides a dynamic filtering facility and different data viewers. 'teal' 'shiny' applications are built using standard 'shiny' modules.
Maintained by Dawid Kaledkowski. Last updated 19 days ago.
clinical-trialsnestshinywebapp
2.7 match 197 stars 12.68 score 176 scripts 5 dependentsbioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), and MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 7 days ago.
networkinferencenetworkgeneregulationgeneexpressiontranscriptionmicroarraygraphandnetworkgene-regulatory-networktranscription-factors
4.2 match 105 stars 7.98 score
tlverse
tmle3mediate:Targeted Learning for Causal Mediation Analysis
Targeted maximum likelihood (TML) estimation of population-level causal effects in mediation analysis. The causal effects are defined by joint static or stochastic interventions applied to the exposure and the mediator. Targeted doubly robust estimators are provided for the classical natural direct and indirect effects, as well as the more recently developed population intervention direct and indirect effects.
Maintained by Nima Hejazi. Last updated 4 years ago.
causal-inferencecausal-mediation-analysismachine-learningmediation-analysisstochastic-interventionstargeted-learningtreatment-effects
11.3 match 6 stars 2.98 score 16 scripts
zarquon42b
Rvcg:Manipulations of Triangular Meshes Based on the 'VCGLIB' API
Operations on triangular meshes based on 'VCGLIB'. This package integrates nicely with the R-package 'rgl' to render the meshes processed by 'Rvcg'. The Visualization and Computer Graphics Library (VCG for short) is an open source portable C++ templated library for manipulation, processing and displaying with OpenGL of triangle and tetrahedral meshes. The library, composed of more than 100k lines of code, is released under the GPL license, and it is the base of most of the software tools of the Visual Computing Lab of the Italian National Research Council Institute ISTI <https://vcg.isti.cnr.it/>, like 'metro' and 'MeshLab'. The 'VCGLIB' source is pulled from trunk <https://github.com/cnr-isti-vclab/vcglib> and patched to work with options determined by the configure script as well as to work with the header files included by 'RcppEigen'.
Maintained by Stefan Schlager. Last updated 5 months ago.
3.4 match 25 stars 10.05 score 195 scripts 29 dependents
blasbenito
collinear:Automated Multicollinearity Management
Effortless multicollinearity management in data frames with both numeric and categorical variables for statistical and machine learning applications. The package simplifies multicollinearity analysis by combining four robust methods: 1) target encoding for categorical variables (Micci-Barreca, D. 2001 <doi:10.1145/507533.507538>); 2) automated feature prioritization to prevent key variable loss during filtering; 3) pairwise correlation for all variable combinations (numeric-numeric, numeric-categorical, categorical-categorical); and 4) fast computation of variance inflation factors.
Maintained by Blas M. Benito. Last updated 2 months ago.
machine-learningmulticollinearitystatistics
6.1 match 11 stars 5.51 score 15 scripts 1 dependents
hubverse-org
hubExamples:Example Hub Data
This package provides example data for forecasting and scenario modeling hubs in the hubverse format.
Maintained by Evan L Ray. Last updated 2 months ago.
6.2 match 1 stars 5.46 score 20 scripts 1 dependents
rstudio
DT:A Wrapper of the JavaScript Library 'DataTables'
Data objects in R can be rendered as HTML tables using the JavaScript library 'DataTables' (typically via R Markdown or Shiny). The 'DataTables' library has been included in this R package. The package name 'DT' is an abbreviation of 'DataTables'.
Maintained by Joe Cheng. Last updated 3 months ago.
datatableshtmlwidgetsjavascriptshiny
1.8 match 604 stars 19.15 score 38k scripts 673 dependents
sfilges
umiAnalyzer:Tools for Analyzing Sequencing Data with Unique Molecular Identifiers
Tools for analyzing sequencing data containing unique molecular identifiers generated by 'UMIErrorCorrect' (<https://github.com/stahlberggroup/umierrorcorrect>).
Maintained by Stefan Filges. Last updated 3 years ago.
targeted-sequencingunique-molecular-identifiersvariant-analysis
7.5 match 4.46 score 58 scripts
bioc
TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aim of TCGAbiolinks is to: i) facilitate GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses, and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 25 days ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
2.3 match 305 stars 14.45 score 1.6k scripts 6 dependents
bioc
rTRM:Identification of Transcriptional Regulatory Modules from Protein-Protein Interaction Networks
rTRM identifies transcriptional regulatory modules (TRMs) from protein-protein interaction networks.
Maintained by Diego Diez. Last updated 5 months ago.
transcriptionnetworkgeneregulationgraphandnetworkbioconductorbioinformatics
6.9 match 3 stars 4.86 score 3 scripts 1 dependents
bioc
epiregulon:Gene regulatory network inference from single cell epigenomic data
Gene regulatory networks model the underlying gene regulation hierarchies that drive gene expression and observed phenotypes. Epiregulon infers TF activity in single cells by constructing a gene regulatory network (regulons). This is achieved through integration of scATAC-seq and scRNA-seq data and incorporation of public bulk TF ChIP-seq data. Links between regulatory elements and their target genes are established by computing correlations between chromatin accessibility and gene expressions.
Maintained by Xiaosai Yao. Last updated 5 days ago.
singlecellgeneregulationnetworkinferencenetworkgeneexpressiontranscriptiongenetargetcpp
5.0 match 14 stars 6.67 score 17 scripts
bioc
aroma.light:Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Maintained by Henrik Bengtsson. Last updated 5 months ago.
infrastructuremicroarrayonechanneltwochannelmultichannelvisualizationpreprocessingbioconductor
5.1 match 1 stars 6.43 score 26 scripts 20 dependents