Showing 200 of total 387 results
bioc
YAPSA:Yet Another Package for Signature Analysis
This package provides functions and routines for supervised analyses of mutational signatures (i.e., the signatures have to be known, cf. L. Alexandrov et al., Nature 2013 and L. Alexandrov et al., bioRxiv 2018). In particular, the family of functions LCD (LCD = linear combination decomposition) can use optimal signature-specific cutoffs, which take care of the different detectability of the different signatures. Moreover, the package provides different sets of mutational signatures, including the COSMIC and PCAWG SNV signatures and the PCAWG Indel signatures; the latter implying that with YAPSA, the concept of supervised analysis of mutational signatures is extended to Indel signatures. YAPSA also provides confidence intervals as computed by profile likelihoods and can perform signature analysis on a stratified mutational catalogue (SMC = stratify mutational catalogue) in order to analyze enrichment and depletion patterns for the signatures in different strata.
Maintained by Zuguang Gu. Last updated 5 months ago.
sequencingdnaseqsomaticmutationvisualizationclusteringgenomicvariationstatisticalmethodbiologicalquestion
77.2 match 6.41 score 57 scripts
bioc
maftools:Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large-scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.
Maintained by Anand Mayakonda. Last updated 5 months ago.
datarepresentationdnaseqvisualizationdrivermutationvariantannotationfeatureextractionclassificationsomaticmutationsequencingfunctionalgenomicssurvivalbioinformaticscancer-genome-atlascancer-genomicsgenomicsmaf-filestcgacurlbzip2xz-utilszlib
25.0 match 459 stars 14.63 score 948 scripts 18 dependents
tidyverse
dplyr:A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 12 days ago.
14.3 match 4.8k stars 24.68 score 659k scripts 7.8k dependents
bioc
HiLDA:Conducting statistical inference on comparing the mutational exposures of mutational signatures by using hierarchical latent Dirichlet allocation
A package built under the Bayesian framework of applying hierarchical latent Dirichlet allocation. It statistically tests whether the mutational exposures of mutational signatures (Shiraishi-model signatures) are different between two groups. The package also provides inference and visualization.
Maintained by Zhi Yang. Last updated 5 months ago.
softwaresomaticmutationsequencingstatisticalmethodbayesianmutational-signaturesrjagssomatic-mutationscppjags
54.0 match 3 stars 5.56 score 7 scripts 1 dependent
jakobbossek
ecr:Evolutionary Computation in R
Framework for building evolutionary algorithms for both single- and multi-objective continuous or discrete optimization problems. A set of predefined evolutionary building blocks and operators is included. Moreover, the user can easily set up custom objective functions, operators, building blocks and representations sticking to few conventions. The package allows both a black-box approach for standard tasks (plug-and-play style) and a much more flexible white-box approach where the evolutionary cycle is written by hand.
Maintained by Jakob Bossek. Last updated 1 year ago.
combinatorial-optimizationevolutionary-algorithmevolutionary-algorithmsevolutionary-strategygenetic-algorithm-frameworkmetaheuristicsmulti-objective-optimizationoptimizationoptimization-frameworkcpp
25.3 match 43 stars 7.36 score 89 scripts 2 dependents
bioc
canceR:A Graphical User Interface for accessing and modeling the Cancer Genomics Data of MSKCC
The package is a user-friendly interface based on the cgdsr and other modeling packages to explore, compare, and analyse all available Cancer Data (Clinical data, Gene Mutation, Gene Methylation, Gene Expression, Protein Phosphorylation, Copy Number Alteration) hosted by the Computational Biology Center at Memorial-Sloan-Kettering Cancer Center (MSKCC).
Maintained by Karim Mezhoud. Last updated 5 months ago.
guigeneexpressionclusteringgogenesetenrichmentkeggmultiplecomparisoncancercancer-datagenegene-expressiongene-methylationgene-mutationgene-setsmethylationmskccmutationstcltk
34.8 match 7 stars 5.25 score 17 scripts
msq-123
CovidMutations:Mutation Analysis and Assay Validation Toolkit for COVID-19 (Coronavirus Disease 2019)
A feasible framework for mutation analysis and reverse transcription polymerase chain reaction (RT-PCR) assay evaluation of COVID-19, including mutation profile visualization, statistics and mutation ratio of each assay. The mutation ratio is conducive to evaluating the coverage of RT-PCR assays in large-sized samples <doi:10.20944/preprints202004.0529.v1>.
Maintained by Shaoqian Ma. Last updated 5 years ago.
41.2 match 4 stars 4.30 score 6 scripts
shixiangwang
sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations
Genomic alterations including single nucleotide substitution, copy number alteration, etc. are the major force for cancer initialization and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called 'signature' (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.
Maintained by Shixiang Wang. Last updated 5 months ago.
bayesian-nmfbioinformaticscancer-researchcnvcopynumber-signaturescosmic-signaturesdbseasy-to-useindelmutational-signaturesnmfnmf-extractionsbssignature-extractionsomatic-mutationssomatic-variantsvisualizationcpp
17.9 match 150 stars 9.48 score 123 scripts 2 dependents
hneth
riskyr:Rendering Risk Literacy more Transparent
Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.
Maintained by Hansjoerg Neth. Last updated 10 months ago.
2x2-matrixbayesian-inferencecontingency-tablerepresentationriskrisk-literacyvisualization
20.6 match 19 stars 7.36 score 80 scripts
bioc
Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data
The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classify these into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type.
This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.
Maintained by Matteo Tiberti. Last updated 2 months ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksurvivalgenesetenrichmentnetworkenrichment
22.7 match 5 stars 6.59 score 43 scripts
business-science
tidyquant:Tidy Quantitative Financial Analysis
Bringing business and financial analysis to the 'tidyverse'. The 'tidyquant' package provides a convenient wrapper to various 'xts', 'zoo', 'quantmod', 'TTR' and 'PerformanceAnalytics' package functions and returns the objects in the tidy 'tibble' format. The main advantage is being able to use quantitative functions with the 'tidyverse' functions including 'purrr', 'dplyr', 'tidyr', 'ggplot2', 'lubridate', etc. See the 'tidyquant' website for more information, documentation and examples.
Maintained by Matt Dancho. Last updated 1 month ago.
dplyrfinancial-analysisfinancial-datafinancial-statementsmultiple-stocksperformance-analysisperformanceanalyticsquantmodstockstock-exchangesstock-indexesstock-listsstock-performancestock-pricesstock-symboltidyversetime-seriestimeseriesxts
11.2 match 872 stars 13.34 score 5.2k scripts
mano-b
MicroSEC:Sequence Error Filter for Formalin-Fixed and Paraffin-Embedded Samples
Clinical sequencing of tumors is usually performed on formalin-fixed and paraffin-embedded samples, which have many sequencing errors. We found that the majority of these errors are detected in chimeric reads caused by single-strand DNA with micro-homology. Our filtering pipeline focuses on the uneven distribution of the artifacts in each read and removes such errors in formalin-fixed and paraffin-embedded samples without over-eliminating the true mutations detected in fresh frozen samples.
Maintained by Masachika Ikegami. Last updated 3 months ago.
25.4 match 7 stars 5.66 score 8 scripts
rozen-lab
cosmicsig:Mutational Signatures from COSMIC (Catalogue of Somatic Mutations in Cancer)
A data package with 2 main package variables: 'signature' and 'etiology'. The 'signature' variable contains the latest mutational signature profiles released on COSMIC <https://cancer.sanger.ac.uk/signatures/> for 3 mutation types: single base substitutions in the context of preceding and following bases, doublet base substitutions, and small insertions and deletions. The 'etiology' variable provides the known or hypothesized causes of signatures. 'cosmicsig' stands for COSMIC signatures. Please run ?'cosmicsig' for more information.
Maintained by Steven Rozen. Last updated 2 years ago.
41.3 match 1 stars 3.04 score 22 scripts
g3viz
g3viz:Interactively Visualize Genetic Mutation Data using a Lollipop-Diagram
Interface for 'g3-lollipop' 'JavaScript' library. Visualize genetic mutation data using an interactive lollipop diagram in 'RStudio' or your web browser.
Maintained by Xin Guo. Last updated 6 months ago.
bioinformaticsgenomics-visualizationlollipop-plotvariantsvisualize-mutation-data
21.7 match 31 stars 5.61 score 22 scripts
bioc
tidySummarizedExperiment:Brings SummarizedExperiment to the Tidyverse
The tidySummarizedExperiment package provides a set of tools for creating and manipulating tidy data representations of SummarizedExperiment objects. SummarizedExperiment is a widely used data structure in bioinformatics for storing high-throughput genomic data, such as gene expression or DNA sequencing data. The tidySummarizedExperiment package introduces a tidy framework for working with SummarizedExperiment objects. It allows users to convert their data into a tidy format, where each observation is a row and each variable is a column. This tidy representation simplifies data manipulation, integration with other tidyverse packages, and enables seamless integration with the broader ecosystem of tidy tools for data analysis.
Maintained by Stefano Mangiola. Last updated 5 months ago.
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomics
13.5 match 26 stars 8.44 score 196 scripts 1 dependent
bioc
sitePath:Phylogeny-based sequence clustering with site polymorphism
Using site polymorphism is one of the ways to cluster DNA/protein sequences but it is possible for the sequences with the same polymorphism on a single site to be genetically distant. This package is aimed at clustering sequences using site polymorphism and their corresponding phylogenetic trees. By considering their location on the tree, only the structurally adjacent sequences will be clustered. However, the adjacent sequences may not necessarily have the same polymorphism. So a branch-and-bound like algorithm is used to minimize the entropy representing the purity of site polymorphism of each cluster.
Maintained by Chengyang Ji. Last updated 5 months ago.
alignmentmultiplesequencealignmentphylogeneticssnpsoftwaremutationcpp
21.6 match 8 stars 5.20 score 9 scripts
bioc
decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting
Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.
Maintained by Rosario M. Piro. Last updated 5 months ago.
softwaresnpsequencingdnaseqgenomicvariationsomaticmutationbiomedicalinformaticsgeneticsbiologicalquestionstatisticalmethod
23.5 match 1 stars 4.78 score 10 scripts 1 dependent
bioc
musicatk:Mutational Signature Comprehensive Analysis Toolkit
Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic features associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.
Maintained by Joshua D. Campbell. Last updated 5 months ago.
softwarebiologicalquestionsomaticmutationvariantannotation
15.7 match 13 stars 7.02 score 20 scripts
rich-iannone
DiagrammeR:Graph/Network Visualization
Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allows for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.
Maintained by Richard Iannone. Last updated 2 months ago.
graphgraph-functionsnetwork-graphproperty-graphvisualization
7.1 match 1.7k stars 15.18 score 3.8k scripts 87 dependents
ssnn-airr
shazam:Immunoglobulin Somatic Hypermutation Analysis
Provides a computational framework for analyzing mutations in immunoglobulin (Ig) sequences. Includes methods for Bayesian estimation of antigen-driven selection pressure, mutational load quantification, building of somatic hypermutation (SHM) models, and model-dependent distance calculations. Also includes empirically derived models of SHM for both mice and humans. Citations: Gupta and Vander Heiden, et al (2015) <doi:10.1093/bioinformatics/btv359>, Yaari, et al (2012) <doi:10.1093/nar/gks457>, Yaari, et al (2013) <doi:10.3389/fimmu.2013.00358>, Cui, et al (2016) <doi:10.4049/jimmunol.1502263>.
Maintained by Susanna Marquez. Last updated 2 months ago.
14.5 match 7.43 score 222 scripts 2 dependents
magnusdv
pedmut:Mutation Models for Pedigree Likelihood Computations
A collection of functions for modelling mutations in pedigrees with marker data, as used e.g. in likelihood computations with microsatellite data. Implemented models include equal, proportional and stepwise models, as well as random models for experimental work, and custom models allowing the user to apply any valid mutation matrix. Allele lumping is done following the lumpability criteria of Kemeny and Snell (1976), ISBN:0387901922.
Maintained by Magnus Dehli Vigeland. Last updated 1 year ago.
21.6 match 2 stars 4.76 score 5 scripts 19 dependents
bioc
PureCN:Copy number calling and SNV classification using targeted short read sequencing
This package estimates tumor purity, copy number, and loss of heterozygosity (LOH), and classifies single nucleotide variants (SNVs) by somatic status and clonality. PureCN is designed for targeted short read sequencing data, integrates well with standard somatic variant detection and copy number pipelines, and has support for tumor samples without matching normal samples.
Maintained by Markus Riester. Last updated 2 months ago.
copynumbervariationsoftwaresequencingvariantannotationvariantdetectioncoverageimmunooncologybioconductor-packagecell-free-dnacopy-numberlohtumor-heterogeneitytumor-mutational-burdentumor-purity
10.5 match 132 stars 9.72 score 40 scripts
bioc
selectKSigs:Selecting the number of mutational signatures using a perplexity-based measure and cross-validation
A package to suggest the number of mutational signatures in a collection of somatic mutations by calculating the cross-validated perplexity score.
Maintained by Zhi Yang. Last updated 5 months ago.
softwaresomaticmutationsequencingstatisticalmethodclusteringmutational-signaturesrjagssomatic-mutationscppjags
24.7 match 3 stars 4.08 score 1 script
bioc
TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aim of TCGAbiolinks is to: i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 26 days ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
6.8 match 305 stars 14.45 score 1.6k scripts 6 dependents
hanjunwei-lab
ssMutPA:Single-Sample Mutation-Based Pathway Analysis
A systematic bioinformatics tool to perform single-sample mutation-based pathway analysis by integrating somatic mutation data with the Protein-Protein Interaction (PPI) network. In this method, we use local and global weighted strategies to evaluate the effects of network genes from mutations according to the network topology and then calculate the mutation-based pathway enrichment score (ssMutPES) to reflect the accumulated effect of mutations of each pathway. Subsequently, the ssMutPES profiles are used for unsupervised spectral clustering to identify cancer subtypes.
Maintained by Junwei Han. Last updated 5 months ago.
23.9 match 4.00 score 9 scripts
bioc
RaggedExperiment:Representation of Sparse Experiments and Assays Across Samples
This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.
Maintained by Marcel Ramos. Last updated 4 months ago.
infrastructuredatarepresentationcopynumbercore-packagedata-structuremutationsu24ca289073
10.5 match 4 stars 8.96 score 76 scripts 15 dependents
bioc
CaMutQC:An R Package for Comprehensive Filtration and Selection of Cancer Somatic Mutations
CaMutQC is able to filter false positive mutations generated due to technical issues, as well as to select candidate cancer mutations through a series of well-structured functions by labeling mutations with various flags. A detailed and vivid filter report is offered after completing a whole filtration or selection section. CaMutQC also integrates several methods and gene panels for Tumor Mutational Burden (TMB) estimation.
Maintained by Xin Wang. Last updated 5 months ago.
softwarequalitycontrolgenetargetcancer-genomicssomatic-mutations
15.8 match 7 stars 5.92 score 1 script
bioc
mitoClone2:Clonal Population Identification in Single-Cell RNA-Seq Data using Mitochondrial and Somatic Mutations
This package primarily identifies variants in mitochondrial genomes from BAM alignment files. It filters these variants to remove RNA editing events, then estimates their evolutionary relationship (i.e. their phylogenetic tree) and groups single cells into clones. It also visualizes the mutations and provides additional genomic context.
Maintained by Benjamin Story. Last updated 5 months ago.
annotationdataimportgeneticssnpsoftwaresinglecellalignmentcurlbzip2xz-utilszlibcpp
20.8 match 1 stars 4.48 score 9 scripts
cran
podcleaner:Legacy Scottish Post Office Directories Cleaner
Attempts to clean optical character recognition (OCR) errors in legacy Scottish Post Office Directories. Further attempts to match records from trades and general directories.
Maintained by Olivier Bautheac. Last updated 3 years ago.
54.0 match 1.70 score
sparklyr
sparklyr:R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 9 days ago.
apache-sparkdistributeddplyridelivymachine-learningremote-clusterssparksparklyr
6.0 match 959 stars 15.16 score 4.0k scripts 21 dependents
stemangiola
tidyseurat:Brings Seurat to the Tidyverse
It creates an invisible layer that allows viewing the 'Seurat' object as a tibble and interacting seamlessly with the tidyverse.
Maintained by Stefano Mangiola. Last updated 8 months ago.
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsdplyrggplot2pcapurrrsctseuratsingle-cellsingle-cell-rna-seqtibbletidyrtidyversetranscriptstsneumap
9.0 match 158 stars 9.66 score 398 scripts 1 dependent
hadley
plyr:Tools for Splitting, Applying and Combining Data
A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.
Maintained by Hadley Wickham. Last updated 4 months ago.
4.7 match 500 stars 18.16 score 83k scripts 3.3k dependents
bioc
tidySingleCellExperiment:Brings SingleCellExperiment to the Tidyverse
'tidySingleCellExperiment' is an adapter that abstracts the 'SingleCellExperiment' container in the form of a 'tibble'. This allows *tidy* data manipulation, nesting, and plotting. For example, a 'tidySingleCellExperiment' is directly compatible with functions from 'tidyverse' packages `dplyr` and `tidyr`, as well as plotting with `ggplot2` and `plotly`. In addition, the package provides various utility functions specific to single-cell omics data analysis (e.g., aggregation of cell-level data to pseudobulks).
Maintained by Stefano Mangiola. Last updated 5 months ago.
assaydomaininfrastructurernaseqdifferentialexpressionsinglecellgeneexpressionnormalizationclusteringqualitycontrolsequencingbioconductordplyrggplot2plotlysingle-cell-rna-seqsingle-cell-sequencingsinglecellexperimenttibbletidyrtidyverse
9.0 match 36 stars 8.86 score 125 scripts 2 dependents
bioc
iPAC:Identification of Protein Amino acid Clustering
iPAC is a novel tool to identify somatic amino acid mutation clustering within proteins while taking into account protein structure.
Maintained by Gregory Ryslik. Last updated 2 days ago.
14.2 match 5.56 score 4 scripts 3 dependents
bioc
SparseSignatures:SparseSignatures
Point mutations occurring in a genome can be divided into 96 categories based on the base being mutated, the base it is mutated into and its two flanking bases. Therefore, for any patient, it is possible to represent all the point mutations occurring in that patient's tumor as a vector of length 96, where each element represents the count of mutations for a given category in the patient. A mutational signature represents the pattern of mutations produced by a mutagen or mutagenic process inside the cell. Each signature can also be represented by a vector of length 96, where each element represents the probability that this particular mutagenic process generates a mutation of the 96 above mentioned categories. In this R package, we provide a set of functions to extract and visualize the mutational signatures that best explain the mutation counts of a large number of patients.
Maintained by Luca De Sano. Last updated 5 months ago.
biomedicalinformaticssomaticmutation
12.0 match 11 stars 6.42 score 4 scripts
zhangrenl
geneHapR:Gene Haplotype Statistics, Phenotype Association and Visualization
Import genome variant data and perform gene haplotype statistics, visualization and phenotype association with 'R'.
Maintained by Zhang Renliang. Last updated 6 months ago.
nucleosomepositioningdataimport
14.6 match 13 stars 5.11 score 8 scripts
r-forge
ROI:R Optimization Infrastructure
The R Optimization Infrastructure ('ROI') <doi:10.18637/jss.v094.i15> is a sophisticated framework for handling optimization problems in R. Additional information can be found on the 'ROI' homepage <http://roi.r-forge.r-project.org/>.
Maintained by Stefan Theussl. Last updated 2 years ago.
9.4 match 7.68 score 506 scripts 47 dependents
bioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), MONSTER (estimation of network transition states). In addition, YARN allows processing gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 8 days ago.
networkinferencenetworkgeneregulationgeneexpressiontranscriptionmicroarraygraphandnetworkgene-regulatory-networktranscription-factors
8.9 match 105 stars 7.98 score
bioc
DriverNet:Drivernet: uncovering somatic driver mutations modulating transcriptional networks in cancer
DriverNet is a package to predict functionally important driver genes in cancer by integrating genome data (mutation and copy number variation data) and transcriptome data (gene expression data). The different kinds of data are combined by an influence graph, which is a gene-gene interaction network deduced from pathway data. A greedy algorithm is used to find the possible driver genes, which may be mutated in a larger number of patients, and these mutations will push the gene expression values of the connected genes to some extreme values.
Maintained by Jiarui Ding. Last updated 5 months ago.
16.4 match 4.30 score 7 scripts
bioc
signeR:Empirical Bayesian approach to mutational signature discovery
The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features as well. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.
Maintained by Renan Valieris. Last updated 5 months ago.
genomicvariationsomaticmutationstatisticalmethodvisualizationbioconductorbioinformaticsopenblascpp
9.2 match 13 stars 7.67 score 22 scripts
bioc
GenomAutomorphism:Compute the automorphisms between DNA's Abelian group representations
This is an R package to compute the automorphisms between pairwise aligned DNA sequences represented as elements from a Genomic Abelian group. In a general scenario, anything from genomic regions up to the whole genomes from a given population (from any species or closely related species) can be algebraically represented as a direct sum of cyclic groups or, more specifically, Abelian p-groups. Basically, we propose the representation of multiple sequence alignments of length N bp as elements of a finite Abelian group created by the direct sum of homocyclic Abelian groups of prime-power order.
Maintained by Robersy Sanchez. Last updated 3 months ago.
mathematicalbiologycomparativegenomicsfunctionalgenomicsmultiplesequencealignmentwholegenomegenetic-codegenetic-code-algebragenomegenome-algebra
16.1 match 4.30 score 9 scripts
openbiox
UCSCXenaShiny:Interactive Analysis of UCSC Xena Data
Provides functions and a Shiny application for downloading, analyzing and visualizing datasets from UCSC Xena (<http://xena.ucsc.edu/>), which is a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others.
Maintained by Shixiang Wang. Last updated 4 months ago.
cancer-datasetshiny-appsucsc-xena
8.1 match 96 stars 8.54 score 35 scripts
elbersb
tidylog:Logging for 'dplyr' and 'tidyr' Functions
Provides feedback about 'dplyr' and 'tidyr' operations.
Maintained by Benjamin Elbers. Last updated 9 months ago.
dplyrtidyrtidyversewrapper-functions
6.7 match 593 stars 10.23 score 1.7k scripts
talegari
tidier:Enhanced 'mutate'
Provides 'Apache Spark' style window aggregation for R dataframes and remote 'dbplyr' tables via 'mutate' in 'dplyr' flavour.
Maintained by Srikanth Komala Sheshachala. Last updated 2 years ago.
dbplyrdplyrmutatespark-sqltidyverse
21.3 match 3 stars 3.18 score 1 scripts
bioc
bioCancer:Interactive Multi-Omics Cancers Data Visualization and Analysis
This package is a Shiny App to visualize and analyse interactively Multi-Assays of Cancer Genomic Data.
Maintained by Karim Mezhoud. Last updated 5 months ago.
guidatarepresentationnetworkmultiplecomparisonpathwaysreactomevisualizationgeneexpressiongenetargetanalysisbiocancer-interfacecancercancer-studiesrmarkdown
11.2 match 20 stars 5.95 score 7 scripts
dieghernan
tidyterra:'tidyverse' Methods and 'ggplot2' Helpers for 'terra' Objects
Extension of the 'tidyverse' for 'SpatRaster' and 'SpatVector' objects of the 'terra' package. It includes also new 'geom_' functions that provide a convenient way of visualizing 'terra' objects with 'ggplot2'.
Maintained by Diego Hernangómez. Last updated 8 hours ago.
terraggplot-extensionr-spatialrspatial
4.9 match 191 stars 13.62 score 1.9k scripts 25 dependents
bioc
SigsPack:Mutational Signature Estimation for Single Samples
Single sample estimation of exposure to mutational signatures. Exposures to known mutational signatures are estimated for single samples, based on quadratic programming algorithms. Bootstrapping the input mutational catalogues provides estimations on the stability of these exposures. The effect of the sequence composition of mutational context can be taken into account by normalising the catalogues.
Maintained by Franziska Schumann. Last updated 5 months ago.
somaticmutationsnpvariantannotationbiomedicalinformaticsdnaseq
15.3 match 2 stars 4.30 score 4 scripts
bdj34
cloneRate:Estimate Growth Rates from Phylogenetic Trees
Quickly estimate the net growth rate of a population or clone whose growth can be approximated by a birth-death branching process. Input should be phylogenetic tree(s) of clone(s) with edge lengths corresponding to either time or mutations. Based on coalescent results in Johnson et al. (2023) <doi:10.1093/bioinformatics/btad561>. Simulation techniques as well as growth rate methods build on prior work from Lambert A. (2018) <doi:10.1016/j.tpb.2018.04.005> and Stadler T. (2009) <doi:10.1016/j.jtbi.2009.07.018>.
Maintained by Brian Johnson. Last updated 11 months ago.
13.0 match 4 stars 4.90 score 8 scripts
billdenney
PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis
Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.
Maintained by Bill Denney. Last updated 16 days ago.
ncanoncompartmental-analysispharmacokinetics
4.8 match 73 stars 12.61 score 214 scripts 4 dependents
bioc
SomaticSignatures:Somatic Signatures
The SomaticSignatures package identifies mutational signatures of single nucleotide variants (SNVs). It provides an infrastructure related to the methodology described in Nik-Zainal (2012, Cell), with flexibility in the matrix decomposition algorithms.
Maintained by Julian Gehring. Last updated 5 months ago.
sequencingsomaticmutationvisualizationclusteringgenomicvariationstatisticalmethod
8.8 match 22 stars 6.85 score 54 scripts 1 dependents
stan-dev
posterior:Tools for Working with Posterior Distributions
Provides useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to: (a) Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions. (b) Provide consistent methods for operations commonly performed on draws, for example, subsetting, binding, or mutating draws. (c) Provide various summaries of draws in convenient formats. (d) Provide lightweight implementations of state of the art posterior inference diagnostics. References: Vehtari et al. (2021) <doi:10.1214/20-BA1221>.
Maintained by Paul-Christian Bürkner. Last updated 10 days ago.
3.7 match 168 stars 16.13 score 3.3k scripts 342 dependents
bioc
scMitoMut:Single-cell Mitochondrial Mutation Analysis Tool
This package is designed for calling lineage-informative mitochondrial mutations using single-cell sequencing data, such as scRNASeq and scATACSeq (preferably the latter due to RNA editing issues). It includes functions for mutation calling and visualization. Mutation calling is done using beta-binomial distribution.
Maintained by Wenjie Sun. Last updated 3 months ago.
preprocessingsequencingsinglecellopenblascpp
12.2 match 2 stars 4.90 score 5 scripts
martinzaefferer
CEGO:Combinatorial Efficient Global Optimization
Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces.
Maintained by Martin Zaefferer. Last updated 2 months ago.
19.4 match 1 stars 3.04 score 73 scripts
bioc
LACE:Longitudinal Analysis of Cancer Evolution (LACE)
LACE is an algorithmic framework that processes single-cell somatic mutation profiles from cancer samples collected at different time points and in distinct experimental settings, to produce longitudinal models of cancer evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a weighted likelihood function computed on multiple time points.
Maintained by Davide Maspero. Last updated 5 months ago.
biomedicalinformaticssinglecellsomaticmutation
7.7 match 15 stars 7.65 score 3 scripts
plotly
plotly:Create Interactive Web Graphics via 'plotly.js'
Create interactive web graphics from 'ggplot2' graphs and/or a custom interface to the (MIT-licensed) JavaScript library 'plotly.js' inspired by the grammar of graphics.
Maintained by Carson Sievert. Last updated 3 months ago.
d3jsdata-visualizationggplot2javascriptplotlyshinywebgl
3.0 match 2.6k stars 19.43 score 93k scripts 797 dependents
dreamrs
shinyWidgets:Custom Inputs Widgets for Shiny
Collection of custom input controls and user interface components for 'Shiny' applications. Give your applications a unique and colorful style!
Maintained by Victor Perrier. Last updated 11 days ago.
3.4 match 849 stars 17.05 score 8.1k scripts 218 dependents
kharchenkolab
numbat:Haplotype-Aware CNV Analysis from scRNA-Seq
A computational method that infers copy number variations (CNVs) in cancer scRNA-seq data and reconstructs the tumor phylogeny. 'numbat' integrates signals from gene expression, allelic ratio, and population haplotype structures to accurately infer allele-specific CNVs in single cells and reconstruct their lineage relationship. 'numbat' can be used to: 1. detect allele-specific copy number variations from single cells; 2. differentiate tumor versus normal cells in the tumor microenvironment; 3. infer the clonal architecture and evolutionary history of profiled tumors. 'numbat' does not require tumor/normal-paired DNA or genotype data, but operates solely on the donor scRNA-seq data (for example, 10x Cell Ranger output). Additional examples and documentation are available at <https://kharchenkolab.github.io/numbat/>. For details on the method please see Gao et al. Nature Biotechnology (2022) <doi:10.1038/s41587-022-01468-y>.
Maintained by Teng Gao. Last updated 16 days ago.
cancer-genomicscnv-detectionlineage-tracingphylogenysingle-cellsingle-cell-analysissingle-cell-rna-seqspatial-transcriptomicscpp
7.8 match 179 stars 7.41 score 120 scripts
lightbluetitan
OncoDataSets:A Comprehensive Collection of Cancer Types and Cancer-related DataSets
Offers a rich collection of data focused on cancer research, covering survival rates, genetic studies, biomarkers, and epidemiological insights. Designed for researchers, analysts, and bioinformatics practitioners, the package includes datasets on various cancer types such as melanoma, leukemia, breast, ovarian, and lung cancer, among others. It aims to facilitate advanced research, analysis, and understanding of cancer epidemiology, genetics, and treatment outcomes.
Maintained by Renzo Caceres Rossi. Last updated 3 months ago.
13.7 match 3 stars 4.18 score 6 scripts
nathaneastwood
poorman:A Poor Man's Dependency Free Recreation of 'dplyr'
A replication of key functionality from 'dplyr' and the wider 'tidyverse' using only 'base'.
Maintained by Nathan Eastwood. Last updated 1 years ago.
base-rdata-manipulationgrammar
5.3 match 341 stars 10.79 score 156 scripts 27 dependents
bioc
QSutils:Quasispecies Diversity
Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and exploration: functions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indices: functions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulation: functions useful for generating random viral quasispecies data.
Maintained by Mercedes Guerrero-Murillo. Last updated 5 months ago.
softwaregeneticsdnaseqgeneticvariabilitysequencingalignmentsequencematchingdataimport
9.8 match 5.56 score 8 scripts 1 dependents
thackl
gggenomes:A Grammar of Graphics for Comparative Genomics
An extension of 'ggplot2' for creating complex genomic maps. It builds on the power of 'ggplot2' and 'tidyverse' adding new 'ggplot2'-style geoms & positions and 'dplyr'-style verbs to manipulate the underlying data. It implements a layout concept inspired by 'ggraph' and introduces tracks to bring tidiness to the mess that is genomics data.
Maintained by Thomas Hackl. Last updated 1 months ago.
biological-datacomparative-genomicsgenomics-visualizationggplot-extensionggplot2
5.6 match 650 stars 9.56 score 123 scripts
bioc
cardelino:Clone Identification from Single Cell Data
Methods to infer clonal tree configuration for a population of cells using single-cell RNA-seq data (scRNA-seq), and possibly other data modalities. Methods are also provided to assign cells to inferred clones and explore differences in gene expression between clones. These methods can flexibly integrate information from imperfect clonal trees inferred based on bulk exome-seq data, and sparse variant alleles expressed in scRNA-seq data. A flexible beta-binomial error model that accounts for stochastic dropout events as well as systematic allelic imbalance is used.
Maintained by Davis McCarthy. Last updated 5 months ago.
singlecellrnaseqvisualizationtranscriptomicsgeneexpressionsequencingsoftwareexomeseqclonal-clusteringgibbs-samplingscrna-seqsingle-cellsomatic-mutations
7.5 match 61 stars 7.05 score 62 scripts
mikldk
malan:MAle Lineage ANalysis
MAle Lineage ANalysis by simulating genealogies backwards and imposing short tandem repeats (STR) mutations forwards. Intended for forensic Y chromosomal STR (Y-STR) haplotype analyses. Numerous analyses are possible, e.g. number of matches and meiotic distance to matches. Refer to papers mentioned in citation("malan") (DOI's: <doi:10.1371/journal.pgen.1007028>, <doi:10.21105/joss.00684> and <doi:10.1016/j.fsigen.2018.10.004>).
Maintained by Mikkel Meyer Andersen. Last updated 1 years ago.
11.8 match 4.48 score 6 scripts
bioc
TRONCO:TRONCO, an R package for TRanslational ONCOlogy
The TRONCO (TRanslational ONCOlogy) R package collects algorithms to infer progression models via the approach of Suppes-Bayes Causal Network, both from an ensemble of tumors (cross-sectional samples) and within an individual patient (multi-region or single-cell samples). The package provides parallel implementation of algorithms that process binary matrices where each row represents a tumor sample and each column a single-nucleotide or a structural variant driving the progression; a 0/1 value models the absence/presence of that alteration in the sample. The tool can import data from plain, MAF or GISTIC format files, and can fetch it from the cBioPortal for cancer genomics. Functions for data manipulation and visualization are provided, as well as functions to import/export such data to other bioinformatics tools for, e.g, clustering or detection of mutually exclusive alterations. Inferred models can be visualized and tested for their confidence via bootstrap and cross-validation. TRONCO is used for the implementation of the Pipeline for Cancer Inference (PICNIC).
Maintained by Luca De Sano. Last updated 5 months ago.
biomedicalinformaticsbayesiangraphandnetworksomaticmutationnetworkinferencenetworkclusteringdataimportsinglecellimmunooncologyalgorithmscancer-inferencetumors
8.0 match 30 stars 6.50 score 38 scripts
emmanuelparadis
pegas:Population and Evolutionary Genetics Analysis System
Functions for reading, writing, plotting, analysing, and manipulating allelic and haplotypic data, including from VCF files, and for the analysis of population nucleotide sequences and micro-satellites including coalescent analyses, linkage disequilibrium, population structure (Fst, Amova) and equilibrium (HWE), haplotype networks, minimum spanning tree and network, and median-joining networks.
Maintained by Emmanuel Paradis. Last updated 1 years ago.
6.9 match 7.53 score 576 scripts 18 dependents
rdinnager
slimr:Create, Run and Post-Process 'SLiM' Population Genetics Forward Simulations
Lets you write 'SLiM' scripts (population genomics simulation) using your favourite R IDE, using a syntax as close as possible to the original 'SLiM' language. It offers many tools to manipulate those scripts, run them in the 'SLiM' software from R, and capture and post-process their output, after or even during a simulation.
Maintained by Russell Dinnage. Last updated 4 months ago.
11.0 match 8 stars 4.70 score 42 scripts
bioc
cbpManager:Generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics
This R package provides an R Shiny application that enables the user to generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics. Create cancer studies and edit its metadata. Upload mutation data of a patient that will be concatenated to the data_mutation_extended.txt file of the study. Create and edit clinical patient data, sample data, and timeline data. Create custom timeline tracks for patients.
Maintained by Arsenij Ustjanzew. Last updated 5 months ago.
immunooncologydataimportdatarepresentationguithirdpartyclientpreprocessingvisualizationcancer-genomicscbioportalclinical-datafilegeneratormutation-datapatient-data
9.3 match 8 stars 5.51 score 1 scripts
fcampelo
ExpDE:Modular Differential Evolution for Experimenting with Operators
Modular implementation of the Differential Evolution algorithm for experimenting with different types of operators.
Maintained by Felipe Campelo. Last updated 6 years ago.
13.9 match 2 stars 3.70 score 25 scripts
gaynorr
AlphaSimR:Breeding Program Simulations
The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) <doi:10.3835/plantgenome2016.02.0013>]. Used for stochastic simulations of breeding programs to the level of DNA sequence for every individual. Contained is a wide range of functions for modeling common tasks in a breeding program, such as selection and crossing. These functions allow for constructing simulations of highly complex plant and animal breeding programs via scripting in the R software environment. Such simulations can be used to evaluate overall breeding program performance and conduct research into breeding program design, such as implementation of genomic selection. Included is the 'Markovian Coalescent Simulator' ('MaCS') for fast simulation of biallelic sequences according to a population demographic history [Chen et al. (2009) <doi:10.1101/gr.083634.108>].
Maintained by Chris Gaynor. Last updated 5 months ago.
breedinggenomicssimulationopenblascppopenmp
5.0 match 47 stars 10.22 score 534 scripts 2 dependents
bioc
clusterProfiler:A universal enrichment tool for interpreting omics data
This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions.
Maintained by Guangchuang Yu. Last updated 4 months ago.
annotationclusteringgenesetenrichmentgokeggmultiplecomparisonpathwaysreactomevisualizationenrichment-analysisgsea
3.0 match 1.1k stars 17.03 score 11k scripts 48 dependents
ddsjoberg
gtsummary:Presentation-Ready Data Summary and Analytic Result Tables
Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.
Maintained by Daniel D. Sjoberg. Last updated 2 days ago.
easy-to-usegthtml5regression-modelsreproducibilityreproducible-researchstatisticssummary-statisticssummary-tablestable1tableone
3.0 match 1.1k stars 17.00 score 8.2k scripts 15 dependents
hanjunwei-lab
pathwayTMB:Pathway Based Tumor Mutational Burden
A systematic bioinformatics tool to develop a new pathway-based gene panel for tumor mutational burden (TMB) assessment (pathway-based tumor mutational burden, PTMB), using somatic mutation files in an efficient manner from either The Cancer Genome Atlas sources or any in-house studies, as long as the data is in mutation annotation file (MAF) format. In addition, we develop a multiple machine learning method that uses the samples' PTMB profiles to identify cancer-specific dysfunction pathways, which can serve as prognostic and predictive biomarkers for cancer immunotherapy.
Maintained by Junwei Han. Last updated 3 years ago.
20.2 match 2.48 score 2 scripts 1 dependents
kassambara
ggpubr:'ggplot2' Based Publication Ready Plots
The 'ggplot2' package is excellent and flexible for elegant data visualization in R. However, the default generated plots require some formatting before we can send them for publication. Furthermore, to customize a 'ggplot', the syntax is opaque and this raises the level of difficulty for researchers with no advanced R programming skills. 'ggpubr' provides some easy-to-use functions for creating and customizing 'ggplot2'-based publication ready plots.
Maintained by Alboukadel Kassambara. Last updated 2 years ago.
3.0 match 1.2k stars 16.68 score 65k scripts 409 dependents
bioc
FLAMES:FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data
Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. Flames provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.
Maintained by Changqing Wang. Last updated 5 days ago.
rnaseqsinglecelltranscriptomicsdataimportdifferentialsplicingalternativesplicinggeneexpressionlongreadzlibcurlbzip2xz-utilscpp
6.2 match 31 stars 7.95 score 12 scripts
bioc
RESOLVE:RESOLVE: An R package for the efficient analysis of mutational signatures from cancer genomes
Cancer is a genetic disease caused by somatic mutations in genes controlling key biological functions such as cellular growth and division. Such mutations may arise both through cell-intrinsic and exogenous processes, generating characteristic mutational patterns over the genome named mutational signatures. The study of mutational signatures has become a standard component of modern genomics studies, since it can reveal which (environmental and endogenous) mutagenic processes are active in a tumor, and may highlight markers for therapeutic response. Computational analysis of mutational signatures presents many pitfalls. First, the task of determining the number of signatures is very complex and depends on heuristics. Second, several signatures have no clear etiology, raising the possibility that they are computational artifacts rather than the result of mutagenic processes. Last, approaches for signature assignment are greatly influenced by the set of signatures used for the analysis. To overcome these limitations, we developed RESOLVE (Robust EStimation Of mutationaL signatures Via rEgularization), a framework that allows the efficient extraction and assignment of mutational signatures. RESOLVE implements a novel algorithm that enables (i) the efficient extraction, (ii) exposure estimation, and (iii) confidence assessment during the computational inference of mutational signatures.
Maintained by Luca De Sano. Last updated 5 months ago.
biomedicalinformaticssomaticmutation
10.7 match 1 stars 4.60 score 3 scripts
hojsgaard
doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities
Utility package containing: 1) Facilities for working with grouped data: 'do' something to data stratified 'by' some variables. 2) LSmeans (least-squares means), general linear estimates. 3) Restrict functions to a smaller domain. 4) Miscellaneous other utilities.
Maintained by Søren Højsgaard. Last updated 4 days ago.
3.3 match 1 stars 14.94 score 3.2k scripts 939 dependents
bioc
supersigs:Supervised mutational signatures
Generate SuperSigs (supervised mutational signatures) from single nucleotide variants in the cancer genome. Functions included in the package allow the user to learn supervised mutational signatures from their data and apply them to new data. The methodology is based on the one described in Afsari (2021, ELife).
Maintained by Albert Kuo. Last updated 5 months ago.
featureextractionclassificationregressionsequencingwholegenomesomaticmutation
9.9 match 3 stars 4.78 score 3 scripts
kharchenkolab
scistreer:Maximum-Likelihood Perfect Phylogeny Inference at Scale
Fast maximum-likelihood phylogeny inference from noisy single-cell data using the 'ScisTree' algorithm by Yufeng Wu (2019) <doi:10.1093/bioinformatics/btz676>. 'scistreer' provides an 'R' interface and improves speed via 'Rcpp' and 'RcppParallel', making the method applicable to massive single-cell datasets (>10,000 cells).
Maintained by Teng Gao. Last updated 2 years ago.
evolutionphylogeneticssingle-cellcpp
11.8 match 7 stars 4.02 score 2 scripts 1 dependents
asardaes
table.express:Build 'data.table' Expressions with Data Manipulation Verbs
A specialization of 'dplyr' data manipulation verbs that parse and build expressions which are ultimately evaluated by 'data.table', letting it handle all optimizations. A set of additional verbs is also provided to facilitate some common operations on a subset of the data.
Maintained by Alexis Sarda-Espinosa. Last updated 2 years ago.
8.0 match 65 stars 5.81 score 8 scripts
bioc
SynMut:SynMut: Designing Synonymously Mutated Sequences with Different Genomic Signatures
There are increasing demands on designing virus mutants with specific dinucleotide or codon composition. This tool can take both dinucleotide preference and/or codon usage bias into account while designing mutants. It is a powerful tool for in silico designs of DNA sequence mutants.
Maintained by Haogao Gu. Last updated 5 months ago.
sequencematchingexperimentaldesignpreprocessing
10.8 match 2 stars 4.30 score 1 scripts
bioc
CIMICE:CIMICE-R: (Markov) Chain Method to Inferr Cancer Evolution
CIMICE is a tool in the field of tumor phylogenetics and its goal is to build a Markov Chain (called Cancer Progression Markov Chain, CPMC) in order to model tumor subtypes evolution. The input of CIMICE is a Mutational Matrix, that is, a boolean matrix representing altered genes in a collection of samples. These samples are assumed to be obtained with single-cell DNA analysis techniques and the tool is specifically written to use the peculiarities of this data for the CPMC construction.
Maintained by Nicolò Rossi. Last updated 5 months ago.
softwarebiologicalquestionnetworkinferenceresearchfieldphylogeneticsstatisticalmethodgraphandnetworktechnologysinglecell
10.8 match 4.30 score 5 scripts
kassambara
rstatix:Pipe-Friendly Framework for Basic Statistical Tests
Provides a simple and intuitive pipe-friendly framework, coherent with the 'tidyverse' design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses. The output of each test is automatically transformed into a tidy data frame to facilitate visualization. Additional functions are available for reshaping, reordering, manipulating and visualizing correlation matrix. Functions are also included to facilitate the analysis of factorial experiments, including purely 'within-Ss' designs (repeated measures), purely 'between-Ss' designs, and mixed 'within-and-between-Ss' designs. It's also possible to compute several effect size metrics, including "eta squared" for ANOVA, "Cohen's d" for t-test and 'Cramer V' for the association between categorical variables. The package contains helper functions for identifying univariate and multivariate outliers, assessing normality and homogeneity of variances.
Maintained by Alboukadel Kassambara. Last updated 2 years ago.
3.0 match 456 stars 15.16 score 11k scripts 420 dependents
cynkra
dm:Relational Data Models
Provides tools for working with multiple related tables, stored as data frames or in a relational database. Multiple tables (data and metadata) are stored in a compound object, which can then be manipulated with a pipe-friendly syntax.
Maintained by Kirill Müller. Last updated 2 months ago.
data-modeldata-warehousingdatawarehousingdbidbplyrrelational-databases
3.0 match 511 stars 14.81 score 410 scripts 8 dependents
thomasp85
tidygraph:A Tidy API for Graph Manipulation
A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, as well as provides tidy interfaces to a lot of common graph algorithms.
Maintained by Thomas Lin Pedersen. Last updated 1 months ago.
graph-algorithmsgraph-manipulationigraphnetwork-analysistidyversecpp
3.0 match 553 stars 14.74 score 4.6k scripts 136 dependents
skranz
dplyrExtras:extra functionality for dplyr like mutate_rows for mutation of a subset of rows
Some extra functionality that is not (yet) in dplyr, e.g. mutate_rows (mutation of a subset of rows), xsummarise_each (summarise_each with more flexible alignment of results), or s_filter, s_arrange, ... that allow string arguments.
Maintained by Sebastian Kranz. Last updated 5 years ago.
9.1 match 20 stars 4.85 score 59 scripts 4 dependents
bioc
plasmut:Stratifying mutations observed in cell-free DNA and white blood cells as germline, hematopoietic, or somatic
A Bayesian method for quantifying the likelihood that a given plasma mutation arises from clonal hematopoiesis or the underlying tumor. It requires sequencing data of the mutation in plasma and white blood cells with the number of distinct and mutant reads in both tissues. We implement a Monte Carlo importance sampling method to assess the likelihood that a mutation arises from the tumor relative to non-tumor origin.
Maintained by Adith Arun. Last updated 5 months ago.
bayesiansomaticmutationgermlinemutationsequencing
10.9 match 4.00 score 2 scripts
bioc
compSPOT:compSPOT: Tool for identifying and comparing significantly mutated genomic hotspots
Clonal cell groups share common mutations within cancer, precancer, and even clinically normal appearing tissues. The frequency and location of these mutations may predict prognosis and cancer risk. It has also been well established that certain genomic regions have increased sensitivity to acquiring mutations. Mutation-sensitive genomic regions may therefore serve as markers for predicting cancer risk. This package contains multiple functions to establish significantly mutated hotspots, compare hotspot mutation burden between samples, and perform exploratory data analysis of the correlation between hotspot mutation burden and personal risk factors for cancer, such as age, gender, and history of carcinogen exposure. This package allows users to identify robust genomic markers to help establish cancer risk.
Maintained by Sydney Grant. Last updated 5 months ago.
softwaretechnologysequencingdnaseqwholegenomeclassificationsinglecellsurvivalmultiplecomparison
10.7 match 4.00 score 3 scripts
luca-scr
GA:Genetic Algorithms
Flexible general-purpose toolbox implementing genetic algorithms (GAs) for stochastic optimisation. Binary, real-valued, and permutation representations are available to optimize a fitness function, i.e. a function provided by users depending on their objective function. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performances. Local search using general-purpose optimisation algorithms can be applied stochastically to exploit interesting regions. GAs can be run sequentially or in parallel, using an explicit master-slave parallelisation or a coarse-grain islands approach. For more details see Scrucca (2013) <doi:10.18637/jss.v053.i04> and Scrucca (2017) <doi:10.32614/RJ-2017-008>.
Maintained by Luca Scrucca. Last updated 6 months ago.
genetic-algorithmoptimisationcpp
3.7 match 93 stars 11.58 score 624 scripts 52 dependents
gergness
srvyr:'dplyr'-Like Syntax for Summary Statistics of Survey Data
Use piping, verbs like 'group_by' and 'summarize', and other 'dplyr' inspired syntactic style when calculating summary statistics on survey data using functions from the 'survey' package.
Maintained by Greg Freedman Ellis. Last updated 1 months ago.
3.0 match 215 stars 13.88 score 1.8k scripts 15 dependents
jakobbossek
mcMST:A Toolbox for the Multi-Criteria Minimum Spanning Tree Problem
Algorithms to approximate the Pareto-front of multi-criteria minimum spanning tree problems.
Maintained by Jakob Bossek. Last updated 2 years ago.
evolutionary-algorithmsmcmstminimum-spanning-treesmulti-objective-optimizationspanningtrees
8.7 match 4 stars 4.73 score 27 scripts
bioc
SUITOR:Selecting the number of mutational signatures through cross-validation
An unsupervised cross-validation method to select the optimal number of mutational signatures. A data set of mutational counts is split into training and validation data. Signatures are estimated in the training data and then used to predict the mutations in the validation data.
Maintained by Bill Wheeler. Last updated 5 months ago.
geneticssoftwaresomaticmutation
10.1 match 4.00 score 2 scripts
statgenlmu
coala:A Framework for Coalescent Simulation
Coalescent simulators can rapidly simulate biological sequences evolving according to a given model of evolution. You can use this package to specify such models, to conduct the simulations and to calculate additional statistics from the results (Staab, Metzler, 2016 <doi:10.1093/bioinformatics/btw098>). It relies on existing simulators for doing the simulation, and currently supports the programs 'ms', 'msms' and 'scrm'. It also supports finite-sites mutation models by combining the simulators with the program 'seq-gen'. Coala provides functions for calculating certain summary statistics, which can also be applied to actual biological data. One possibility to import data is through the 'PopGenome' package (<https://github.com/pievos101/PopGenome>).
Maintained by Dirk Metzler. Last updated 1 year ago.
coalescentdnaevolutionpopgensimulationcpp
5.7 match 23 stars 7.06 score 84 scripts
bioc
survClust:Identification Of Clinically Relevant Genomic Subtypes Using Outcome Weighted Learning
survClust is an outcome weighted integrative clustering algorithm used to classify multi-omic samples based on their available time-to-event information. The resulting clusters are cross-validated to avoid overfitting and output a classification of samples that are molecularly distinct and clinically meaningful. It takes in binary (mutation) as well as continuous data (other omic types).
Maintained by Arshi Arora. Last updated 5 months ago.
softwareclusteringsurvivalclassificationcpp
8.5 match 16 stars 4.74 score 17 scripts
yulab-smu
tidytree:A Tidy Tool for Phylogenetic Tree Data Manipulation
A phylogenetic tree generally contains multiple components including node, edge, branch and associated data. 'tidytree' provides an approach to convert a tree object to a tidy data frame and provides tidy interfaces to manipulate tree data.
Maintained by Guangchuang Yu. Last updated 8 months ago.
phylogenetic-treetidyversetree-data
3.0 match 54 stars 13.25 score 584 scripts 128 dependents
bioc
tidySpatialExperiment:SpatialExperiment with tidy principles
tidySpatialExperiment provides a bridge between the SpatialExperiment package and the tidyverse ecosystem. It creates an invisible layer that allows you to interact with a SpatialExperiment object as if it were a tibble, enabling the use of functions from dplyr, tidyr, ggplot2 and plotly. Underneath, your data remains a SpatialExperiment object.
Maintained by William Hutchison. Last updated 5 months ago.
infrastructurernaseqgeneexpressionsequencingspatialtranscriptomicssinglecell
6.8 match 6 stars 5.88 score 12 scripts
hanjunwei-lab
ProgModule:Identification of Prognosis-Related Mutually Exclusive Modules
A novel tool to identify candidate driver modules for predicting the prognosis of patients by integrating exclusive coverage of mutations with clinical characteristics in cancer.
Maintained by Junwei Han. Last updated 3 months ago.
10.3 match 3.70 score 1 script
bioc
plyranges:A fluent interface for manipulating GenomicRanges
A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes, their accessibility for new Bioconductor users is hopefully increased.
Maintained by Michael Love. Last updated 5 months ago.
infrastructuredatarepresentationworkflowstepcoveragebioconductordata-analysisdplyrgenomic-rangesgenomicstidy-data
3.0 match 143 stars 12.60 score 1.9k scripts 20 dependents
ropensci
skimr:Compact and Flexible Summaries of Data
A simple to use summary function that can be used with pipes and displays nicely in the console. The default summary statistics may be modified by the user as can the default formatting. Support for data frames and vectors is included, and users can implement their own skim methods for specific object types as described in a vignette. Default summaries include support for inline spark graphs. Instructions for managing these on specific operating systems are given in the "Using skimr" vignette and the README.
Maintained by Elin Waring. Last updated 2 months ago.
peer-reviewedropenscisummary-statisticsunconfunconf17
2.3 match 1.1k stars 16.80 score 18k scripts 14 dependents
danymukesha
genetic.algo.optimizeR:Genetic Algorithm Optimization
Genetic algorithms are a class of optimization algorithms inspired by the process of natural selection and genetics. This package is for learning purposes and allows users to optimize various functions or parameters by mimicking biological evolution processes such as selection, crossover, and mutation. Ideal for tasks like machine learning parameter tuning, mathematical function optimization, and solving an optimization problem that involves finding the best solution in a discrete space.
Maintained by Dany Mukesha. Last updated 5 months ago.
8.8 match 4.30 score 10 scripts
complexgenome
GARCOM:Gene and Region Counting of Mutations ("GARCOM")
The Gene and Region Counting of Mutations (GARCOM) package computes mutation (or allele) counts per gene per individual based on gene annotation or genomic base pair boundaries. It accepts data in plink (.raw) and VCF formats. It gives users the flexibility to extract and filter individuals, mutations and genes of interest.
Maintained by Sanjeev Sariya. Last updated 2 years ago.
13.9 match 2.70 score 2 scripts
bioc
goProfiles:goProfiles: an R package for the statistical analysis of functional profiles
The package implements methods to compare lists of genes based on comparing the corresponding 'functional profiles'.
Maintained by Alex Sanchez. Last updated 5 months ago.
annotationgogeneexpressiongenesetenrichmentgraphandnetworkmicroarraymultiplecomparisonpathwayssoftware
6.9 match 5.48 score 6 scripts 1 dependent
bioc
mslp:Predict synthetic lethal partners of tumour mutations
An integrated pipeline to predict the potential synthetic lethality partners (SLPs) of tumour mutations, based on gene expression, mutation profiling and cell line genetic screens data. It has built-in support for data from cBioPortal. The primary SLPs correlating with mutations in WT and compensating for the loss of function of mutations are predicted by random forest based methods (GENIE3) and Rank Products, respectively. Genetic screens are employed to identify consensus SLPs that lead to reduced cell viability when perturbed.
Maintained by Chunxuan Shao. Last updated 5 months ago.
pharmacogeneticspharmacogenomics
11.3 match 3.30 score 1 script
bioc
GenomicDataCommons:NIH / NCI Genomic Data Commons Access
Programmatically access the NIH / NCI Genomic Data Commons RESTful service.
Maintained by Sean Davis. Last updated 1 month ago.
dataimportsequencingapi-clientbioconductorbioinformaticscancercore-servicesdata-sciencegenomicsncitcgavignette
3.1 match 87 stars 11.94 score 238 scripts 12 dependents
cobrbra
ICBioMark:Data-Driven Design of Targeted Gene Panels for Estimating Immunotherapy Biomarkers
Implementation of the methodology proposed in 'Data-driven design of targeted gene panels for estimating immunotherapy biomarkers', Bradley and Cannings (2021) <arXiv:2102.04296>. This package allows the user to fit generative models of mutation from an annotated mutation dataset, and then to produce tunable linear estimators of exome-wide biomarkers. It also contains functions to simulate Mutation Annotation Format (MAF) data, as well as to analyse the output and performance of models.
Maintained by Jacob R. Bradley. Last updated 2 years ago.
13.8 match 2.70 score 2 scripts
bioc
CrispRVariants:Tools for counting and visualising mutations in a target location
CrispRVariants provides tools for analysing the results of a CRISPR-Cas9 mutagenesis sequencing experiment, or other sequencing experiments where variants within a given region are of interest. These tools allow users to localize variant allele combinations with respect to any genomic location (e.g. the Cas9 cut site), plot allele combinations and calculate mutation rates with flexible filtering of unrelated variants.
Maintained by Helen Lindsay. Last updated 5 months ago.
immunooncologycrisprgenomicvariationvariantdetectiongeneticvariabilitydatarepresentationvisualizationsequencing
6.8 match 5.51 score 32 scripts
bioc
GraphPAC:Identification of Mutational Clusters in Proteins via a Graph Theoretical Approach.
Identifies mutational clusters of amino acids in a protein while utilizing the protein's tertiary structure via a graph theoretical model.
Maintained by Gregory Ryslik. Last updated 2 days ago.
8.0 match 4.65 score 1 script 1 dependent
r-spatial
stars:Spatiotemporal Arrays, Raster and Vector Data Cubes
Reading, manipulating, writing and plotting spatiotemporal arrays (raster and vector data cubes) in 'R', using 'GDAL' bindings provided by 'sf', and 'NetCDF' bindings by 'ncmeta' and 'RNetCDF'.
Maintained by Edzer Pebesma. Last updated 30 days ago.
2.0 match 568 stars 18.26 score 7.2k scripts 135 dependents
bioc
SpacePAC:Identification of Mutational Clusters in 3D Protein Space via Simulation.
Identifies clustering of somatic mutations in proteins via a simulation approach while considering the protein's tertiary structure.
Maintained by Gregory Ryslik. Last updated 2 days ago.
7.8 match 4.65 score 2 scripts 1 dependent
bioc
QuartPAC:Identification of mutational clusters in protein quaternary structures
Identifies clustering of somatic mutations in proteins over the entire quaternary structure.
Maintained by Gregory Ryslik. Last updated 2 days ago.
clusteringproteomicssomaticmutation
10.0 match 3.60 score 2 scripts
green-striped-gecko
PopGenReport:A Simple Framework to Analyse Population and Landscape Genetic Data
Provides a beginner-friendly framework to analyse population genetic data. Based on 'adegenet' objects, it uses 'knitr' to create comprehensive reports on spatial genetic data. For detailed information on how to use the package, refer to the comprehensive tutorials or visit <http://www.popgenreport.org/>.
Maintained by Bernd Gruber. Last updated 1 year ago.
4.8 match 5 stars 7.27 score 82 scripts 1 dependent
hanjunwei-lab
SMDIC:Identification of Somatic Mutation-Driven Immune Cells
A computational tool to automatically identify somatic mutation-driven immune cells. The operation modes include: i) inferring the relative abundance matrix of tumor-infiltrating immune cells and integrating it with a particular gene mutation status, ii) detecting differential immune cells with respect to the gene mutation status and converting the abundance matrix of significant differential immune cells into two binary matrices (one for up-regulated and one for down-regulated), iii) identifying somatic mutation-driven immune cells by comparing the gene mutation status with each immune cell in the binary matrices across all samples, and iv) visualizing the immune cell abundance of samples in different mutation statuses.
Maintained by Junwei Han. Last updated 5 months ago.
8.6 match 2 stars 4.00 score 5 scripts
jthomasmock
gtExtras:Extending 'gt' for Beautiful HTML Tables
Provides additional functions for creating beautiful tables with 'gt'. The functions are generally wrappers around boilerplate, or add opinionated niche capabilities and helper functions.
Maintained by Thomas Mock. Last updated 12 months ago.
data-sciencedata-visualizationdatascienceggplot2gtplotssparklinesparkline-graphssparklinestables
3.0 match 199 stars 11.45 score 2.4k scripts 3 dependents
markfairbanks
tidytable:Tidy Interface to 'data.table'
A tidy interface to 'data.table', giving users the speed of 'data.table' while using tidyverse-like syntax.
Maintained by Mark Fairbanks. Last updated 2 months ago.
3.0 match 458 stars 11.41 score 732 scripts 10 dependents
rstudio
promises:Abstractions for Promise-Based Asynchronous Programming
Provides fundamental abstractions for doing asynchronous programming in R using promises. Asynchronous programming is useful for allowing a single R process to orchestrate multiple tasks in the background while also attending to something else. Semantics are similar to 'JavaScript' promises, but with a syntax that is idiomatic R.
Maintained by Joe Cheng. Last updated 1 month ago.
2.0 match 204 stars 17.10 score 688 scripts 2.6k dependents
ropensci
BaseSet:Working with Sets the Tidy Way
Implements a class and methods to work with sets, doing intersection, union, complementary sets, power sets, cartesian product and other set operations in a "tidy" way. These set operations are available for both classical sets and fuzzy sets. Import sets from several formats or from several other data structures.
Maintained by Lluís Revilla Sancho. Last updated 25 days ago.
bioconductorbioconductor-packagesets
6.0 match 11 stars 5.69 score 5 scripts
tidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 5 days ago.
1.8 match 584 stars 18.71 score 7.2k scripts 380 dependents
bioc
RCy3:Functions to Access and Control Cytoscape
Visualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.
Maintained by Alex Pico. Last updated 5 months ago.
visualizationgraphandnetworkthirdpartyclientnetwork
2.5 match 52 stars 13.39 score 628 scripts 15 dependents
uupharmacometrics
xpose:Diagnostics for Pharmacometric Models
Diagnostics for non-linear mixed-effects (population) models from 'NONMEM' <https://www.iconplc.com/solutions/technologies/nonmem/>. 'xpose' facilitates data import and the creation of numerical run summaries, and provides 'ggplot2'-based graphics for data exploration and model diagnostics.
Maintained by Benjamin Guiastrennec. Last updated 2 months ago.
diagnosticsggplot2nonmempharmacometricsxpose
3.0 match 62 stars 11.02 score 183 scripts 6 dependents
tidyverse
stringr:Simple, Consistent Wrappers for Common String Operations
A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.
Maintained by Hadley Wickham. Last updated 7 months ago.
1.5 match 622 stars 21.97 score 164k scripts 8.2k dependents
mitchelloharawild
vitae:Curriculum Vitae for R Markdown
Provides templates and functions to simplify the production and maintenance of curriculum vitae.
Maintained by Mitchell OHara-Wild. Last updated 9 months ago.
3.0 match 1.2k stars 10.78 score 556 scripts
bodkan
slendr:A Simulation Framework for Spatiotemporal Population Genetics
A framework for simulating spatially explicit genomic data which leverages real cartographic information for programmatic and visual encoding of spatiotemporal population dynamics on real geographic landscapes. Population genetic models are then automatically executed by the 'SLiM' software by Haller et al. (2019) <doi:10.1093/molbev/msy228> behind the scenes, using a custom built-in simulation 'SLiM' script. Additionally, fully abstract spatial models not tied to a specific geographic location are supported, and users can also simulate data from standard, non-spatial, random-mating models. These can be simulated either with the 'SLiM' built-in back-end script, or using an efficient coalescent population genetics simulator 'msprime' by Baumdicker et al. (2022) <doi:10.1093/genetics/iyab229> with a custom-built 'Python' script bundled with the R package. Simulated genomic data is saved in a tree-sequence format and can be loaded, manipulated, and summarised using tree-sequence functionality via an R interface to the 'Python' module 'tskit' by Kelleher et al. (2019) <doi:10.1038/s41588-019-0483-y>. Complete model configuration, simulation and analysis pipelines can be therefore constructed without a need to leave the R environment, eliminating friction between disparate tools for population genetic simulations and data analysis.
Maintained by Martin Petr. Last updated 12 days ago.
popgenpopulation-geneticssimulationsspatial-statistics
3.5 match 56 stars 9.15 score 88 scripts
jmsigner
amt:Animal Movement Tools
Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges, track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulation of space-use from fitted step-selection functions.
Maintained by Johannes Signer. Last updated 4 months ago.
3.0 match 41 stars 10.54 score 418 scripts
cran
ClusteredMutations:Location and Visualization of Clustered Somatic Mutations
Identification and visualization of groups of closely spaced mutations in the DNA sequence of a cancer genome. The extremely mutated zones are searched in the symmetric dissimilarity matrix using the anti-Robinson matrix properties. Different data sets are obtained to describe and plot the clustered mutations information.
Maintained by David Lora. Last updated 9 years ago.
15.8 match 2.00 score
hanjunwei-lab
PMAPscore:Identify Prognosis-Related Pathways Altered by Somatic Mutation
We define a pathway mutation accumulate perturbation score (PMAPscore) to reflect the position and the cumulative effect of the genetic mutations at the pathway level. Based on the PMAPscore of pathways, the package identifies prognosis-related pathways altered by somatic mutation and predicts immunotherapy efficacy by constructing a multiple-pathway-based risk model (Tarca, Adi Laurentiu et al (2008) <doi:10.1093/bioinformatics/btn577>).
Maintained by Junwei Han. Last updated 3 years ago.
8.5 match 3.70 score 2 scripts
bcgov
bcdata:Search and Retrieve Data from the BC Data Catalogue
Search, query, and download tabular and 'geospatial' data from the British Columbia Data Catalogue (<https://catalogue.data.gov.bc.ca/>). Search catalogue data records based on keywords, data licence, sector, data format, and B.C. government organization. View metadata directly in R, download many data formats, and query 'geospatial' data available via the B.C. government Web Feature Service ('WFS') using 'dplyr' syntax.
Maintained by Andy Teucher. Last updated 1 month ago.
3.0 match 83 stars 10.29 score 186 scripts 4 dependents
pik-piam
quitte:Bits and pieces of code to use with quitte-style data frames
A collection of functions for easily dealing with quitte-style data frames, doing multi-model comparisons and plots.
Maintained by Michaja Pehl. Last updated 2 days ago.
3.8 match 8.22 score 184 scripts 35 dependents
cynkra
munch:Functions for working with the historicized list of communes of Switzerland
Contains historicized municipality data for Switzerland from 1960 onwards, from the "Historisiertes Gemeindeverzeichnis" of the Swiss Federal Statistical Office.
Maintained by Kirill Müller. Last updated 3 months ago.
5.6 match 6 stars 5.43 score 2 scripts
hojsgaard
gRbase:A Package for Graphical Modelling in R
The 'gRbase' package provides graphical modelling features used by e.g. the packages 'gRain', 'gRim' and 'gRc'. 'gRbase' implements graph algorithms including (i) maximum cardinality search (for marked and unmarked graphs), (ii) moralization, (iii) triangulation, and (iv) creation of a junction tree. 'gRbase' facilitates array operations and implements functions for testing for conditional independence. 'gRbase' illustrates how hierarchical log-linear models may be implemented and describes the concept of graphical meta data. The facilities of the package are documented in the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>) and in the paper by Dethlefsen and Højsgaard (2005, <doi:10.18637/jss.v014.i17>). Please see 'citation("gRbase")' for citation details.
Maintained by Søren Højsgaard. Last updated 4 months ago.
3.3 match 3 stars 9.24 score 241 scripts 20 dependents
karissawhiting
cbioportalR:Browse and Query Clinical and Genomic Data from cBioPortal
Provides R users with direct access to genomic and clinical data from the 'cBioPortal' web resource via user-friendly functions that wrap 'cBioPortal's' existing API endpoints <https://www.cbioportal.org/api/swagger-ui/index.html>. Users can browse and query genomic data on mutations, copy number alterations and fusions, as well as data on tumor mutational burden ('TMB'), microsatellite instability status ('MSI'), 'FACETS' and select clinical data points (depending on the study). See <https://www.cbioportal.org/> and Gao et al., (2013) <doi:10.1126/scisignal.2004088> for more information on the cBioPortal web resource.
Maintained by Karissa Whiting. Last updated 4 months ago.
4.5 match 21 stars 6.70 score 20 scripts
bioc
plyinteractions:Extending tidy verbs to genomic interactions
Operate on `GInteractions` objects as tabular data using `dplyr`-like verbs. The functions and methods in `plyinteractions` provide a grammatical approach to manipulate `GInteractions`, to facilitate their integration in genomic analysis workflows.
Maintained by Jacques Serizay. Last updated 5 months ago.
6.4 match 4.75 score 14 scripts
jprybylski
xpose.xtras:Extra Functionality for the 'xpose' Package
Adding some at-present missing functionality, or functions unlikely to be added to the base 'xpose' package. This includes some diagnostic plots that have been missing in translation from 'xpose4', but also some useful features that truly extend the capabilities of what can be done with 'xpose'. These extensions include the concept of a set of 'xpose' objects, and diagnostics for likelihood-based models.
Maintained by John Prybylski. Last updated 4 months ago.
5.0 match 6.01 score 5 scripts
jonesor
Rcompadre:Utilities for using the 'COM(P)ADRE' Matrix Model Database
Utility functions for interacting with the 'COMPADRE' and 'COMADRE' databases of matrix population models. Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.
Maintained by Owen Jones. Last updated 5 months ago.
3.9 match 11 stars 7.74 score 55 scripts 2 dependents
r-lib
lintr:A 'Linter' for R Code
Checks adherence to a given style, syntax errors and possible semantic issues. Supports on the fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.
Maintained by Michael Chirico. Last updated 8 days ago.
1.8 match 1.2k stars 17.00 score 916 scripts 33 dependents
bioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
3.0 match 183 stars 9.70 score 126 scripts 1 dependent
shixiangwang
sigminer.prediction:Train and Predict Cancer Subtype with Keras Model based on Mutational Signatures
Mutational signatures represent mutational processes that occurred during cancer evolution and are therefore stable genetic resources for subtyping. This tool provides functions for training neural network models to predict the subtype a sample belongs to, based on the 'keras' and 'sigminer' packages.
Maintained by Shixiang Wang. Last updated 3 years ago.
kerasmutational-signaturesprostate-cancersigminer
11.1 match 8 stars 2.60 score 2 scripts
pierreroudier
spectacles:Storing, Manipulating and Analysis Spectroscopy and Associated Data
Stores and eases the manipulation of spectra and associated data, with dedicated classes for spatial and soil-related data.
Maintained by Pierre Roudier. Last updated 2 years ago.
4.6 match 11 stars 6.17 score 45 scripts 1 dependent
bioc
tidybulk:Brings transcriptomics to the tidyverse
This is a collection of utility functions for exploring and performing calculations on RNA sequencing data in a modular, pipe-friendly and tidy fashion.
Maintained by Stefano Mangiola. Last updated 5 months ago.
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsbioconductorbulk-transcriptional-analysesdeseq2differential-expressionedgerensembl-idsentrezgene-symbolsgseamds-dimensionspcapiperedundancytibbletidytidy-datatidyversetranscriptstsne
3.0 match 168 stars 9.48 score 172 scripts 1 dependent
fcampelo
MOEADr:Component-Wise MOEA/D Implementation
Modular implementation of Multiobjective Evolutionary Algorithms based on Decomposition (MOEA/D) [Zhang and Li (2007), <DOI:10.1109/TEVC.2007.892759>] for quick assembling and testing of new algorithmic components, as well as easy replication of published MOEA/D proposals. The full framework is documented in a paper published in the Journal of Statistical Software [<doi:10.18637/jss.v092.i06>].
Maintained by Felipe Campelo. Last updated 2 years ago.
moeadmultiobjective-optimization
4.5 match 20 stars 6.30 score 40 scripts
bioc
MesKit:A tool kit for dissecting cancer evolution from multi-region derived tumor biopsies via somatic alterations
MesKit provides commonly used analysis and visualization modules based on mutational data generated by multi-region sequencing (MRS). This package allows users to depict mutational profiles, measure heterogeneity within or between tumors from the same patient, track evolutionary dynamics, and characterize mutational patterns on different levels. A Shiny application is also provided for GUI-based analysis. As a handy tool, MesKit can facilitate the interpretation of tumor heterogeneity and the understanding of evolutionary relationships between regions in MRS studies.
Maintained by Mengni Liu. Last updated 5 months ago.
5.9 match 4.73 score 18 scripts 1 dependent
greymonroe
genemodel:Gene Model Plotting in R
Using simple input, this package creates plots of gene models. Users can create plots of alternatively spliced gene variants and the positions of mutations and other gene features.
Maintained by J Grey Monroe. Last updated 8 years ago.
6.4 match 4 stars 4.30 score 9 scripts
bupaverse
bupaR:Business Process Analysis in R
Comprehensive Business Process Analysis toolkit. Creates S3-class for event log objects, and related handler functions. Imports related packages for filtering event data, computation of descriptive statistics, handling of 'Petri Net' objects and visualization of process maps. See also packages 'edeaR','processmapR', 'eventdataR' and 'processmonitR'.
Maintained by Gert Janssenswillen. Last updated 2 years ago.
3.0 match 55 stars 9.07 score 389 scripts 11 dependents
bioc
GenVisR:Genomic Visualizations in R
Produce highly customizable publication quality graphics for genomic data primarily at the cohort level.
Maintained by Zachary Skidmore. Last updated 5 months ago.
infrastructuredatarepresentationclassificationdnaseq
2.8 match 215 stars 9.87 score 76 scripts
cmmr
rbiom:Read/Write, Analyze, and Visualize 'BIOM' Data
A toolkit for working with Biological Observation Matrix ('BIOM') files. Read/write all 'BIOM' formats. Compute rarefaction, alpha diversity, and beta diversity (including 'UniFrac'). Summarize counts by taxonomic level. Subset based on metadata. Generate visualizations and statistical analyses. CPU intensive operations are coded in C for speed.
Maintained by Daniel P. Smith. Last updated 6 days ago.
3.0 match 15 stars 9.02 score 117 scripts 6 dependents
ysosirius
windfarmGA:Genetic Algorithm for Wind Farm Layout Optimization
The genetic algorithm is designed to optimize wind farms of any shape. It requires a predefined amount of turbines, a unified rotor radius and an average wind speed value for each incoming wind direction. A terrain effect model can be included that downloads an 'SRTM' elevation model and loads a Corine Land Cover raster to approximate surface roughness.
Maintained by Sebastian Gatscha. Last updated 2 months ago.
windfarm-layoutoptimizationgenetic-algorithmrenewable-energycpp
5.3 match 27 stars 5.06 score 17 scripts
business-science
timetk:A Tool Kit for Working with Time Series
Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.
Maintained by Matt Dancho. Last updated 1 year ago.
coercioncoercion-functionsdata-miningdplyrforecastforecastingforecasting-modelsmachine-learningseries-decompositionseries-signaturetibbletidytidyquanttidyversetimetime-seriestimeseries
1.9 match 625 stars 14.15 score 4.0k scripts 16 dependents
adibender
pammtools:Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis
The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi: 10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated, competing risks and recurrent events data. pammtools provides a tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing, as well as visualization.
Maintained by Andreas Bender. Last updated 2 months ago.
additive-modelspammpammtoolspiece-wise-exponentialsurvival-analysis
3.0 match 48 stars 8.78 score 310 scripts 8 dependents
bioc
OncoSimulR:Forward Genetic Simulation of Cancer Progression with Epistasis
Functions for forward population genetic simulation in asexual populations, with special focus on cancer progression. Fitness can be an arbitrary function of genetic interactions between multiple genes or modules of genes, including epistasis, order restrictions in mutation accumulation, and order effects. Fitness (including just birth, just death, or both birth and death) can also be a function of the relative and absolute frequencies of other genotypes (i.e., frequency-dependent fitness). Mutation rates can differ between genes, and we can include mutator/antimutator genes (to model mutator phenotypes). Simulating multi-species scenarios and therapeutic interventions, including adaptive therapy, is also possible. Simulations use continuous-time models and can include driver and passenger genes and modules. Also included are functions for: simulating random DAGs of the type found in Oncogenetic Trees, Conjunctive Bayesian Networks, and other cancer progression models; plotting and sampling from single or multiple realizations of the simulations, including single-cell sampling; plotting the parent-child relationships of the clones; generating random fitness landscapes (Rough Mount Fuji, House of Cards, additive, NK, Ising, and Eggbox models) and plotting them.
Maintained by Ramon Diaz-Uriarte. Last updated 11 days ago.
biologicalquestionsomaticmutationcpp
4.3 match 7 stars 6.06 score 68 scripts
r-tidy-remote-sensing
tidyrgee:'tidyverse' Methods for 'Earth Engine'
Provides 'tidyverse' methods for wrangling and analyzing 'Earth Engine' <https://earthengine.google.com/> data. These methods help the user with filtering, joining and summarising 'Earth Engine' image collections.
Maintained by Zack Arno. Last updated 2 years ago.
4.7 match 48 stars 5.53 score 140 scripts
rozen-lab
mSigTools:Mutational Signature Analysis Tools
Utility functions for mutational signature analysis as described in Alexandrov, L. B. (2020) <doi:10.1038/s41586-020-1943-3>. This package provides two groups of functions. One is for dealing with mutational signature "exposures" (i.e. the counts of mutations in a sample that are due to each mutational signature). The other group of functions is for matching or comparing sets of mutational signatures. 'mSigTools' stands for mutational Signature analysis Tools.
Maintained by Steven Rozen. Last updated 2 years ago.
8.5 match 2 stars 3.00 score 9 scripts
sapfluxnet
sapfluxnetr:Working with 'Sapfluxnet' Project Data
Access, modify, aggregate and plot data from the 'Sapfluxnet' project (<http://sapfluxnet.creaf.cat>), the first global database of sap flow measurements.
Maintained by Victor Granda. Last updated 2 years ago.
3.9 match 25 stars 6.57 score 49 scripts
langendorfr
netcom:NETwork COMparison Inference
Infer system functioning with empirical NETwork COMparisons. These methods are part of a growing paradigm in network science that uses relative comparisons of networks to infer mechanistic classifications and predict systemic interventions. They have been developed and applied in Langendorf and Burgess (2021) <doi:10.1038/s41598-021-99251-7>, Langendorf (2020) <doi:10.1201/9781351190831-6>, and Langendorf and Goldberg (2019) <arXiv:1912.12551>.
Maintained by Ryan Langendorf. Last updated 8 months ago.
5.6 match 5 stars 4.46 score 115 scripts
matthewwolak
nadiv:(Non)Additive Genetic Relatedness Matrices
Constructs (non)additive genetic relationship matrices, and their inverses, from a pedigree to be used in linear mixed effect models (A.K.A. the 'animal model'). Also includes other functions to facilitate the use of animal models. Some functions have been created to be used in conjunction with the R package 'asreml' for the 'ASReml' software, which can be obtained upon purchase from 'VSN' international (<https://vsni.co.uk/software/asreml>).
Maintained by Matthew Wolak. Last updated 10 months ago.
3.5 match 20 stars 7.13 score 151 scripts 3 dependents
bioc
plyxp:Data masks for SummarizedExperiment enabling dplyr-like manipulation
The package provides `rlang` data masks for the SummarizedExperiment class. This enables the evaluation of unquoted expressions in different contexts of the SummarizedExperiment object with optional access to other contexts. The goal for `plyxp` is for evaluation to feel like a data.frame object without ever needing to unwind to a rectangular data.frame.
Maintained by Justin Landis. Last updated 5 months ago.
annotation genomeannotation transcriptomics
5.0 match 4 stars 4.81 score 6 scripts
thibautjombart
adegenet:Exploratory Analysis of Genetic and Genomic Data
Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure ('genind' class), alleles counts by populations ('genpop'), and genome-wide SNP data ('genlight'). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
Maintained by Zhian N. Kamvar. Last updated 1 month ago.
1.9 match 182 stars 12.60 score 1.9k scripts 29 dependents
yulab-smu
ggfun:Miscellaneous Functions for 'ggplot2'
Useful functions and utilities for 'ggplot' object (e.g., geometric layers, themes, and utilities to edit the object).
Maintained by Guangchuang Yu. Last updated 2 months ago.
2.3 match 18 stars 10.41 score 58 scripts 151 dependents
hopkinsidd
phylosamp:Sample Size Calculations for Molecular and Phylogenetic Studies
Implements novel tools for estimating sample sizes needed for phylogenetic studies, including studies focused on estimating the probability of true pathogen transmission between two cases given phylogenetic linkage and studies focused on tracking pathogen variants at a population level. Methods described in Wohl, Giles, and Lessler (2021) and in Wohl, Lee, DiPrete, and Lessler (2023).
Maintained by Justin Lessler. Last updated 2 years ago.
3.5 match 12 stars 6.65 score 25 scripts
somalogic
SomaDataIO:Input/Output 'SomaScan' Data
Load and export 'SomaScan' data via the 'Standard BioTools, Inc.' structured text file called an ADAT ('*.adat'). For file format see <https://github.com/SomaLogic/SomaLogic-Data/blob/main/README.md>. The package also exports auxiliary functions for manipulating, wrangling, and extracting relevant information from an ADAT object once in memory.
Maintained by Caleb Scheidel. Last updated 1 month ago.
adat proteomics proteomics-data-analysis somascan
3.0 match 26 stars 7.71 score 132 scripts
molgenis
dsTidyverseClient:'DataSHIELD' 'Tidyverse' Clientside Package
Implementation of selected 'Tidyverse' functions within 'DataSHIELD', an open-source federated analysis solution in R. Currently, 'DataSHIELD' contains very limited tools for data manipulation, so the aim of this package is to improve the researcher experience by implementing essential functions for data manipulation, including subsetting, filtering, grouping, and renaming variables. This is the clientside package which should be installed locally, and is used in conjunction with the serverside package 'dsTidyverse' which is installed on the remote server holding the data. For more information, see <https://www.tidyverse.org/>, <https://datashield.org/> and <https://github.com/molgenis/ds-tidyverse>.
Maintained by Tim Cadman. Last updated 17 days ago.
4.3 match 1 stars 5.43 score 2 scripts
reconverse
incidence2:Compute, Handle and Plot Incidence of Dated Events
Provides functions and classes to compute, handle and visualise incidence from dated events for a defined time interval. Dates can be provided in various standard formats. The class 'incidence2' is used to store computed incidence and can be easily manipulated, subsetted, and plotted.
Maintained by Tim Taylor. Last updated 5 days ago.
3.0 match 17 stars 7.67 score 104 scripts 1 dependents
asgr
imager:Image Processing Library Based on 'CImg'
Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.
Maintained by Aaron Robotham. Last updated 26 days ago.
1.7 match 17 stars 13.62 score 2.4k scripts 45 dependents
bioc
ELMER:Inferring Regulatory Element Landscapes and Transcription Factor Networks Using Cancer Methylomes
ELMER is designed to use DNA methylation and gene expression from a large number of samples to infer the regulatory element landscape and transcription factor network in primary tissue.
Maintained by Tiago Chedraoui Silva. Last updated 5 months ago.
dnamethylation geneexpression motifannotation software generegulation transcription network
3.1 match 7.42 score 176 scripts
bioc
safe:Significance Analysis of Function and Expression
SAFE is a resampling-based method for testing functional categories in gene expression experiments. SAFE can be applied to 2-sample and multi-class comparisons, or simple linear regressions. Other experimental designs can also be accommodated through user-defined functions.
Maintained by Ludwig Geistlinger. Last updated 5 months ago.
differentialexpression pathways genesetenrichment statisticalmethod software
4.0 match 5.60 score 32 scripts 5 dependents
bioc
VERSO:Viral Evolution ReconStructiOn (VERSO)
Mutations that rapidly accumulate in viral genomes during a pandemic can be used to track the evolution of the virus and, accordingly, unravel the viral infection network. To this end, sequencing samples of the virus can be employed to estimate models from genomic epidemiology and may serve, for instance, to estimate the proportion of undetected infected people by uncovering cryptic transmissions, as well as to predict likely trends in the number of infected, hospitalized, dead and recovered people. VERSO is an algorithmic framework that processes variant profiles from viral samples to produce phylogenetic models of viral evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a log-likelihood function. VERSO includes two separate and subsequent steps; in this package we provide an R implementation of VERSO STEP 1.
Maintained by Davide Maspero. Last updated 5 months ago.
biomedicalinformatics sequencing somaticmutation
3.7 match 7 stars 6.05 score
momx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 year ago.
3.0 match 51 stars 7.42 score 346 scripts
bioc
seq.hotSPOT:Targeted sequencing panel design based on mutation hotspots
seq.hotSPOT provides a resource for designing effective sequencing panels to help improve mutation capture efficacy for ultradeep sequencing projects. Using SNV datasets, this package designs custom panels for any tissue of interest and identifies the genomic regions likely to contain the most mutations. Establishing efficient targeted sequencing panels can allow researchers to study mutation burden in tissues at high depth without the economic burden of whole-exome or whole-genome sequencing. This tool was developed to make high-depth sequencing panels to study low-frequency clonal mutations in clinically normal and cancerous tissues.
Maintained by Sydney Grant. Last updated 5 months ago.
software technology sequencing dnaseq wholegenome
5.6 match 4.00 score 3 scripts
thomasp85
particles:A Graph Based Particle Simulator Based on D3-Force
Simulating particle movement in 2D space has many applications. The 'particles' package implements a particle simulator based on the ideas behind the 'd3-force' 'JavaScript' library. 'particles' implements all forces defined in 'd3-force' as well as others such as vector fields, traps, and attractors.
Maintained by Thomas Lin Pedersen. Last updated 3 months ago.
d3js graph-layout network network-visualization particles simulation cpp
3.0 match 119 stars 7.19 score 43 scripts
strohne
volker:High-Level Functions for Tabulating, Charting and Reporting Survey Data
Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.
Maintained by Jakob Jünger. Last updated 2 days ago.
3.0 match 5 stars 7.16 score 125 scripts
fawda123
rStrava:Access the 'Strava' API
Functions to access data from the 'Strava v3 API' <https://developers.strava.com/>.
Maintained by Marcus W. Beck. Last updated 5 months ago.
3.0 match 155 stars 7.15 score 57 scripts
statisfactions
simpr:Flexible 'Tidyverse'-Friendly Simulations
A general, 'tidyverse'-friendly framework for simulation studies, design analysis, and power analysis. Specify data generation, define varying parameters, generate data, fit models, and tidy model results in a single pipeline, without needing loops or custom functions.
Maintained by Ethan Brown. Last updated 8 months ago.
3.0 match 43 stars 6.89 score 30 scripts
ycroissant
dfidx:Indexed Data Frames
Provides extended data frames, with a special data frame column which contains two indexes, with potentially a nesting structure.
Maintained by Yves Croissant. Last updated 7 months ago.
3.0 match 2 stars 6.85 score 44 scripts 18 dependents
bioc
proActiv:Estimate Promoter Activity from RNA-Seq data
Most human genes have multiple promoters that control the expression of different isoforms. The use of these alternative promoters enables the regulation of isoform expression pre-transcriptionally. Alternative promoters have been found to be important in a wide number of cell types and diseases. proActiv is an R package that enables the analysis of promoters from RNA-seq data. proActiv uses aligned reads as input, and generates counts and normalized promoter activity estimates for each annotated promoter. In particular, proActiv accepts junction files from TopHat2 or STAR or BAM files as inputs. These estimates can then be used to identify which promoter is active, which promoter is inactive, and which promoters change their activity across conditions. proActiv also allows visualization of promoter activity across conditions.
Maintained by Joseph Lee. Last updated 5 months ago.
rnaseq geneexpression transcription alternativesplicing generegulation differentialsplicing functionalgenomics epigenetics transcriptomics preprocessing alternative-promoters genomics promoter-activity promoter-annotation rna-seq-data
3.0 match 51 stars 6.66 score 15 scripts
bioc
survtype:Subtype Identification with Survival Data
Subtypes are defined as groups of samples that have distinct molecular and clinical features. Genomic data can be analyzed together with clinical data, especially survival information, to discover patient subtypes. This package aims to identify subtypes that are both clinically relevant and biologically meaningful.
Maintained by Dongmin Jung. Last updated 5 months ago.
software statisticalmethod geneexpression survival clustering sequencing coverage
5.0 match 4.00 score 3 scripts
craddm
eegUtils:Utilities for Electroencephalographic (EEG) Analysis
Electroencephalography data processing and visualization tools. Includes import functions for 'BioSemi' (.BDF), 'Neuroscan' (.CNT), 'Brain Vision Analyzer' (.VHDR), 'EEGLAB' (.set) and 'Fieldtrip' (.mat). Many preprocessing functions such as referencing, epoching, filtering, and ICA are available. There are a variety of visualizations possible, including timecourse and topographical plotting.
Maintained by Matt Craddock. Last updated 5 months ago.
eeg eeg-analysis eeg-data eeg-signals eeg-signals-processing openblas cpp openmp
3.0 match 106 stars 6.54 score 82 scripts
buriom
denoiSeq:Differential Expression Analysis Using a Bottom-Up Model
Given count data from two conditions, it determines which transcripts are differentially expressed across the two conditions using Bayesian inference of the parameters of a bottom-up model for PCR amplification. This model is developed in Ndifon Wilfred, Hilah Gal, Eric Shifrut, Rina Aharoni, Nissan Yissachar, Nir Waysbort, Shlomit Reich Zeliger, Ruth Arnon, and Nir Friedman (2012), <http://www.pnas.org/content/109/39/15865.full>, and results in a distribution for the counts that is a superposition of the binomial and negative binomial distributions.
Maintained by Gershom Buri. Last updated 7 years ago.
5.3 match 3.70 score 10 scripts
liamrevell
learnPopGen:Population Genetic Simulations & Numerical Analysis
Conducts various numerical analyses and simulations in population genetics and evolutionary theory, primarily for the purpose of teaching (and learning about) key concepts in population & quantitative genetics, and evolutionary theory.
Maintained by Liam J. Revell. Last updated 2 years ago.
4.0 match 26 stars 4.82 score 51 scripts
stocnet
manynet:Many Ways to Make, Modify, Map, Mark, and Measure Myriad Networks
Many tools for making, modifying, mapping, marking, measuring, and motifs and memberships of many different types of networks. All functions operate with matrices, edge lists, and 'igraph', 'network', and 'tidygraph' objects, and on one-mode, two-mode (bipartite), and sometimes three-mode networks. The package includes functions for importing and exporting, creating and generating networks, modifying networks and node and tie attributes, and describing and visualizing networks with sensible defaults.
Maintained by James Hollway. Last updated 3 months ago.
diffusion-models graphs network-analysis
3.0 match 13 stars 6.41 score 35 scripts 1 dependents
bioc
DNABarcodes:A tool for creating and analysing DNA barcodes used in Next Generation Sequencing multiplexing experiments
The package offers a function to create DNA barcode sets capable of correcting insertion, deletion, and substitution errors. Existing barcodes can be analysed regarding their minimal, maximal and average distances between barcodes. Finally, reads that start with a (possibly mutated) barcode can be demultiplexed, i.e., assigned to their original reference barcode.
Maintained by Tilo Buschmann. Last updated 5 months ago.
preprocessing sequencing cpp openmp
4.3 match 4.51 score 27 scripts
qile0317
FastUtils:Fast, Readable Utility Functions
A wide variety of tools for general data analysis, wrangling, spelling, statistics, visualizations, package development, and more. All functions have vectorized implementations whenever possible. Exported names are designed to be readable, with longer names possessing short aliases.
Maintained by Qile Yang. Last updated 4 months ago.
scientific-computing utilities utility cpp
3.9 match 2 stars 4.95 score 2 scripts
mattheaphy
actxps:Create Actuarial Experience Studies: Prepare Data, Summarize Results, and Create Reports
Experience studies are used by actuaries to explore historical experience across blocks of business and to inform assumption setting activities. This package provides functions for preparing data, creating studies, visualizing results, and beginning assumption development. Experience study methods, including exposure calculations, are described in: Atkinson & McGarry (2016) "Experience Study Calculations" <https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf>. The limited fluctuation credibility method used by the 'exp_stats()' function is described in: Herzog (1999, ISBN:1-56698-374-6) "Introduction to Credibility Theory".
Maintained by Matt Heaphy. Last updated 2 months ago.
3.0 match 14 stars 6.38 score 23 scripts
henningte
ir:Functions to Handle and Preprocess Infrared Spectra
Functions to import and handle infrared spectra (import from '.csv' and Thermo Galactic's '.spc', baseline correction, binning, clipping, interpolating, smoothing, averaging, adding, subtracting, dividing, multiplying, plotting).
Maintained by Henning Teickner. Last updated 3 years ago.
chemometrics infrared infrared-spectra ir-package mid-infrared-spectra spectroscopy
3.6 match 6 stars 5.32 score 35 scripts
hope-data-science
tidyfst:Tidy Verbs for Fast Data Manipulation
A toolkit of tidy data manipulation verbs with 'data.table' as the backend. Combining the merits of syntax elegance from 'dplyr' and computing performance from 'data.table', 'tidyfst' intends to provide users with state-of-the-art data manipulation tools with least pain. This package is an extension of 'data.table'. While enjoying a tidy syntax, it also wraps combinations of efficient functions to facilitate frequently-used data operations.
Maintained by Tian-Yuan Huang. Last updated 6 months ago.
1.9 match 98 stars 10.09 score 118 scripts 4 dependents
reimand0
ActiveDriver:Finding Cancer Driver Proteins with Enriched Mutations in Post-Translational Modification Sites
A mutation analysis tool that discovers cancer driver genes with frequent mutations in protein signalling sites such as post-translational modifications (phosphorylation, ubiquitination, etc). The Poisson generalised linear regression model identifies genes where cancer mutations in signalling sites are more frequent than expected from the sequence of the entire gene. Integration of mutations with signalling information helps find new driver genes and propose candidate mechanisms to known drivers. Reference: Systematic analysis of somatic mutations in phosphorylation signaling predicts novel cancer drivers. Juri Reimand and Gary D Bader. Molecular Systems Biology (2013) 9:637 <doi:10.1038/msb.2012.68>.
Maintained by Juri Reimand. Last updated 8 years ago.
9.4 match 2.00 score 6 scripts
hope-data-science
tidyft:Fast and Memory Efficient Data Operations in Tidy Syntax
Tidy syntax for 'data.table', using modification by reference whenever possible. This toolkit is designed for big data analysis in high-performance desktop or laptop computers. The syntax of the package is similar or identical to 'tidyverse'. It is user friendly, memory efficient and time saving. For more information, check its ancestor package 'tidyfst'.
Maintained by Tian-Yuan Huang. Last updated 6 months ago.
3.0 match 35 stars 6.25 score 34 scripts
kbhoehn
dowser:B Cell Receptor Phylogenetics Toolkit
Provides a set of functions for inferring, visualizing, and analyzing B cell phylogenetic trees. Provides methods to 1) reconstruct unmutated ancestral sequences, 2) build B cell phylogenetic trees using multiple methods, 3) visualize trees with metadata at the tips, 4) reconstruct intermediate sequences, 5) detect biased ancestor-descendant relationships among metadata types. Workflow examples are available at the documentation site (see URL). Citations: Hoehn et al (2022) <doi:10.1371/journal.pcbi.1009885>, Hoehn et al (2021) <doi:10.1101/2021.01.06.425648>.
Maintained by Kenneth Hoehn. Last updated 2 months ago.
2.8 match 6.81 score 84 scripts
poissonconsulting
mcmcdata:Manipulate MCMC Samples and Data Frames
Manipulates Monte Carlo Markov Chain samples and associated data frames.
Maintained by Joe Thorley. Last updated 2 months ago.
5.3 match 1 stars 3.56 score 4 scripts 4 dependents
r-lib
slider:Sliding Window Functions
Provides type-stable rolling window functions over any R data type. Cumulative and expanding windows are also supported. For more advanced usage, an index can be used as a secondary vector that defines how sliding windows are to be created.
Maintained by Davis Vaughan. Last updated 1 month ago.
1.3 match 302 stars 13.92 score 848 scripts 99 dependents
mhahsler
rMSA:Interface for Popular Multiple Sequence Alignment Tools
Seamlessly interfaces the Multiple Sequence Alignment software packages ClustalW, MAFFT, MUSCLE and Kalign (downloaded separately) and provides support to calculate distances between sequences. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Maintained by Michael Hahsler. Last updated 10 months ago.
genetics sequencing infrastructure alignment bioinformatics sequence-alignment
4.9 match 12 stars 3.78 score 7 scripts
bioc
scoup:Simulate Codons with Darwinian Selection Modelled as an OU Process
An elaborate molecular evolutionary framework that facilitates straightforward simulation of codon genetic sequences subjected to different degrees and/or patterns of Darwinian selection. The model is built upon the fitness landscape paradigm of Sewall Wright, as popularised by the mutation-selection model of Halpern and Bruno. This enables realistic evolutionary processes of living organisms to be reproduced seamlessly. For example, an Ornstein-Uhlenbeck fitness update algorithm is incorporated herein. Consequently, otherwise complex biological processes, such as the effect of the interplay between genetic drift and fitness landscape fluctuations on the inference of diversifying selection, may now be investigated with minimal effort. Frequency-dependent and stochastic fitness landscape update techniques are available.
Maintained by Hassan Sadiq. Last updated 2 months ago.
alignment classification comparativegenomics dataimport genetics mathematicalbiology researchfield sequencing sequencematching software statisticalmethod workflowstep
4.0 match 4.60 score 8 scripts
bioc
TCGAutils:TCGA utility functions for data management
A suite of helper functions for checking and manipulating TCGA data including data obtained from the curatedTCGAData experiment package. These functions aim to simplify and make working with TCGA data more manageable. Exported functions include those that import data from flat files into Bioconductor objects, convert row annotations, and translate identifiers via the GDC API.
Maintained by Marcel Ramos. Last updated 3 months ago.
software workflowstep preprocessing dataimport bioconductor-package tcga u24ca289073 utilities
1.9 match 26 stars 9.68 score 210 scripts 10 dependents