Showing 154 of 154 results
talgalili
dendextend:Extending 'dendrogram' Functionality in R
Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.
Maintained by Tal Galili. Last updated 2 months ago.
191.7 match 154 stars 17.02 score 6.0k scripts 164 dependents
bioc
ComplexHeatmap:Make Complex Heatmaps
Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential patterns. Here the ComplexHeatmap package provides a highly flexible way to arrange multiple heatmaps and supports various annotation graphics.
Maintained by Zuguang Gu. Last updated 5 months ago.
software visualization sequencing clustering complex-heatmaps heatmap
35.4 match 1.3k stars 16.93 score 16k scripts 151 dependents
ropensci
phylogram:Dendrograms for Evolutionary Analysis
Contains functions for developing phylogenetic trees as deeply-nested lists ("dendrogram" objects). Enables bi-directional conversion between dendrogram and "phylo" objects (see Paradis et al (2004) <doi:10.1093/bioinformatics/btg412>), and features several tools for command-line tree manipulation and import/export via Newick parenthetic text.
Maintained by Shaun Wilkinson. Last updated 5 years ago.
34.5 match 11 stars 8.53 score 228 scripts 9 dependents
talgalili
heatmaply:Interactive Cluster Heat Maps Using 'plotly' and 'ggplot2'
Create interactive cluster 'heatmaps' that can be saved as a stand-alone HTML file, embedded in 'R Markdown' documents or in a 'Shiny' app, and available in the 'RStudio' viewer pane. Hover the mouse pointer over a cell to show details or drag a rectangle to zoom. A 'heatmap' is a popular graphical method for visualizing high-dimensional data, in which a table of numbers is encoded as a grid of colored cells. The rows and columns of the matrix are ordered to highlight patterns and are often accompanied by 'dendrograms'. 'Heatmaps' are used in many fields for visualizing observations, correlations, missing-value patterns, and more. Interactive 'heatmaps' allow the inspection of a specific value by hovering the mouse over a cell, as well as zooming into a region of the 'heatmap' by dragging a rectangle around the relevant area. This work is based on the 'ggplot2' and 'plotly.js' engines. It produces 'heatmaps' similar to those of 'heatmap.2', with the advantages of speed ('plotly.js' can handle larger matrices), the ability to zoom from the 'dendrogram' panes, and the placing of factor variables at the sides of the 'heatmap'.
Maintained by Tal Galili. Last updated 8 months ago.
d3-heatmap dendextend dendrogram ggplot2 heatmap plotly
15.5 match 386 stars 14.21 score 2.0k scripts 45 dependents
andrie
ggdendro:Create Dendrograms and Tree Diagrams Using 'ggplot2'
This is a set of tools for dendrograms and tree plots using 'ggplot2'. The 'ggplot2' philosophy is to clearly separate data from the presentation. Unfortunately the plot method for dendrograms plots directly to a plot device without exposing the data. The 'ggdendro' package resolves this by making available functions that extract the dendrogram plot data. The package provides implementations for 'tree', 'rpart', as well as diana and agnes (from 'cluster') diagrams.
Maintained by Andrie de Vries. Last updated 3 months ago.
11.7 match 86 stars 13.54 score 3.9k scripts 62 dependents
yunuuuu
ggalign:A 'ggplot2' Extension for Consistent Axis Alignment
A 'ggplot2' extension that offers various tools for the creation of complex, multi-plot visualizations. Built on the familiar grammar of graphics, it provides intuitive tools to align and organize plots, making it ideal for complex visualizations. It excels in multi-omics research, such as genomics and microbiomes, by simplifying the visualization of intricate relationships between datasets, for example, linking genes to pathways. Whether you need to stack plots, arrange them around a central figure, or create a circular layout, 'ggalign' delivers flexibility and accuracy with minimal effort.
Maintained by Yun Peng. Last updated 4 hours ago.
complex-heatmaps dendrogram dendrogram-heatmap ggplot ggplot-extension ggplot2 heatmap heatmap-visualization heatmaps marginal-plots oncoplot oncoprint tanglegram upset upsetplot
20.5 match 267 stars 7.08 score 27 scripts
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 2 days ago.
complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp
6.1 match 581 stars 21.10 score 31k scripts 1.9k dependents
bioc
BatchQC:Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Maintained by Jessica McClintock. Last updated 5 months ago.
batcheffect graphandnetwork microarray normalization principalcomponent sequencing software visualization qualitycontrol rnaseq preprocessing differentialexpression immunooncology
14.1 match 7 stars 8.99 score 54 scripts
plangfelder
WGCNA:Weighted Correlation Network Analysis
Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.
Maintained by Peter Langfelder. Last updated 6 months ago.
12.5 match 54 stars 9.65 score 5.3k scripts 32 dependents
bioc
clusterExperiment:Compare Clusterings for Single-Cell Sequencing
Provides functionality for running and comparing many different clusterings of single-cell sequencing data or other large mRNA Expression data sets.
Maintained by Elizabeth Purdom. Last updated 5 months ago.
clustering rnaseq sequencing software singlecell cpp
12.0 match 39 stars 9.63 score 192 scripts 1 dependent
gluc
data.tree:General Purpose Hierarchical Data Structure
Create tree structures from hierarchical data, and traverse the tree in various orders. Aggregate, cumulate, print, plot, convert to and from data.frame and more. Useful for decision trees, machine learning, finance, conversion from and to JSON, and many other applications.
Maintained by Christoph Glur. Last updated 5 months ago.
7.5 match 209 stars 12.84 score 1.1k scripts 88 dependents
plangfelder
dynamicTreeCut:Methods for Detection of Clusters in Hierarchical Clustering Dendrograms
Contains methods for detection of clusters in hierarchical clustering dendrograms.
Maintained by Peter Langfelder. Last updated 9 years ago.
11.9 match 4 stars 7.52 score 492 scripts 59 dependents
mhahsler
dbscan:Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms
A fast reimplementation of several density-based algorithms of the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise) and HDBSCAN (hierarchical DBSCAN), the ordering algorithm OPTICS (ordering points to identify the clustering structure), shared nearest neighbor clustering, and the outlier detection algorithms LOF (local outlier factor) and GLOSH (global-local outlier score from hierarchies). The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided. Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>.
Maintained by Michael Hahsler. Last updated 2 months ago.
clustering dbscan density-based-clustering hdbscan lof optics cpp
5.3 match 321 stars 15.62 score 1.6k scripts 84 dependents
ncss-tech
sharpshootR:A Soil Survey Toolkit
A collection of data processing, visualization, and export functions to support soil survey operations. Many of the functions build on the `SoilProfileCollection` S4 class provided by the aqp package, extending baseline visualization to more elaborate depictions in the context of spatial and taxonomic data. While this package is primarily developed by and for the USDA-NRCS, in support of the National Cooperative Soil Survey, the authors strive for generalization sufficient to support any soil survey operation. Many of the included functions are used by the SoilWeb suite of websites and mobile applications. These functions are provided here, with additional documentation, to enable others to replicate high quality versions of these figures for their own purposes.
Maintained by Dylan Beaudette. Last updated 12 days ago.
8.4 match 18 stars 8.37 score 327 scripts
jokergoo
circlize:Circular Visualization
Circular layout is an efficient way for the visualization of huge amounts of information. Here this package provides an implementation of circular layout generation in R as well as an enhancement of available software. The flexibility of the package is based on the usage of low-level graphics functions such that self-defined high-level graphics can be easily implemented by users for specific purposes. Together with the seamless connection between the powerful computational and visual environment in R, it gives users more convenience and freedom to design figures for better understanding complex patterns behind multiple dimensional data. The package is described in Gu et al. 2014 <doi:10.1093/bioinformatics/btu393>.
Maintained by Zuguang Gu. Last updated 1 year ago.
4.3 match 983 stars 15.62 score 10k scripts 213 dependents
jefferis
dendroextras:Extra Functions to Cut, Label and Colour Dendrogram Clusters
Provides extra functions to manipulate dendrograms that build on the base functions provided by the 'stats' package. The main functionality it is designed to add is the ability to colour all the edges in an object of class 'dendrogram' according to cluster membership i.e. each subtree is coloured, not just the terminal leaves. In addition it provides some utility functions to cut 'dendrogram' and 'hclust' objects and to set/get labels.
Maintained by Gregory Jefferis. Last updated 7 years ago.
12.7 match 4.61 score 90 scripts 3 dependents
evanbiederstedt
dendsort:Modular Leaf Ordering Methods for Dendrogram Nodes
An implementation of functions to optimize ordering of nodes in a dendrogram, without affecting the meaning of the dendrogram. A dendrogram can be sorted based on the average distance of subtrees, or based on the smallest distance value. These sorting methods improve readability and interpretability of tree structure, especially for tasks such as comparison of different distance measures or linkage types and identification of tight clusters and outliers. As a result, it also introduces more meaningful reordering for a coupled heatmap visualization. This method is described in "dendsort: modular leaf ordering methods for dendrogram representations in R", F1000Research 2014, 3: 177 <doi:10.12688/f1000research.4784.1>.
Maintained by Evan Biederstedt. Last updated 4 years ago.
7.5 match 4 stars 7.01 score 472 scripts 3 dependents
bioc
ReducedExperiment:Containers and tools for dimensionally-reduced -omics representations
Provides SummarizedExperiment-like containers for storing and manipulating dimensionally-reduced assay data. The ReducedExperiment classes allow users to simultaneously manipulate their original dataset and their decomposed data, in addition to other method-specific outputs like feature loadings. Implements utilities and specialised classes for the application of stabilised independent component analysis (sICA) and weighted gene correlation network analysis (WGCNA).
Maintained by Jack Gisby. Last updated 2 months ago.
geneexpression infrastructure datarepresentation software dimensionreduction network bioconductor-package bioinformatics dimensionality-reduction
10.1 match 3 stars 5.18 score 8 scripts
bioc
DECIPHER:Tools for curating, analyzing, and manipulating biological sequences
A toolset for deciphering and managing biological sequences.
Maintained by Erik Wright. Last updated 5 days ago.
clustering genetics sequencing dataimport visualization microarray qualitycontrol qpcr alignment wholegenome microbiome immunooncology geneprediction openmp
5.5 match 8.40 score 1.1k scripts 14 dependents
paleolimbot
tidypaleo:Tidy Tools for Paleoenvironmental Archives
Provides a set of functions with a common framework for age-depth model management, stratigraphic visualization, and common statistical transformations. The focus of the package is stratigraphic visualization, for which 'ggplot2' components are provided to reproduce the scales, geometries, facets, and theme elements commonly used in publication-quality stratigraphic diagrams. Helpers are also provided to reproduce the exploratory statistical summaries that are frequently included on stratigraphic diagrams. See Dunnington et al. (2021) <doi:10.18637/jss.v101.i07>.
Maintained by Dewey Dunnington. Last updated 2 years ago.
6.8 match 34 stars 6.59 score 38 scripts
md-anderson-bioinformatics
NGCHM:Next Generation Clustered Heat Maps
Next-Generation Clustered Heat Maps (NG-CHMs) allow for dynamic exploration of heat map data in a web browser. 'NGCHM' allows users to create both stand-alone HTML files containing a Next-Generation Clustered Heat Map, and .ngchm files to view in the NG-CHM viewer. See Ryan MC, Stucky M, et al (2020) <doi:10.12688/f1000research.20590.2> for more details.
Maintained by Mary A Rohrdanz. Last updated 8 days ago.
7.5 match 9 stars 5.48 score 28 scripts
bioc
SynExtend:Tools for Working With Synteny Objects
Shared order between genomic sequences provides a great deal of information. Synteny objects produced by the R package DECIPHER provide quantitative information about that shared order. SynExtend provides tools for extracting information from Synteny objects.
Maintained by Nicholas Cooley. Last updated 2 days ago.
genetics clustering comparativegenomics dataimport fortran openmp
6.1 match 1 star 6.42 score 77 scripts
plotly
plotly:Create Interactive Web Graphics via 'plotly.js'
Create interactive web graphics from 'ggplot2' graphs and/or a custom interface to the (MIT-licensed) JavaScript library 'plotly.js' inspired by the grammar of graphics.
Maintained by Carson Sievert. Last updated 3 months ago.
d3js data-visualization ggplot2 javascript plotly shiny webgl
2.0 match 2.6k stars 19.43 score 93k scripts 797 dependents
uclahs-cds
BoutrosLab.plotting.general:Functions to Create Publication-Quality Plots
Contains several plotting functions such as barplots, scatterplots, heatmaps, as well as functions to combine plots and assist in the creation of these plots. These functions will give users great ease of use and customization options in broad use for biomedical applications, as well as general purpose plotting. Each of the functions also provides valid default settings to make plotting data more efficient and producing high quality plots with standard colour schemes simpler. All functions within this package are capable of producing plots that are of the quality to be presented in scientific publications and journals. P'ng et al.; BPG: Seamless, automated and interactive visualization of scientific data; BMC Bioinformatics 2019 <doi:10.1186/s12859-019-2610-2>.
Maintained by Paul Boutros. Last updated 5 months ago.
4.5 match 12 stars 8.36 score 414 scripts 6 dependents
hardin47
biwt:Functions to Compute the Biweight Mean Vector and Covariance and Correlation Matrices
The base functions compute multivariate location, scale, and correlation estimates based on Tukey's biweight M-estimator. Using the base function, the computations can be applied to a large number of observations to create either a matrix of biweight distances or biweight correlations.
Maintained by Johanna Hardin. Last updated 6 months ago.
6.5 match 5.58 score 16 scripts 2 dependents
evanbiederstedt
gapmap:Drawing Gapped Cluster Heatmaps with 'ggplot2'
The gap encodes the distance between clusters and improves interpretation of cluster heatmaps. The gaps can be of the same distance based on a height threshold to cut the dendrogram. Another option is to vary the size of gaps based on the distance between clusters.
Maintained by Evan Biederstedt. Last updated 1 year ago.
7.8 match 2 stars 4.62 score 21 scripts
tpook92
MoBPS:Modular Breeding Program Simulator
Framework for the simulation of complex breeding programs and the comparison of their economic and genetic impact. The package is also used as the background simulator for our web-based interface <http://www.mobps.de>. Associated publication: Pook et al. (2020) <doi:10.1534/g3.120.401193>.
Maintained by Torsten Pook. Last updated 3 years ago.
15.1 match 2.35 score 45 scripts
e-sensing
sits:Satellite Image Time Series Analysis for Earth Observation Data Cubes
An end-to-end toolkit for land use and land cover classification using big Earth observation data, based on machine learning methods applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>. Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, Brazil Data Cube, Copernicus Data Space Environment (CDSE), Digital Earth Africa, Digital Earth Australia, NASA HLS using the Spatio-temporal Asset Catalog (STAC) protocol (<https://stacspec.org/>) and the 'gdalcubes' R package developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>. Supports visualization methods for images and time series and smoothing filters for dealing with noisy time series. Includes functions for quality assessment of training samples using self-organized maps as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. Includes methods to reduce training samples imbalance proposed by Chawla et al (2002) <doi:10.1613/jair.953>. Provides machine learning methods including support vector machines, random forests, extreme gradient boosting, multi-layer perceptrons, temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, and temporal attention encoders by Garnot and Landrieu (2020) <doi:10.48550/arXiv.2007.00586>. Supports GPU processing of deep learning models using torch <https://torch.mlverse.org/>. Performs efficient classification of big Earth observation data cubes and includes functions for post-classification smoothing based on Bayesian inference as described by Camara et al (2024) <doi:10.3390/rs16234572>, and methods for active learning and uncertainty assessment. Supports region-based time series analysis using package supercells <https://jakubnowosad.com/supercells/>. Enables best practices for estimating area and assessing accuracy of land change as recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>. 
Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.
Maintained by Gilberto Camara. Last updated 1 month ago.
big-earth-data cbers earth-observation eo-datacubes geospatial image-time-series land-cover-classification landsat planetary-computer r-spatial remote-sensing rspatial satellite-image-time-series satellite-imagery sentinel-2 stac-api stac-catalog cpp
3.7 match 494 stars 9.50 score 384 scripts
teunbrand
legendry:Extended Legends and Axes for 'ggplot2'
A 'ggplot2' extension that focusses on expanding the plotter's arsenal of guides. Guides in 'ggplot2' include axes and legends. 'legendry' offers new axes and annotation options, as well as new legends and colour displays.
Maintained by Teun van den Brand. Last updated 11 days ago.
axis axis-customization ggplot-extension ggplot2 legend visualization
4.5 match 227 stars 7.83 score 29 scripts 2 dependents
tsieger
idendr0:Interactive Dendrograms
Interactive dendrogram that enables the user to select and color clusters, to zoom and pan the dendrogram, and to visualize the clustered data not only in a built-in heat map, but also in 'GGobi' interactive plots and user-supplied plots. This is a backport of Qt-based 'idendro' (<https://github.com/tsieger/idendro>) to base R graphics and Tcl/Tk GUI.
Maintained by Tomas Sieger. Last updated 4 years ago.
9.0 match 7 stars 3.89 score 22 scripts
bioc
Heatplus:Heatmaps with row and/or column covariates and colored clusters
Display a rectangular heatmap (intensity plot) of a data matrix. By default, both samples (columns) and features (rows) of the matrix are sorted according to a hierarchical clustering, and the corresponding dendrogram is plotted. Optionally, panels with additional information about samples and features can be added to the plot.
Maintained by Alexander Ploner. Last updated 5 months ago.
4.6 match 7.63 score 94 scripts 5 dependents
jacobbien
protoclust:Hierarchical Clustering with Prototypes
Performs minimax linkage hierarchical clustering. Every cluster has an associated prototype element that represents that cluster as described in Bien, J., and Tibshirani, R. (2011), "Hierarchical Clustering with Prototypes via Minimax Linkage," The Journal of the American Statistical Association, 106(495), 1075-1084.
Maintained by Jacob Bien. Last updated 3 years ago.
5.8 match 7 stars 5.64 score 50 scripts 4 dependents
bioc
TreeAndLeaf:Displaying binary trees with focus on dendrogram leaves
The TreeAndLeaf package combines unrooted and force-directed graph algorithms in order to lay out binary trees, aiming to represent multiple layers of information on dendrogram leaves.
Maintained by Milena A. Cardoso. Last updated 5 months ago.
infrastructure graphandnetwork software network visualization datarepresentation
7.6 match 4.20 score 16 scripts
mhahsler
seriation:Infrastructure for Ordering Objects Using Seriation
Infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al (2008) <doi:10.18637/jss.v025.i03>.
Maintained by Michael Hahsler. Last updated 3 months ago.
combinatorial-optimization ordination seriation fortran
2.3 match 77 stars 14.07 score 640 scripts 79 dependents
thomasp85
ggraph:An Implementation of Grammar of Graphics for Graphs and Networks
The grammar of graphics as implemented in ggplot2 is a poor fit for graph and network visualizations due to its reliance on tabular data input. ggraph is an extension of the ggplot2 API tailored to graph visualizations and provides the same flexible approach to building up plots layer by layer.
Maintained by Thomas Lin Pedersen. Last updated 1 year ago.
ggplot-extension ggplot2 graph-visualization network-visualization visualization cpp
1.9 match 1.1k stars 16.96 score 9.2k scripts 111 dependents
bioc
ggtreeDendro:Drawing 'dendrogram' using 'ggtree'
Offers a set of 'autoplot' methods to visualize tree-like structures (e.g., hierarchical clustering and classification/regression trees) using 'ggtree'. You can adjust graphical parameters using grammar of graphics syntax and integrate external data into the tree.
Maintained by Guangchuang Yu. Last updated 5 months ago.
clustering classification decisiontree phylogenetics visualization
7.6 match 4.18 score 10 scripts
bethatkinson
rpart:Recursive Partitioning and Regression Trees
Recursive partitioning for classification, regression and survival trees. An implementation of most of the functionality of the 1984 book by Breiman, Friedman, Olshen and Stone.
Maintained by Beth Atkinson. Last updated 8 months ago.
1.9 match 52 stars 16.59 score 18k scripts 1.6k dependents
christophergandrud
networkD3:D3 JavaScript Network Graphs from R
Creates 'D3' 'JavaScript' network, tree, dendrogram, and Sankey graphs from 'R'.
Maintained by Christopher Gandrud. Last updated 6 years ago.
2.3 match 654 stars 13.55 score 3.4k scripts 31 dependents
uiowa-applied-topology
mappeR:Construct and Visualize TDA Mapper Graphs
Topological data analysis (TDA) is a method of data analysis that uses techniques from topology to analyze high-dimensional data. Here we implement Mapper, an algorithm from this area developed by Singh, Mémoli and Carlsson (2007) which generalizes the concept of a Reeb graph <https://en.wikipedia.org/wiki/Reeb_graph>.
Maintained by George Clare Kennedy. Last updated 24 days ago.
7.4 match 2 stars 4.05 score 14 scripts
christophergandrud
d3Network:The Old Package for Creating D3 JavaScript Network, Tree, Dendrogram, and Sankey Graphs
!!! NOTE: Active development has moved to the networkD3 package. !!!
Maintained by Christopher Gandrud. Last updated 10 years ago.
4.5 match 172 stars 6.63 score 82 scripts
zwdzwd
wheatmap:Incrementally Build Complex Plots using Natural Semantics
Builds complex plots, heatmaps in particular, using natural semantics. Bigger plots can be assembled using directives such as 'LeftOf', 'RightOf', 'TopOf', and 'Beneath' and more. Other features include clustering, dendrograms and integration with 'ggplot2' generated grid objects. This package is particularly designed for bioinformaticians to assemble complex plots for publication.
Maintained by Wanding Zhou. Last updated 3 years ago.
4.6 match 10 stars 6.35 score 50 scripts 3 dependents
ubod
apcluster:Affinity Propagation Clustering
Implements Affinity Propagation clustering introduced by Frey and Dueck (2007) <DOI:10.1126/science.1136800>. The algorithms are largely analogous to the 'Matlab' code published by Frey and Dueck. The package further provides leveraged affinity propagation and an algorithm for exemplar-based agglomerative clustering that can also be used to join clusters obtained from affinity propagation. Various plotting functions are available for analyzing clustering results.
Maintained by Ulrich Bodenhofer. Last updated 11 months ago.
3.0 match 10 stars 9.82 score 270 scripts 25 dependents
bioc
SNPRelate:Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data
Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package SNPRelate to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers the efficient operations specifically designed for integers with two bits, since a SNP could occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphism (indel) and structural variation calls in whole-genome and whole-exome variant data.
Maintained by Xiuwen Zheng. Last updated 5 months ago.
infrastructure genetics statisticalmethod principalcomponent bioinformatics gds-format pca simd snp openblas cpp
2.3 match 104 stars 12.69 score 1.6k scripts 18 dependents
kassambara
factoextra:Extract and Visualize the Results of Multivariate Data Analyses
Provides some easy-to-use functions to extract and visualize the output of multivariate data analyses, including 'PCA' (Principal Component Analysis), 'CA' (Correspondence Analysis), 'MCA' (Multiple Correspondence Analysis), 'FAMD' (Factor Analysis of Mixed Data), 'MFA' (Multiple Factor Analysis) and 'HMFA' (Hierarchical Multiple Factor Analysis) functions from different R packages. It also contains functions for simplifying some clustering analysis steps and provides elegant 'ggplot2'-based data visualization.
Maintained by Alboukadel Kassambara. Last updated 5 years ago.
2.0 match 363 stars 14.13 score 15k scripts 52 dependents
cbhurley
DendSer:Dendrogram Seriation: Ordering for Visualisation
Re-arranges a dendrogram to optimize visualisation-based cost functions.
Maintained by Catherine Hurley. Last updated 3 years ago.
7.5 match 3.74 score 27 scripts 5 dependents
alextkalinka
linkcomm:Tools for Generating, Visualizing, and Analysing Link Communities in Networks
Link communities reveal the nested and overlapping structure in networks, and uncover the key nodes that form connections to multiple communities. linkcomm provides a set of tools for generating, visualizing, and analysing link communities in networks of arbitrary size and type. The linkcomm package also includes tools for generating, visualizing, and analysing Overlapping Cluster Generator (OCG) communities. Kalinka and Tomancak (2011) <doi:10.1093/bioinformatics/btr311>.
Maintained by Alex T. Kalinka. Last updated 4 years ago.
clustering networks networks-biology visualization cpp
3.6 match 7 stars 7.53 score 115 scripts 4 dependents
kharchenkolab
leidenAlg:Implements the Leiden Algorithm via an R Interface
An R interface to the Leiden algorithm, an iterative community detection algorithm on networks. The algorithm is designed to converge to a partition in which all subsets of all communities are locally optimally assigned, yielding communities guaranteed to be connected. The implementation proves to be fast, scales well, and can be run on graphs of millions of nodes (as long as they can fit in memory). The original implementation was constructed as a python interface "leidenalg" found here: <https://github.com/vtraag/leidenalg>. The algorithm was originally described in Traag, V.A., Waltman, L. & van Eck, N.J. "From Louvain to Leiden: guaranteeing well-connected communities". Sci Rep 9, 5233 (2019) <doi:10.1038/s41598-019-41695-z>.
Maintained by Evan Biederstedt. Last updated 5 months ago.
4.1 match 9 stars 5.87 score 28 scripts 5 dependents
bioc
HGC:A fast hierarchical graph-based clustering method
HGC (short for Hierarchical Graph-based Clustering) is an R package for conducting hierarchical clustering on large-scale single-cell RNA-seq (scRNA-seq) data. The key idea is to construct a dendrogram of cells on their shared nearest neighbor (SNN) graph. HGC provides functions for building graphs and for conducting hierarchical clustering on the graph. Users with an old R version can visit https://github.com/XuegongLab/HGC/tree/HGC4oldRVersion to get the HGC package built for R 3.6.
Maintained by XGlab. Last updated 5 months ago.
singlecell software clustering rnaseq graphandnetwork dnaseq cpp
5.0 match 4.70 score 25 scripts
bioc
netboost:Network Analysis Supported by Boosting
Boosting supported network analysis for high-dimensional omics applications. This package comes bundled with the MC-UPGMA clustering package by Yaniv Loewenstein.
Maintained by Pascal Schlosser. Last updated 5 months ago.
softwarestatisticalmethodgraphandnetworknetworkclusteringdimensionreductionbiomedicalinformaticsepigeneticsmetabolomicstranscriptomicscpp
5.4 match 4.18 score 1 scripts
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 28 days ago.
digital-soil-mappingncss-technrcspedologypedometricssoilsoil-surveyusda
1.9 match 55 stars 11.77 score 1.2k scripts 2 dependents
massimoaria
bibliometrix:Comprehensive Science Mapping Analysis
Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.
Maintained by Massimo Aria. Last updated 7 days ago.
bibliometric-analysisbibliometricscitationcitation-networkcitationsco-authorsco-occurenceco-word-analysiscorrespondence-analysiscouplingisi-webjournalmanuscriptquantitative-analysisscholarssciencescience-mappingscientificscientometricsscopus
1.8 match 545 stars 12.54 score 518 scripts 2 dependents
luca-scr
mclust:Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation
Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.
Maintained by Luca Scrucca. Last updated 11 months ago.
1.8 match 21 stars 12.23 score 6.6k scripts 587 dependents
bioc
celda:CEllular Latent Dirichlet Allocation
Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.
Maintained by Joshua Campbell. Last updated 27 days ago.
singlecellgeneexpressionclusteringsequencingbayesianimmunooncologydataimportcppopenmp
1.9 match 147 stars 10.47 score 256 scripts 2 dependents
bioc
CluMSID:Clustering of MS2 Spectra for Metabolite Identification
CluMSID is a tool that aids the identification of features in untargeted LC-MS/MS analysis by the use of MS2 spectra similarity and unsupervised statistical methods. It offers functions for a complete and customisable workflow from raw data to visualisations and is interfaceable with the xcms family of preprocessing packages.
Maintained by Tobias Depke. Last updated 5 months ago.
metabolomicspreprocessingclustering
3.2 match 10 stars 6.04 score 22 scripts
r-forge
latticeExtra:Extra Graphical Utilities Based on Lattice
Building on the infrastructure provided by the lattice package, this package provides several new high-level functions and methods, as well as additional utilities such as panel and axis annotation functions.
Maintained by Deepayan Sarkar. Last updated 3 years ago.
1.9 match 10.18 score 2.6k scripts 233 dependents
r-hyperspec
hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)
Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.
Maintained by Claudia Beleites. Last updated 10 months ago.
data-wranglinghyperspectralimaginginfrarednmrramanspectroscopyuv-visxrf
2.3 match 16 stars 8.13 score 233 scripts 2 dependents
jonasrieger
ldaPrototype:Prototype of Multiple Latent Dirichlet Allocation Runs
Determine a Prototype from a number of runs of Latent Dirichlet Allocation (LDA) measuring its similarities with S-CLOP: A procedure to select the LDA run with highest mean pairwise similarity, which is measured by S-CLOP (Similarity of multiple sets by Clustering with Local Pruning), to all other runs. LDA runs are specified by their assignments leading to estimators for distribution parameters. Repeated runs lead to different results, which we counter by choosing the most representative LDA run as prototype.
Maintained by Jonas Rieger. Last updated 2 years ago.
latent-dirichlet-allocationldamodel-selectionmodelselectionreliabilitytext-miningtextdatatopic-modeltopic-modelstopic-similaritiestopicmodelingtopicmodelling
4.0 match 8 stars 4.44 score 23 scripts 1 dependents
grunwaldlab
poppr:Genetic Analysis of Populations with Mixed Reproduction
Population genetic analyses for hierarchical analysis of partially clonal populations built upon the architecture of the 'adegenet' package. Originally described in Kamvar, Tabima, and Grünwald (2014) <doi:10.7717/peerj.281> with version 2.0 described in Kamvar, Brooks, and Grünwald (2015) <doi:10.3389/fgene.2015.00208>.
Maintained by Zhian N. Kamvar. Last updated 10 months ago.
clonalitygenetic-analysisgenetic-distancesminimum-spanning-networksmultilocus-genotypesmultilocus-lineagespopulation-geneticspopulationsopenmp
1.7 match 69 stars 10.84 score 672 scripts
kharchenkolab
sccore:Core Utilities for Single-Cell RNA-Seq
Core utilities for single-cell RNA-seq data analysis. Contained within are utility functions for working with differential expression (DE) matrices and count matrices, a collection of functions for manipulating and plotting data via 'ggplot2', and functions to work with cell graphs and cell embeddings. Graph-based methods include embedding kNN cell graphs into a UMAP <doi:10.21105/joss.00861>, collapsing vertices of each cluster in the graph, and propagating graph labels.
Maintained by Evan Biederstedt. Last updated 1 year ago.
2.8 match 12 stars 6.44 score 36 scripts 9 dependents
bioc
goProfiles:goProfiles: an R package for the statistical analysis of functional profiles
The package implements methods to compare lists of genes based on comparing the corresponding 'functional profiles'.
Maintained by Alex Sanchez. Last updated 5 months ago.
annotationgogeneexpressiongenesetenrichmentgraphandnetworkmicroarraymultiplecomparisonpathwayssoftware
3.2 match 5.48 score 6 scripts 1 dependents
jokergoo
spiralize:Visualize Data on Spirals
It visualizes data along an Archimedean spiral <https://en.wikipedia.org/wiki/Archimedean_spiral>, producing a so-called spiral graph or spiral chart. It has two major advantages for visualization: 1. It is able to visualize data with a very long axis at high resolution. 2. It is efficient for time series data to reveal periodic patterns.
Maintained by Zuguang Gu. Last updated 9 months ago.
2.3 match 148 stars 7.67 score 35 scripts 3 dependents
karlines
diagram:Functions for Visualising Simple Graphs (Networks), Plotting Flow Diagrams
Visualises simple graphs (networks) based on a transition matrix, utilities to plot flow diagrams, visualising webs, electrical networks, etc. Support for the book "A practical guide to ecological modelling - using R as a simulation platform" by Karline Soetaert and Peter M.J. Herman (2009), Springer, and the book "Solving Differential Equations in R" by Karline Soetaert, Jeff Cash and Francesca Mazzia (2012), Springer. Includes demo(flowchart), demo(plotmat), demo(plotweb).
Maintained by Karline Soetaert. Last updated 4 years ago.
1.7 match 10.06 score 598 scripts 487 dependents
bioboot
bio3d:Biological Structure Analysis
Utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data. Please refer to the URLs below for more information.
Maintained by Barry Grant. Last updated 5 months ago.
2.0 match 5 stars 8.49 score 1.4k scripts 10 dependents
bioc
goSorensen:Statistical inference based on the Sorensen-Dice dissimilarity and the Gene Ontology (GO)
This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.
Maintained by Pablo Flores. Last updated 5 months ago.
annotationgogenesetenrichmentsoftwaremicroarraypathwaysgeneexpressionmultiplecomparisongraphandnetworkreactomeclusteringkegg
3.7 match 4.56 score 12 scripts
pcruniversum
RDML:Importing Real-Time Thermo Cycler (qPCR) Data from RDML Format Files
Imports real-time thermo cycler (qPCR) data from Real-time PCR Data Markup Language (RDML) and transforms to the appropriate formats of the 'qpcR' and 'chipPCR' packages. Contains a dendrogram visualization for the structure of RDML object and GUI for RDML editing.
Maintained by Konstantin A. Blagodatskikh. Last updated 7 months ago.
2.3 match 21 stars 7.16 score 58 scripts 1 dependents
bioc
pRoloc:A unifying bioinformatics framework for spatial proteomics
The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.
Maintained by Lisa Breckels. Last updated 26 days ago.
immunooncologyproteomicsmassspectrometryclassificationclusteringqualitycontrolbioconductorproteomics-dataspatial-proteomicsvisualisationopenblascpp
1.9 match 15 stars 8.71 score 101 scripts 2 dependents
cmmr
rbiom:Read/Write, Analyze, and Visualize 'BIOM' Data
A toolkit for working with Biological Observation Matrix ('BIOM') files. Read/write all 'BIOM' formats. Compute rarefaction, alpha diversity, and beta diversity (including 'UniFrac'). Summarize counts by taxonomic level. Subset based on metadata. Generate visualizations and statistical analyses. CPU intensive operations are coded in C for speed.
Maintained by Daniel P. Smith. Last updated 6 days ago.
1.8 match 15 stars 9.02 score 117 scripts 6 dependents
kharchenkolab
pagoda2:Single Cell Analysis and Differential Expression
Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization, gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within your data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.
Maintained by Evan Biederstedt. Last updated 1 year ago.
scrna-seqsingle-cellsingle-cell-rna-seqtranscriptomicsopenblascppopenmp
2.0 match 222 stars 8.00 score 282 scripts
eahouseman
RPMM:Recursively Partitioned Mixture Model
Recursively Partitioned Mixture Model for Beta and Gaussian Mixtures. This is a model-based clustering algorithm that returns a hierarchy of classes, similar to hierarchical clustering, but also similar to finite mixture models.
Maintained by E. Andres Houseman. Last updated 8 years ago.
3.6 match 4.34 score 78 scripts 7 dependents
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 6 days ago.
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
2.0 match 33 stars 7.77 score 10 scripts
r-forge
ClassDiscovery:Classes and Methods for "Class Discovery" with Microarrays or Proteomics
Defines the classes used for "class discovery" problems in the OOMPA project (<http://oompa.r-forge.r-project.org/>). Class discovery primarily consists of unsupervised clustering methods with attempts to assess their statistical significance.
Maintained by Kevin R. Coombes. Last updated 1 month ago.
1.8 match 8.53 score 85 scripts 9 dependents
stemangiola
tidyHeatmap:A Tidy Implementation of Heatmap
This is a tidy implementation for heatmap. At the moment it is based on the (great) package 'ComplexHeatmap'. The goal of this package is to interface a tidy data frame with this powerful tool. Some of the advantages are: Row and/or column colour annotations are easy to integrate by specifying just one parameter (column names). Custom grouping of rows is easy to specify by providing a grouped tbl. For example: df %>% group_by(...). Label size is adjusted to the total number of rows and columns. Brewer and Viridis palettes are used by default.
Maintained by Stefano Mangiola. Last updated 1 month ago.
assaydomaininfrastructurebrewercomplexheatmapcustom-palettedplyrgraphvizheatmapmtcarsplottingrstudioscaletibbletidytidy-data-frametidybulktidyverseviridis
1.5 match 335 stars 10.23 score 197 scripts 1 dependents
uclahs-cds
CancerEvolutionVisualization:Publication Quality Phylogenetic Tree Plots
Generates tree plots with precise branch lengths, gene annotations, and cellular prevalence. The package handles complex tree structures (angles, lengths, etc.) and can be further refined as needed by the user.
Maintained by Paul Boutros. Last updated 1 day ago.
2.4 match 2 stars 6.34 score 5 scripts
bioc
GeneTonic:Enjoy Analyzing And Integrating The Results From Differential Expression Analysis And Functional Enrichment Analysis
This package provides functionality to combine the existing pieces of the transcriptome data and results, making it easier to generate insightful observations and hypotheses. Its usage is made easy with a Shiny application, combining the benefits of interactivity and reproducibility, e.g. by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist. Using the GeneTonicList as a standardized container for all the required components, it is possible to simplify the generation of multiple visualizations and summaries.
Maintained by Federico Marini. Last updated 2 months ago.
guigeneexpressionsoftwaretranscriptiontranscriptomicsvisualizationdifferentialexpressionpathwaysreportwritinggenesetenrichmentannotationgoshinyappsbioconductorbioconductor-packagedata-explorationdata-visualizationfunctional-enrichment-analysisgene-expressionpathway-analysisreproducible-researchrna-seq-analysisrna-seq-datashinytranscriptomeuser-friendly
1.8 match 77 stars 8.28 score 37 scripts 1 dependents
wencke
GOplot:Visualization of Functional Analysis Data
Implementation of multilayered visualizations for enhanced graphical representation of functional analysis data. It combines and integrates omics data derived from expression and functional annotation enrichment analyses. Its plotting functions have been developed with a hierarchical structure in mind: starting from a general overview to identify the most enriched categories (modified bar plot, bubble plot) to a more detailed one displaying different types of relevant information for the molecules in a given set of categories (circle plot, chord plot, cluster plot, Venn diagram, heatmap).
Maintained by Wencke Walter. Last updated 8 years ago.
2.3 match 20 stars 6.60 score 235 scripts
r-suzuki
pvclust:Hierarchical Clustering with P-Values via Multiscale Bootstrap Resampling
An implementation of multiscale bootstrap resampling for assessing the uncertainty in hierarchical cluster analysis. It provides SI (selective inference) p-value, AU (approximately unbiased) p-value and BP (bootstrap probability) value for each cluster in a dendrogram.
Maintained by Ryota Suzuki. Last updated 5 years ago.
2.3 match 5 stars 6.54 score 784 scripts 9 dependents
shaunpwilkinson
aphid:Analysis with Profile Hidden Markov Models
Designed for the development and application of hidden Markov models and profile HMMs for biological sequence analysis. Contains functions for multiple and pairwise sequence alignment, model construction and parameter optimization, file import/export, implementation of the forward, backward and Viterbi algorithms for conditional sequence probabilities, tree-based sequence weighting, and sequence simulation. Features a wide variety of potential applications including database searching, gene-finding and annotation, phylogenetic analysis and sequence classification. Based on the models and algorithms described in Durbin et al (1998, ISBN: 9780521629713).
Maintained by Shaun Wilkinson. Last updated 8 months ago.
2.3 match 22 stars 6.58 score 38 scripts 3 dependents
bioc
BioNERO:Biological Network Reconstruction Omnibus
BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists in inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analyses are included in this package, BioNERO spares users from having to learn the syntax of several packages and how to communicate between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra- and interspecies module preservation statistics between different networks.
Maintained by Fabricio Almeida-Silva. Last updated 5 months ago.
softwaregeneexpressiongeneregulationsystemsbiologygraphandnetworkpreprocessingnetworknetworkinference
1.9 match 27 stars 7.78 score 50 scripts 1 dependents
bioc
simplifyEnrichment:Simplify Functional Enrichment Results
A new clustering algorithm, "binary cut", for clustering similarity matrices of functional terms is implemented in this package. It also provides functions for visualizing, summarizing and comparing the clusterings.
Maintained by Zuguang Gu. Last updated 5 months ago.
softwarevisualizationgoclusteringgenesetenrichment
1.8 match 113 stars 8.02 score 196 scripts
hiweller
recolorize:Color-Based Image Segmentation
Automatic, semi-automatic, and manual functions for generating color maps from images. The idea is to simplify the colors of an image according to a metric that is useful for the user, using deterministic methods whenever possible. Many images will be clustered well using the out-of-the-box functions, but the package also includes a toolbox of functions for making manual adjustments (layer merging/isolation, blurring, fitting to provided color clusters or those from another image, etc). Also includes export methods for other color/pattern analysis packages (pavo, patternize, colordistance).
Maintained by Hannah Weller. Last updated 13 days ago.
1.9 match 39 stars 7.68 score 87 scripts
bioc
spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions
The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.
Maintained by Jianhai Zhang. Last updated 4 months ago.
spatialvisualizationmicroarraysequencinggeneexpressiondatarepresentationnetworkclusteringgraphandnetworkcellbasedassaysatacseqdnaseqtissuemicroarraysinglecellcellbiologygenetarget
2.3 match 5 stars 6.26 score 12 scripts
bioc
structToolbox:Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Maintained by Gavin Rhys Lloyd. Last updated 25 days ago.
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
2.3 match 10 stars 6.26 score 12 scripts
bioc
cola:A Framework for Consensus Partitioning
Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement with known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionalities to compare results straightforwardly. 4. It provides a new method to extract features which are more efficient to separate subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.
Maintained by Zuguang Gu. Last updated 1 month ago.
clusteringgeneexpressionclassificationsoftwareconsensus-clusteringcpp
1.8 match 61 stars 7.49 score 112 scripts
chavent
ClustOfVar:Clustering of Variables
Cluster analysis of a set of variables. Variables can be quantitative, qualitative or a mixture of both.
Maintained by Marie Chavent. Last updated 5 years ago.
2.0 match 7 stars 6.47 score 142 scripts 2 dependents
bioc
systemPipeTools:Tools for data visualization
The systemPipeTools package extends the widely used systemPipeR (SPR) workflow environment with an enhanced toolkit for data visualization, including utilities to automate the data visualization for analysis of differentially expressed genes (DEGs). systemPipeTools provides data transformation and data exploration functions via scatterplots, hierarchical clustering heatmaps, principal component analysis, multidimensional scaling, generalized principal components, t-Distributed Stochastic Neighbor embedding (t-SNE), and MA and volcano plots. All these utilities can be integrated with the modular design of the systemPipeR environment that allows users to easily substitute any of these features with custom alternatives.
Maintained by Daniela Cassol. Last updated 5 months ago.
infrastructuredataimportsequencingqualitycontrolreportwritingexperimentaldesignclusteringdifferentialexpressionmultidimensionalscalingprincipalcomponent
3.2 match 4.00 score 4 scripts
bioc
GeDi:Defining and visualizing the distances between different genesets
The package provides different distance measurements to calculate the difference between genesets. Based on these scores the genesets are clustered and visualized as a graph. This is all presented in an interactive Shiny application for easy usage.
Maintained by Annekathrin Nedwed. Last updated 5 months ago.
guigenesetenrichmentsoftwaretranscriptionrnaseqvisualizationclusteringpathwaysreportwritinggokeggreactomeshinyapps
2.3 match 1 stars 5.52 score 22 scripts
mums2
mpactr:Correction of Preprocessed MS Data
An 'R' implementation of the 'python' program Metabolomics Peak Analysis Computational Tool ('MPACT') (Robert M. Samples, Sara P. Puckett, and Marcy J. Balunas (2023) <doi:10.1021/acs.analchem.2c04632>). Filters in the package serve to address common errors in tandem mass spectrometry preprocessing, including: (1) isotopic patterns that are incorrectly split during preprocessing, (2) features present in solvent blanks due to carryover between samples, (3) features whose abundance is greater than a user-defined abundance threshold in a specific group of samples, for example media blanks, (4) ions that are inconsistent between technical replicates, and (5) in-source fragment ions created during ionization before fragmentation in the tandem mass spectrometry workflow.
Maintained by Patrick Schloss. Last updated 2 days ago.
2.2 match 1 stars 5.56 score 4 scripts
bioc
made4:Multivariate analysis of microarray data using ADE4
Multivariate data analysis and graphical display of microarray data. Functions include supervised dimension reduction (between group analysis) and joint dimension reduction of 2 datasets (coinertia analysis). It contains functions that require R package ade4.
Maintained by Aedin Culhane. Last updated 5 months ago.
clusteringclassificationdimensionreductionprincipalcomponenttranscriptomicsmultiplecomparisongeneexpressionsequencingmicroarray
2.0 match 6.11 score 107 scripts 2 dependents
bioc
ViSEAGO:ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity
The main objective of the ViSEAGO package is to carry out data mining of biological functions and establish links between genes involved in the study. We developed ViSEAGO in R to facilitate functional Gene Ontology (GO) analysis of complex experimental designs with multiple comparisons of interest. It allows studying large-scale datasets together and visualizing GO profiles to capture biological knowledge. The acronym stands for three major concepts of the analysis: Visualization, Semantic similarity and Enrichment Analysis of Gene Ontology. It provides access to the latest GO annotations, which are retrieved from one of NCBI EntrezGene, Ensembl or Uniprot databases for several species. Using available R packages and novel developments, ViSEAGO extends classical functional GO analysis to focus on functional coherence by aggregating closely related biological themes while studying multiple datasets at once. It provides both a synthetic and detailed view using interactive functionalities respecting the GO graph structure and ensuring functional coherence supplied by semantic similarity. ViSEAGO has been successfully applied on several datasets from different species with a variety of biological questions. Results can be easily shared between bioinformaticians and biologists, enhancing reporting capabilities while maintaining reproducibility.
Maintained by Aurelien Brionne. Last updated 2 months ago.
softwareannotationgogenesetenrichmentmultiplecomparisonclusteringvisualization
1.8 match 6.64 score 22 scripts
dami82
colorhcplot:Colorful Hierarchical Clustering Dendrograms
Build dendrograms with sample groups highlighted by different colors. Visualize results of hierarchical clustering analyses as dendrograms whose leaves and labels are colored according to sample grouping. Assess whether data point grouping aligns to naturally occurring clusters.
Maintained by Damiano Fantini. Last updated 7 years ago.
5.8 match 2.00 score 5 scripts
cran
compositions:Compositional Data Analysis
Provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.
Maintained by K. Gerald van den Boogaart. Last updated 1 year ago.
1.8 match 1 stars 6.35 score 36 dependents
jfq3
ggordiplots:Make 'ggplot2' Versions of Vegan's Ordiplots
The 'vegan' package includes several functions for adding features to ordination plots: ordiarrows(), ordiellipse(), ordihull(), ordispider() and ordisurf(). This package adds these same features to ordination plots made with 'ggplot2'. In addition, gg_ordibubble() sizes points relative to the value of an environmental variable.
Maintained by John Quensen. Last updated 5 months ago.
1.9 match 7 stars 6.09 score 175 scriptsaroneklund
squash:Color-Based Plots for Multivariate Visualization
Functions for color-based visualization of multivariate data, i.e. colorgrams or heatmaps. Lower-level functions map numeric values to colors, display a matrix as an array of colors, and draw color keys. Higher-level plotting functions generate a bivariate histogram, a dendrogram aligned with a color-coded matrix, a triangular distance matrix, and more.
Maintained by Aron C. Eklund. Last updated 2 years ago.
2.4 match 2 stars 4.74 score 46 scripts 4 dependentsjpfitzinger
hfr:Estimate Hierarchical Feature Regression Models
Provides functions for the estimation, plotting, predicting and cross-validation of hierarchical feature regression models as described in Pfitzinger (2024). Cluster Regularization via a Hierarchical Feature Regression. Econometrics and Statistics (in press). <doi:10.1016/j.ecosta.2024.01.003>.
Maintained by Johann Pfitzinger. Last updated 1 year ago.
hierarchical-clusteringmachine-learningpenalized-regressionregularized-regression
3.8 match 1 stars 3.00 score 1 scriptsbrandmaier
pdc:Permutation Distribution Clustering
Permutation Distribution Clustering is a clustering method for time series. Dissimilarity of time series is formalized as the divergence between their permutation distributions. The permutation distribution was proposed as a measure of the complexity of a time series.
Maintained by Andreas M. Brandmaier. Last updated 2 years ago.
2.0 match 6 stars 5.61 score 25 scripts 9 dependentsbioc
ChAMP:Chip Analysis Methylation Pipeline for Illumina HumanMethylation450 and EPIC
The package includes quality control metrics, a selection of normalization methods and novel methods to identify differentially methylated regions and to highlight copy number alterations.
Maintained by Yuan Tian. Last updated 5 months ago.
microarraymethylationarraynormalizationtwochannelcopynumberdnamethylation
1.7 match 6.54 score 278 scriptsgefeizhang
statVisual:Statistical Visualization Tools
Visualization functions in the applications of translational medicine (TM) and biomarker (BM) development to compare groups by statistically visualizing data and/or results of analyses, such as visualizing data by displaying in one figure different groups' histograms, boxplots, densities, scatter plots, error-bar plots, or trajectory plots, by displaying scatter plots of top principal components or dendrograms with data points colored based on group information, or visualizing volcano plots to check the results of whole genome analyses for gene differential expression.
Maintained by Wenfei Zhang. Last updated 5 years ago.
3.6 match 3.00 score 3 scriptscbhurley
gclus:Clustering Graphics
Orders panels in scatterplot matrices and parallel coordinate displays by some merit index. Package contains various indices of merit, ordering functions, and enhanced versions of pairs and parcoord which color panels according to their merit level.
Maintained by Catherine Hurley. Last updated 6 years ago.
1.3 match 8.23 score 406 scripts 82 dependentstom-wolff
ideanet:Integrating Data Exchange and Analysis for Networks ('ideanet')
A suite of convenient tools for social network analysis geared toward students, entry-level users, and non-expert practitioners. ‘ideanet’ features unique functions for the processing and measurement of sociocentric and egocentric network data. These functions automatically generate node- and system-level measures commonly used in the analysis of these types of networks. Outputs from these functions maximize the ability of novice users to employ network measurements in further analyses while making all users less prone to common data analytic errors. Additionally, ‘ideanet’ features an R Shiny graphic user interface that allows novices to explore network data with minimal need for coding.
Maintained by Tom Wolff. Last updated 2 days ago.
1.5 match 6 stars 6.80 score 10 scriptsmaximeherve
RVAideMemoire:Testing and Plotting Procedures for Biostatistics
Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions are intended to simplify the user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible.
Maintained by Maxime HERVE. Last updated 1 year ago.
1.9 match 8 stars 5.31 score 632 scriptssgs2000
ClustMC:Cluster-Based Multiple Comparisons
Multiple comparison techniques are typically applied following an F test from an ANOVA to decide which means are significantly different from one another. As an alternative to traditional methods, cluster analysis can be performed to group the means of different treatments into non-overlapping clusters. Treatments in different groups are considered statistically different. Several approaches have been proposed, with varying clustering methods and cut-off criteria. This package implements cluster-based multiple comparisons tests and also provides a visual representation in the form of a dendrogram. Di Rienzo, J. A., Guzman, A. W., & Casanoves, F. (2002) <jstor.org/stable/1400690>. Bautista, M. G., Smith, D. W., & Steiner, R. L. (1997) <doi:10.2307/1400402>.
Maintained by Santiago Garcia Sanchez. Last updated 7 months ago.
2.0 match 4.90 score 6 scriptspneuvial
adjclust:Adjacency-Constrained Clustering of a Block-Diagonal Similarity Matrix
Implements a constrained version of hierarchical agglomerative clustering, in which each observation is associated to a position, and only adjacent clusters can be merged. Typical application fields in bioinformatics include Genome-Wide Association Studies or Hi-C data analysis, where the similarity between items is a decreasing function of their genomic distance. Taking advantage of this feature, the implemented algorithm is time and memory efficient. This algorithm is described in Ambroise et al (2019) <doi:10.1186/s13015-019-0157-4>.
Maintained by Pierre Neuvial. Last updated 5 months ago.
clusteringfeatureextractiongwashi-chierarchical-clusteringlinkage-disequilibriumcppopenmp
1.3 match 16 stars 7.35 score 13 scripts 2 dependentsminoo-asty
CINNA:Deciphering Central Informative Nodes in Network Analysis
Computing, comparing, and demonstrating top informative centrality measures within a network. "CINNA: an R/CRAN package to decipher Central Informative Nodes in Network Analysis" provides a comprehensive overview of the package functionality; see Ashtiani et al. (2018) <doi:10.1093/bioinformatics/bty819>.
Maintained by Minoo Ashtiani. Last updated 2 years ago.
2.9 match 1 stars 3.29 score 98 scriptsbioc
scGPS:A complete analysis of single cell subpopulations, from identifying subpopulations to analysing their relationship (scGPS = single cell Global Predictions of Subpopulation)
The package implements two main algorithms to answer two key questions: a SCORE (Stable Clustering at Optimal REsolution) to find subpopulations, followed by scGPS to investigate the relationships between subpopulations.
Maintained by Quan Nguyen. Last updated 5 months ago.
singlecellclusteringdataimportsequencingcoverageopenblascpp
1.8 match 4 stars 5.20 score 7 scriptskaneplusplus
listdown:Create R Markdown from Lists
Programmatically create R Markdown documents from lists.
Maintained by Michael J. Kane. Last updated 2 years ago.
1.8 match 27 stars 5.17 score 11 scriptscmlmagneville
mFD:Compute and Illustrate the Multiple Facets of Functional Diversity
Computes functional trait-based distances between pairs of species gathered in assemblages, which allows several functional spaces to be built. The package computes functional diversity indices assessing the distribution of species (and of their dominance) in a given functional space for each assemblage and the overlap between assemblages in a given functional space, see: Chao et al. (2018) <doi:10.1002/ecm.1343>, Maire et al. (2015) <doi:10.1111/geb.12299>, Mouillot et al. (2013) <doi:10.1016/j.tree.2012.10.004>, Mouillot et al. (2014) <doi:10.1073/pnas.1317625111>, Ricotta and Szeidl (2009) <doi:10.1016/j.tpb.2009.10.001>. Graphical outputs are included. Visit the 'mFD' website for more information, documentation and examples.
Maintained by Camille Magneville. Last updated 3 months ago.
1.3 match 26 stars 7.35 score 61 scriptsadrientaudiere
cati:Community Assembly by Traits: Individuals and Beyond
Detect and quantify community assembly processes using trait values of individuals or populations, the T-statistics and other metrics, and dedicated null models.
Maintained by Adrien Taudiere. Last updated 4 months ago.
1.7 match 12 stars 5.33 score 15 scriptsbioc
InterCellar:InterCellar: an R-Shiny app for interactive analysis and exploration of cell-cell communication in single-cell transcriptomics
InterCellar is implemented as an R/Bioconductor Package containing a Shiny app that allows users to interactively analyze cell-cell communication from scRNA-seq data. Starting from precomputed ligand-receptor interactions, InterCellar provides filtering options, annotations and multiple visualizations to explore clusters, genes and functions. Finally, based on functional annotation from Gene Ontology and pathway databases, InterCellar implements data-driven analyses to investigate cell-cell communication in one or multiple conditions.
Maintained by Marta Interlandi. Last updated 5 months ago.
softwaresinglecellvisualizationgotranscriptomics
1.8 match 9 stars 4.95 score 7 scriptsmi2-warsaw
sejmRP:An Information About Deputies and Votings in Polish Diet from Seventh to Eighth Term of Office
Set of functions that access information about deputies and votings in Polish diet from webpage <http://www.sejm.gov.pl>. The package was developed as a result of an internship in MI2 Group - <http://mi2.mini.pw.edu.pl>, Faculty of Mathematics and Information Science, Warsaw University of Technology.
Maintained by Piotr Smuda. Last updated 8 years ago.
1.8 match 21 stars 5.04 score 35 scriptsjarioksa
natto:An Extreme 'vegan' Package of Experimental Code
Random code that is too experimental or too weird to be included in the vegan package.
Maintained by Jari Oksanen. Last updated 28 days ago.
1.9 match 8 stars 4.68 score 1 scriptsrhenkin
visxhclust:A Shiny App for Visual Exploration of Hierarchical Clustering
A Shiny application and functions for visual exploration of hierarchical clustering with numeric datasets. Allows users to iteratively set hyperparameters, select features and evaluate results through various plots and computation of evaluation criteria.
Maintained by Rafael Henkin. Last updated 2 years ago.
clusteringdata-analysisdata-sciencer-shinyshiny-apps
1.8 match 4 stars 4.86 score 12 scriptsdicook
mulgar:Functions for Pre-Processing Data for Multivariate Data Visualisation using Tours
This is a companion to the book Cook, D. and Laa, U. (2023) <https://dicook.github.io/mulgar_book/> "Interactively exploring high-dimensional data and models in R". It contains useful functions for processing data in preparation for visualising with a tour. There are also several sample data sets.
Maintained by Dianne Cook. Last updated 2 months ago.
1.9 match 4 stars 4.50 score 79 scriptsnicolas-robette
seqhandbook:Miscellaneous Tools for Sequence Analysis
It provides miscellaneous sequence analysis functions for describing episodes in individual sequences, measuring association between domains in multidimensional sequence analysis (see Piccarreta (2017) <doi:10.1177/0049124115591013>), heat maps of sequence data, Globally Interdependent Multidimensional Sequence Analysis (see Robette et al (2015) <doi:10.1177/0081175015570976>), smoothing sequences for index plots (see Piccarreta (2012) <doi:10.1177/0049124112452394>), coding sequences for Qualitative Harmonic Analysis (see Deville (1982)), measuring stress from multidimensional scaling factors (see Piccarreta and Lior (2010) <doi:10.1111/j.1467-985X.2009.00606.x>), symmetrical (or canonical) Partial Least Squares (see Bry (1996)).
Maintained by Nicolas Robette. Last updated 2 years ago.
1.8 match 6 stars 4.76 score 19 scriptsclancylabuiuc
moRphomenses:Geometric Morphometric Tools to Align, Scale, and Compare "Shape" of Menstrual Cycle Hormones
Mitteroecker & Gunz (2009) <doi:10.1007/s11692-009-9055-x> describe how geometric morphometric methods allow researchers to quantify the size and shape of physical biological structures. We provide tools to extend geometric morphometric principles to the study of non-physical structures, hormone profiles, as outlined in Ehrlich et al (2021) <doi:10.1002/ajpa.24514>. Easily transform daily measures into multivariate landmark-based data. Includes custom functions to apply multivariate methods for data exploration as well as hypothesis testing. Also includes a 'shiny' web app to streamline data exploration. Developed to study menstrual cycle hormones, but the functions have been generalized and should be applicable to any biomarker over any time period.
Maintained by Daniel Ehrlich. Last updated 2 months ago.
2.0 match 2 stars 4.04 score 4 scriptscstewartgh
QFASA:Quantitative Fatty Acid Signature Analysis
Accurate estimates of the diets of predators are required in many areas of ecology, but for many species current methods are imprecise, limited to the last meal, and often biased. The diversity of fatty acids and their patterns in organisms, coupled with the narrow limitations on their biosynthesis, properties of digestion in monogastric animals, and the prevalence of large storage reservoirs of lipid in many predators, led to the development of quantitative fatty acid signature analysis (QFASA) to study predator diets.
Maintained by Connie Stewart. Last updated 7 months ago.
1.7 match 1 stars 4.83 score 17 scriptsaudreyqyfu
MRPC:PC Algorithm with the Principle of Mendelian Randomization
A PC Algorithm with the Principle of Mendelian Randomization. This package implements the MRPC (PC with the principle of Mendelian randomization) algorithm to infer causal graphs. It also contains functions to simulate data under a certain topology, to visualize a graph in different ways, and to compare graphs and quantify the differences. See Badsha and Fu (2019) <doi:10.3389/fgene.2019.00460> and Badsha, Martin and Fu (2021) <doi:10.3389/fgene.2021.651812>.
Maintained by Audrey Fu. Last updated 3 years ago.
1.7 match 8 stars 4.68 score 20 scriptsandeek
protoshiny:Interactive Dendrograms for Visualizing Hierarchical Clusters with Prototypes
Shiny app to interactively visualize hierarchical clustering with prototypes. For details on hierarchical clustering with prototypes, see Bien and Tibshirani (2011) <doi:10.1198/jasa.2011.tm10183>. This package currently launches the application.
Maintained by Andee Kaplan. Last updated 3 years ago.
2.9 match 1 stars 2.70 score 4 scriptsjarioksa
twinspan:Two-Way Indicator Species Analysis
Classification of biological communities based on splitting first axis of Correspondence Analysis for the current subset of the data, and finding species that best indicate the splits. The method is particularly popular in vegetation science.
Maintained by Jari Oksanen. Last updated 4 months ago.
1.9 match 7 stars 4.10 score 18 scriptscran
sparcl:Perform Sparse Hierarchical Clustering and Sparse K-Means Clustering
Implements the sparse clustering methods of Witten and Tibshirani (2010): "A framework for feature selection in clustering"; published in Journal of the American Statistical Association 105(490): 713-726.
Maintained by Daniela Witten. Last updated 6 years ago.
1.8 match 1 stars 4.20 score 133 scripts 4 dependentsbioc
GSEAmining:Make Biological Sense of Gene Set Enrichment Analysis Outputs
Gene Set Enrichment Analysis is a very powerful and interesting computational method that allows an easy correlation between differentially expressed genes and biological processes. Unfortunately, although it was designed to help researchers interpret gene expression data, it can generate huge amounts of results whose biological meaning can be difficult to interpret. Many available tools rely on the hierarchically structured Gene Ontology (GO) classification to reduce redundancy in the results. However, due to the popularity of GSEA, many more gene set collections, such as those in the Molecular Signatures Database, are emerging. Since these collections are not organized like those in GO, their usage for GSEA does not always give a straightforward answer or, in other words, getting all the meaningful information can be challenging with the currently available tools. For these reasons, GSEAmining was born to be an easy tool to create reproducible reports to help researchers make biological sense of GSEA outputs. Given the results of GSEA, GSEAmining clusters the different gene set collections based on the presence of the same genes in the leading edge (core) subset. Leading edge subsets are those genes that contribute most to the enrichment score of each collection of genes or gene sets. For this reason, gene sets that participate in similar biological processes should share genes in common and in turn cluster together. After that, GSEAmining is able to identify and represent for each cluster: - The most enriched terms in the names of gene sets (as wordclouds) - The most enriched genes in the leading edge subsets (as bar plots). In each case, positive and negative enrichments are shown in different colors so it is easy to distinguish biological processes or genes that may be of interest in that particular study.
Maintained by Oriol Arqués. Last updated 5 months ago.
genesetenrichmentclusteringvisualization
1.9 match 4.00 score 7 scriptsbioc
cn.farms:cn.FARMS - factor analysis for copy number estimation
This package implements the cn.FARMS algorithm for copy number variation (CNV) analysis. cn.FARMS supports analysis of the most common Affymetrix (250K-SNP6.0) array types and high-performance computing using snow and ff.
Maintained by Andreas Mitterecker. Last updated 5 months ago.
microarraycopynumbervariationcpp
2.3 match 3.30 score 7 scriptscran
PoiClaClu:Classification and Clustering of Sequencing Data Based on a Poisson Model
Implements the methods described in the paper, Witten (2011) Classification and Clustering of Sequencing Data using a Poisson Model, Annals of Applied Statistics 5(4) 2493-2518.
Maintained by Daniela Witten. Last updated 6 years ago.
1.8 match 3.81 score 107 scripts 2 dependentspromidat
discoveR:Exploratory Data Analysis System
Performs an exploratory data analysis through a 'shiny' interface. It includes basic methods such as the mean, median, mode, normality test, among others. It also includes clustering techniques such as Principal Components Analysis, Hierarchical Clustering and the K-Means Method.
Maintained by Oldemar Rodriguez. Last updated 2 years ago.
2.3 match 3 stars 3.03 score 18 scriptsarliph
SPARTAAS:Statistical Pattern Recognition and daTing using Archaeological Artefacts assemblageS
Statistical pattern recognition and dating using archaeological artefacts assemblages. Package of statistical tools for archaeology. hclustcompro(perioclust): Bellanger Lise, Coulon Arthur, Husi Philippe (2021, ISBN:978-3-030-60103-4). mapclust: Bellanger Lise, Coulon Arthur, Husi Philippe (2021) <doi:10.1016/j.jas.2021.105431>. seriograph: Desachy Bruno (2004) <doi:10.3406/pica.2004.2396>. cerardat: Bellanger Lise, Husi Philippe (2012) <doi:10.1016/j.jas.2011.06.031>.
Maintained by Arthur Coulon. Last updated 10 months ago.
1.6 match 6 stars 4.14 score 46 scriptsbioc
ChromHeatMap:Heat map plotting by genome coordinate
The ChromHeatMap package can be used to plot genome-wide data (e.g. expression, CGH, SNP) along each strand of a given chromosome as a heat map. The generated heat map can be used to interactively identify probes and genes of interest.
Maintained by Tim F. Rayner. Last updated 5 months ago.
1.7 match 3.30 scoresciviews
exploreit:Exploratory Data Analysis for 'SciViews::R'
Multivariate analysis and data exploration for the 'SciViews::R' dialect.
Maintained by Philippe Grosjean. Last updated 11 months ago.
multivariate-analysissciviewsstatistical-methods
1.9 match 2.70 score 4 scriptslau-mel
swamp:Visualization, Analysis and Adjustment of High-Dimensional Data in Respect to Sample Annotations
Collection of functions to connect the structure of the data with the information on the samples. Three types of associations are covered: 1. linear model of principal components. 2. hierarchical clustering analysis. 3. distribution of features-sample annotation associations. Additionally, the inter-relation between sample annotations can be analyzed. Simple methods are provided for the correction of batch effects and removal of principal components.
Maintained by Martin Lauss. Last updated 5 years ago.
1.9 match 2.42 score 29 scripts 1 dependentszdeneksulc
nomclust:Hierarchical Cluster Analysis of Nominal Data
Similarity measures for hierarchical clustering of objects characterized by nominal (categorical) variables. Evaluation criteria for nominal data clustering.
Maintained by Zdenek Sulc. Last updated 2 years ago.
1.8 match 4 stars 2.48 score 38 scriptsgavinsimpson
MRFtools:Tools for Constructing and Plotting Markov Random Fields in R for Graphical Data
Utility functions for using Markov Random Field smooths in Generalized Additive Models fitted with the 'mgcv' package.
Maintained by Eric J. Petersen. Last updated 3 days ago.
2.0 match 2.18 scorecran
heatmapFlex:Tools to Generate Flexible Heatmaps
A set of tools supporting more flexible heatmaps. The graphics is grid-like using the old graphics system. The main function is heatmap.n2(), which is a wrapper around the various functions constructing individual parts of the heatmap, like sidebars, picket plots, legends etc. The function supports zooming and splitting, i.e., having (unlimited) small heatmaps underneath each other in one plot deriving from the same data set, e.g., clustered and ordered by a supervised clustering method.
Maintained by Vidal Fey. Last updated 4 years ago.
1.8 match 2.48 score 1 dependentscran
ROKET:Optimal Transport-Based Kernel Regression
Perform optimal transport on somatic point mutations and kernel regression hypothesis testing by integrating pathway level similarities at the gene level (Little et al. (2023) <doi:10.1111/biom.13769>). The software implements balanced and unbalanced optimal transport and omnibus tests with 'C++' across a set of tumor samples and allows for multi-threading to decrease computational runtime.
Maintained by Paul Little. Last updated 10 days ago.
2.0 match 2.00 scoremmaechler
VLMC:Variable Length Markov Chains ('VLMC') Models
Functions, Classes & Methods for estimation, prediction, and simulation (bootstrap) of Variable Length Markov Chain ('VLMC') Models.
Maintained by Martin Maechler. Last updated 7 months ago.
2.0 match 1.92 score 28 scriptsblansche
fdm2id:Data Mining and R Programming for Beginners
Contains functions to simplify the use of data mining methods (classification, regression, clustering, etc.), for students and beginners in R programming. Various R packages are used and wrappers are built around the main functions, to standardize the use of data mining methods (input/output): it brings a certain loss of flexibility, but also a gain of simplicity. The package name came from the French "Fouille de Données en Master 2 Informatique Décisionnelle".
Maintained by Alexandre Blansché. Last updated 2 years ago.
2.3 match 1 stars 1.62 score 42 scriptscran
RJSplot:Interactive Graphs with R
Creates interactive graphs with 'R'. It joins the data analysis power of R and the visualization libraries of JavaScript in one package.
Maintained by Carlos Prieto. Last updated 3 years ago.
2.3 match 4 stars 1.60 scoresynergisticcauselearning
CoOL:Causes of Outcome Learning
Implementing the computational phase of the Causes of Outcome Learning approach as described in Rieckmann, Dworzynski, Arras, Lapuschkin, Samek, Arah, Rod, Ekstrom. 2022. Causes of outcome learning: A causal inference-inspired machine learning approach to disentangling common combinations of potential causes of a health outcome. International Journal of Epidemiology <doi:10.1093/ije/dyac078>. The optional 'ggtree' package can be obtained through Bioconductor.
Maintained by Andreas Rieckmann. Last updated 3 years ago.
2.0 match 1.70 score 6 scriptsskranz
distRforest:Distribution-based Random Forest
Extension of the rpart package with added loss functions and random forest functionality.
Maintained by Roel Henckaerts. Last updated 5 years ago.
1.9 match 1.78 score 12 scriptsthezetner
Plasmidprofiler:Visualization of Plasmid Profile Results
Contains functions developed to combine the results of querying a plasmid database using short-read sequence typing with the results of a blast analysis against the query results.
Maintained by Adrian Zetner. Last updated 8 years ago.
1.8 match 1.43 score 27 scriptstabea17
graphclust:Hierarchical Graph Clustering for a Collection of Networks
Graph clustering using an agglomerative algorithm to maximize the integrated classification likelihood criterion and a mixture of stochastic block models. The method is described in the article "Model-based clustering of multiple networks with a hierarchical algorithm" by T. Rebafka (2022) <arXiv:2211.02314>.
Maintained by Tabea Rebafka. Last updated 2 years ago.
1.7 match 1.43 score 27 scriptsplangfelder
moduleColor:Basic Module Functions
Methods for color labeling, calculation of eigengenes, merging of closely related modules.
Maintained by Peter Langfelder. Last updated 3 years ago.
1.9 match 1.28 score 19 scriptsleondap
recluster:Ordination Methods for the Analysis of Beta-Diversity Indices
The analysis of different aspects of biodiversity requires specific algorithms. For example, in regionalisation analyses, the high frequency of ties and zero values in dissimilarity matrices produced by Beta-diversity turnover produces hierarchical cluster dendrograms whose topology and bootstrap supports are affected by the order of rows in the original matrix. Moreover, visualisation of biogeographical regionalisation can be facilitated by a combination of hierarchical clustering and multi-dimensional scaling. The recluster package provides robust techniques to visualise and analyse patterns of biodiversity and to improve occurrence data for cryptic taxa.
Maintained by Leonardo Dapporto. Last updated 4 months ago.
0.5 match 4 stars 4.69 score 41 scriptsroger0268
octopucs:Statistical Support for Hierarchical Clusters
Generates n hierarchical clustering hypotheses on subsets of classifiers (usually species in community ecology studies). The n clustering hypotheses are combined to generate a generalized cluster, and three metrics of support are computed: 1) the integrity, i.e., the average proportion of elements making up the group in each of the n clusters; 2) the contamination, i.e., the average proportion of elements from other groups that enter a focal group; and 3) the probability of existence of the group given the integrity and contamination, in a Bayesian approach.
Maintained by Roger Guevara. Last updated 7 months ago.
1.8 match 1.30 scorebioc
omicplotR:Visual Exploration of Omic Datasets Using a Shiny App
A Shiny app for visual exploration of omic datasets as compositions, and differential abundance analysis using ALDEx2. Useful for exploring RNA-seq, meta-RNA-seq and 16S rRNA gene sequencing data with visualizations such as principal component analysis biplots (coloured using metadata for visualizing each variable), dendrograms and stacked bar plots, and effect plots (ALDEx2). Input is a table of counts and a metadata file (if metadata exists), with options to filter data by count or by metadata to remove low counts, or to visualize select samples according to selected metadata.
Maintained by Daniel Giguere. Last updated 5 months ago.
softwaredifferentialexpressiongeneexpressionguirnaseqdnaseqmetagenomicstranscriptomicsbayesianmicrobiomevisualizationsequencingimmunooncology
0.5 match 4.00 score 5 scriptsashipunov
shipunov:Miscellaneous Functions from Alexey Shipunov
A collection of functions for data manipulation, plotting and statistical computing, to use separately or with the book "Visual Statistics. Use R!": Shipunov (2020) <http://ashipunov.info/shipunov/software/r/r-en.htm>. Dr Alexey Shipunov died in December 2022. Most useful functions: Bclust(), Jclust() and BootA() which bootstrap hierarchical clustering; Recode() which does multiple recoding in a fast, simple and flexible way; Misclass() which outputs confusion matrix even if classes are not concerted; Overlap() which measures group separation on any projection; Biarrows() which converts any scatterplot into biplot; and Pleiad() which is fast and flexible correlogram.
Maintained by ORPHANED. Last updated 2 years ago.
1.9 match 1.00 score 9 scriptsjeffjetton
greenclust:Combine Categories Using Greenacre's Method
Implements a method of iteratively collapsing the rows of a contingency table, two at a time, by selecting the pair of categories whose combination yields a new table with the smallest loss of chi-squared, as described by Greenacre, M.J. (1988) <doi:10.1007/BF01901670>. The result is compatible with the class of object returned by the 'stats' package's hclust() function and can be used similarly (plotted as a dendrogram, cut, etc.). Additional functions are provided for automatic cutting and diagnostic plotting.
Maintained by Jeff Jetton. Last updated 1 year ago.
0.5 match 5 stars 3.40 score 8 scriptsrituroy
heatmap4:Simple Heatmap Function
A color image of a numerical matrix. A dendrogram can be added to the left side and to the top. This package takes the original heatmap function and reduces the argument complexity.
Maintained by Ritu Roy. Last updated 1 month ago.
0.5 match 3.00 score 4 scriptsalbyfs
mdendro:Extended Agglomerative Hierarchical Clustering
A comprehensive collection of linkage methods for agglomerative hierarchical clustering on a matrix of proximity data (distances or similarities), returning a multifurcated dendrogram or multidendrogram. Multidendrograms can group more than two clusters when ties in proximity data occur, and therefore they do not depend on the order of the input data. Descriptive measures to analyze the resulting dendrogram are additionally provided.
Maintained by Alberto Fernandez. Last updated 1 year ago.
0.8 match 2.00 score 6 scriptscran
MultivariateAnalysis:Pacote Para Analise Multivariada
Package with multivariate analysis methodologies for experiment evaluation. The package estimates dissimilarity measures, builds dendrograms, obtains MANOVA, principal components, canonical variables, etc.
Maintained by Alcinei Mistico Azevedo. Last updated 11 months ago.
0.5 match 2.95 scorebioc
clustComp:Clustering Comparison Package
clustComp is a package that implements several techniques for the comparison and visualisation of relationships between different clustering results, either flat versus flat or hierarchical versus flat. These relationships among clusters are displayed using a weighted bi-graph, in which the nodes represent the clusters and the edges connect pairs of nodes with non-empty intersection; the weight of each edge is the number of elements in that intersection and is displayed through the edge thickness. The best layout of the bi-graph is provided by the barycentre algorithm, which minimises the weighted number of crossings. In the case of comparing a hierarchical and a non-hierarchical clustering, the dendrogram is pruned at different heights, selected by exploring the tree by depth-first search, starting at the root. Whether a branch is split is decided by the value of a scoring function, which can be based either on the aesthetics of the bi-graph or on the mutual information between the hierarchical and the flat clusterings. A mapping between groups of clusters from each side is constructed with a greedy algorithm, and can be additionally visualised.
Maintained by Aurora Torrente. Last updated 5 months ago.
geneexpressionclusteringvisualization
0.5 match 2.60 score 1 scriptsdaniel-jg
paintmap:Plotting Paintmaps
Plots matrices of colours as grids of coloured squares - aka heatmaps, guaranteeing legible row and column names, without transformation of values, without re-ordering rows or columns, and without dendrograms.
Maintained by Daniel Greene. Last updated 9 years ago.
0.5 match 2.26 score 6 scripts 6 dependentsjoshageman
cmAnalysis:Process and Visualise Concept Mapping Data
Processing and visualizing concept mapping data. Concept maps are versatile tools used across disciplines to enhance understanding, teaching, brainstorming, and information organization. The analysis of concept mapping data involves the sequential use of cluster analysis (for sorting participants and statements), multidimensional scaling (for positioning statements in a conceptual space), and visualization techniques, including point cluster maps and dendrograms.
Maintained by Jos Hageman. Last updated 3 days ago.
0.5 match 1.00 score