Showing 70 of 70 results
dicook
nullabor:Tools for Graphical Inference
Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.
Maintained by Di Cook. Last updated 2 months ago.
57 stars 10.38 score 370 scripts 2 dependents
egeulgen
pathfindR:Enrichment Analysis Utilizing Active Subnetworks
Enrichment analysis enables researchers to uncover mechanisms underlying a phenotype. However, conventional methods for enrichment analysis do not take into account protein-protein interaction information, resulting in incomplete conclusions. 'pathfindR' is a tool for enrichment analysis utilizing active subnetworks. The main function identifies active subnetworks in a protein-protein interaction network using a user-provided list of genes and associated p values. It then performs enrichment analyses on the identified subnetworks, identifying enriched terms (i.e. pathways or, more broadly, gene sets) that possibly underlie the phenotype of interest. 'pathfindR' also offers functionalities to cluster the enriched terms and identify representative terms in each cluster, to score the enriched terms per sample and to visualize analysis results. The enrichment, clustering and other methods implemented in 'pathfindR' are described in detail in Ulgen E, Ozisik O, Sezerman OU. 2019. 'pathfindR': An R Package for Comprehensive Identification of Enriched Pathways in Omics Data Through Active Subnetworks. Front. Genet. <doi:10.3389/fgene.2019.00858>.
Maintained by Ege Ulgen. Last updated 2 months ago.
active-subnetworks, enrichment, pathway, pathway-enrichment-analysis, subnetwork
187 stars 10.38 score 138 scripts
bioc
pRoloc:A unifying bioinformatics framework for spatial proteomics
The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.
Maintained by Lisa Breckels. Last updated 7 days ago.
immunooncology, proteomics, massspectrometry, classification, clustering, qualitycontrol, bioconductor, proteomics-data, spatial-proteomics, visualisation, openblas, cpp
15 stars 10.31 score 101 scripts 2 dependents
mhahsler
stream:Infrastructure for Data Stream Mining
A framework for data stream modeling and associated data mining tasks such as clustering and classification. The development of this package was supported in part by NSF IIS-0948893, NSF CMMI 1728612, and NIH R21HG005912. Hahsler et al (2017) <doi:10.18637/jss.v076.i14>.
Maintained by Michael Hahsler. Last updated 22 days ago.
data-stream-clustering, datastream, stream-mining, cpp
39 stars 10.05 score 132 scripts 3 dependents
immunomind
immunarch:Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires
A comprehensive framework for bioinformatics exploratory analysis of bulk and single-cell T-cell receptor and antibody repertoires. It provides seamless data loading, analysis and visualisation for AIRR (Adaptive Immune Receptor Repertoire) data, both bulk immunosequencing (RepSeq) and single-cell sequencing (scRNAseq). Immunarch implements most of the widely used AIRR analysis methods, such as: clonality analysis, estimation of repertoire similarities in distribution of clonotypes and gene segments, repertoire diversity analysis, annotation of clonotypes using external immune receptor databases and clonotype tracking in vaccination and cancer studies. A successor to our previously published 'tcR' immunoinformatics package (Nazarov 2015) <doi:10.1186/s12859-015-0613-1>.
Maintained by Vadim I. Nazarov. Last updated 1 year ago.
airr-analysis, b-cell-receptor, bcr, bcr-repertoire, bioinformatics, ig, ig-repertoire, immune-repertoire, immune-repertoire-analysis, immune-repertoire-data, immunoglobulin, immunoinformatics, immunology, rep-seq, repertoire-analysis, single-cell, single-cell-analysis, t-cell-receptor, tcr, tcr-repertoire, cpp
316 stars 9.49 score 203 scripts
bioc
scone:Single Cell Overview of Normalized Expression data
SCONE is an R package for comparing and ranking the performance of different normalization schemes for single-cell RNA-seq and other high-throughput analyses.
Maintained by Davide Risso. Last updated 1 month ago.
immunooncology, normalization, preprocessing, qualitycontrol, geneexpression, rnaseq, software, transcriptomics, sequencing, singlecell, coverage
53 stars 9.00 score 104 scripts
mlr-org
mlr3verse:Easily Install and Load the 'mlr3' Package Family
The 'mlr3' package family is a set of packages for machine-learning purposes built in a modular fashion. This wrapper package aims to simplify the installation and loading of the core 'mlr3' packages. Get more information about the 'mlr3' project at <https://mlr3book.mlr-org.com/>.
Maintained by Marc Becker. Last updated 3 months ago.
55 stars 8.32 score 720 scripts 1 dependent
mlr-org
mlr3cluster:Cluster Extension for 'mlr3'
Extends the 'mlr3' package with cluster analysis.
Maintained by Maximilian Mücke. Last updated 2 months ago.
cluster-analysis, clustering, mlr3
23 stars 8.31 score 50 scripts 2 dependents
bioc
MLInterfaces:Uniform interfaces to R machine learning procedures for data in Bioconductor containers
This package provides uniform interfaces to machine learning code for data in R and Bioconductor containers.
Maintained by Vincent Carey. Last updated 5 months ago.
7.63 score 79 scripts 6 dependents
mkossmeier
metaviz:Forest Plots, Funnel Plots, and Visual Funnel Plot Inference for Meta-Analysis
A compilation of functions to create visually appealing and information-rich plots of meta-analytic data using 'ggplot2'. It currently supports forest plots, funnel plots, and many of their variants, such as rainforest plots, thick forest plots, additional evidence contour funnel plots, and sunset funnel plots. In addition, functionalities for visual inference with the funnel plot in the context of meta-analysis are provided.
Maintained by Michael Kossmeier. Last updated 5 years ago.
17 stars 7.32 score 135 scripts
bioc
pRolocGUI:Interactive visualisation of spatial proteomics data
The package pRolocGUI comprises functions to interactively visualise spatial proteomics data on the basis of pRoloc, pRolocdata and shiny.
Maintained by Lisa Breckels. Last updated 5 months ago.
8 stars 6.90 score 3 scripts
songw01
MEGENA:Multiscale Clustering of Geometrical Network
Co-expression network analysis using a network embedding technique. Song W.-M., Zhang B. (2015) Multiscale Embedded Gene Co-expression Network Analysis. PLoS Comput Biol 11(11): e1004574. <doi:10.1371/journal.pcbi.1004574>.
Maintained by Won-Min Song. Last updated 1 year ago.
51 stars 6.84 score 45 scripts 1 dependent
bioc
mnem:Mixture Nested Effects Models
Mixture Nested Effects Models (mnem) is an extension of Nested Effects Models and allows for the analysis of single cell perturbation data provided by methods like Perturb-Seq (Dixit et al., 2016) or Crop-Seq (Datlinger et al., 2017). In those experiments each of many cells is perturbed by a knock-down of a specific gene, i.e. several cells are perturbed by a knock-down of gene A, several by a knock-down of gene B, and so forth. The observed read-out has to be multi-trait and, in the case of Perturb-/Crop-Seq, consists of gene expression profiles for each cell. mnem uses a mixture model to simultaneously cluster the cell population into k clusters and infer k networks causally linking the perturbed genes for each cluster. The mixture components are inferred via an expectation maximization algorithm.
Maintained by Martin Pirkl. Last updated 8 days ago.
pathways, systemsbiology, networkinference, network, rnaseq, pooledscreens, singlecell, crispr, atacseq, dnaseq, geneexpression, cpp
4 stars 6.81 score 15 scripts 4 dependents
bioc
Linnorm:Linear model and normality based normalization and transformation method (Linnorm)
Linnorm is an algorithm for normalizing and transforming RNA-seq, single cell RNA-seq, ChIP-seq count data or any large scale count data. It has been independently reviewed by Tian et al. in Nature Methods (https://doi.org/10.1038/s41592-019-0425-8). Linnorm can work with raw count, CPM, RPKM, FPKM and TPM.
Maintained by Shun Hang Yip. Last updated 5 months ago.
immunooncology, sequencing, chipseq, rnaseq, differentialexpression, geneexpression, genetics, normalization, software, transcription, batcheffect, peakdetection, clustering, network, singlecell, cpp
6.26 score 61 scripts 5 dependents
capnrefsmmat
regressinator:Simulate and Diagnose (Generalized) Linear Models
Simulate samples from populations with known covariate distributions, generate response variables according to common linear and generalized linear model families, draw from sampling distributions of regression estimates, and perform visual inference on diagnostics from model fits.
Maintained by Alex Reinhart. Last updated 6 months ago.
4 stars 6.08 score 25 scripts
zhenkewu
baker:"Nested Partially Latent Class Models"
Provides functions to specify, fit and visualize nested partially-latent class models (Wu, Deloria-Knoll, Hammitt, and Zeger (2016) <doi:10.1111/rssc.12101>; Wu, Deloria-Knoll, and Zeger (2017) <doi:10.1093/biostatistics/kxw037>; Wu and Chen (2021) <doi:10.1002/sim.8804>) for inference of population disease etiology and individual diagnosis. In the motivating Pneumonia Etiology Research for Child Health (PERCH) study, because both quantities of interest sum to one hundred percent, the PERCH scientists frequently refer to them as population etiology pie and individual etiology pie, hence the name of the package.
Maintained by Zhenke Wu. Last updated 11 months ago.
bayesian, case-control, latent-class-analysis, jags, cpp
8 stars 6.00 score 21 scripts
mhahsler
streamMOA:Interface for MOA Stream Clustering Algorithms
Interface for data stream clustering algorithms implemented in the MOA (Massive Online Analysis) framework (Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard Pfahringer (2010). MOA: Massive Online Analysis, Journal of Machine Learning Research 11: 1601-1604).
Maintained by Michael Hahsler. Last updated 7 months ago.
clustering, datamining, datastream, openjdk
13 stars 5.98 score 37 scripts
bioc
epiNEM:epiNEM
epiNEM is an extension of the original Nested Effects Models (NEM). EpiNEM is able to take into account double knockouts and infer more complex network signalling pathways. It is tailored towards large scale double knock-out screens.
Maintained by Martin Pirkl. Last updated 5 months ago.
pathways, systemsbiology, networkinference, network
1 star 5.83 score 1 script 3 dependents
bioc
bandle:An R package for the Bayesian analysis of differential subcellular localisation experiments
The Bandle package enables the analysis and visualisation of differential localisation experiments using mass-spectrometry data. Experimental methods supported include dynamic LOPIT-DC, hyperLOPIT, Dynamic Organellar Maps, Dynamic PCP. It provides Bioconductor infrastructure to analyse these data.
Maintained by Oliver M. Crook. Last updated 21 hours ago.
bayesian, classification, clustering, immunooncology, qualitycontrol, dataimport, proteomics, massspectrometry, openblas, cpp, openmp
4 stars 5.68 score 3 scripts
ustervbo
beadplexr:Analysis of Multiplex Cytometric Bead Assays
Reproducible and automated analysis of multiplex bead assays such as CBA (Morgan et al. 2004; <doi:10.1016/j.clim.2003.11.017>), LEGENDplex (Yu et al. 2015; <doi:10.1084/jem.20142318>), and MACSPlex (Miltenyi Biotec 2014; Application note: Data acquisition and analysis without the MACSQuant analyzer; <https://www.miltenyibiotec.com/upload/assets/IM0021608.PDF>). The package provides functions for streamlined reading of fcs files, and identification of bead clusters and analyte expression. The package eases the calculation of standard curves and the subsequent calculation of the analyte concentration.
Maintained by Ulrik Stervbo. Last updated 2 years ago.
5.07 score 39 scripts
martinloza
Canek:Batch Correction of Single Cell Transcriptome Data
Non-linear/linear hybrid method for batch-effect correction that uses Mutual Nearest Neighbors (MNNs) to identify similar cells between datasets. Reference: Loza M. et al. (NAR Genomics and Bioinformatics, 2020) <doi:10.1093/nargab/lqac022>.
Maintained by Martin Loza. Last updated 1 year ago.
batch-effects, bioinformatics, single-cell-rna-seq, transcriptomics
5 stars 5.06 score 23 scripts
acabassi
coca:Cluster-of-Clusters Analysis
Contains the R functions needed to perform Cluster-Of-Clusters Analysis (COCA) and Consensus Clustering (CC). For further details please see Cabassi and Kirk (2020) <doi:10.1093/bioinformatics/btaa593>.
Maintained by Alessandra Cabassi. Last updated 5 years ago.
cluster-analysis, cluster-of-clusters, clustering, coca, genomics, integrative-clustering, multi-omics
6 stars 5.03 score 12 scripts 1 dependent
bioc
evaluomeR:Evaluation of Bioinformatics Metrics
Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.
Maintained by José Antonio Bernabé-Díaz. Last updated 5 months ago.
clustering, classification, featureextraction, assessment, clustering-evaluation, evaluome, evaluomer, metrics
4.82 score 33 scripts
dgrun
RaceID:Identification of Cell Types, Inference of Lineage Trees, and Prediction of Noise Dynamics from Single-Cell RNA-Seq Data
Application of 'RaceID' allows inference of cell types and prediction of lineage trees by the 'StemID2' algorithm (Herman, J.S., Sagar, Grun D. (2018) <DOI:10.1038/nmeth.4662>). 'VarID2' is part of this package and allows quantification of biological gene expression noise at single-cell resolution (Rosales-Alvarez, R.E., Rettkowski, J., Herman, J.S., Dumbovic, G., Cabezas-Wallscheid, N., Grun, D. (2023) <DOI:10.1186/s13059-023-02974-1>).
Maintained by Dominic Grün. Last updated 4 months ago.
4.74 score 110 scripts
bioc
FuseSOM:A Correlation Based Multiview Self Organizing Maps Clustering For IMC Datasets
A correlation-based multiview self-organizing map for the characterization of cell types in highly multiplexed in situ imaging cytometry assays (`FuseSOM`) is a tool for unsupervised clustering. `FuseSOM` is robust and achieves high accuracy by combining a `Self Organizing Map` architecture and a `Multiview` integration of correlation based metrics. This allows FuseSOM to cluster highly multiplexed in situ imaging cytometry assays.
Maintained by Elijah Willie. Last updated 5 months ago.
singlecell, cellbasedassays, clustering, spatial
1 star 4.71 score 17 scripts
bioc
nempi:Inferring unobserved perturbations from gene expression data
Takes as input an incomplete perturbation profile and differential gene expression in log odds and infers unobserved perturbations and augments observed ones. The inference is done by iteratively inferring a network from the perturbations and inferring perturbations from the network. The network inference is done by Nested Effects Models.
Maintained by Martin Pirkl. Last updated 5 months ago.
software, geneexpression, differentialexpression, differentialmethylation, genesignaling, pathways, network, classification, neuralnetwork, networkinference, atacseq, dnaseq, rnaseq, pooledscreens, crispr, singlecell, systemsbiology
2 stars 4.60 score 2 scripts
bioc
bnem:Training of logical models from indirect measurements of perturbation experiments
bnem combines the use of indirect measurements of Nested Effects Models (package mnem) with the Boolean networks of CellNOptR. Perturbation experiments of signalling nodes in cells are analysed for their effect on the global gene expression profile. Those profiles give evidence for the Boolean regulation of down-stream nodes in the network, e.g., whether two parents activate their child independently (OR-gate) or jointly (AND-gate).
Maintained by Martin Pirkl. Last updated 5 months ago.
pathways, systemsbiology, networkinference, network, geneexpression, generegulation, preprocessing
2 stars 4.60 score 5 scripts
bioc
dce:Pathway Enrichment Based on Differential Causal Effects
Compute differential causal effects (dce) on (biological) networks. Given observational samples from a control experiment and non-control (e.g., cancer) for two genes A and B, we can compute differential causal effects with a (generalized) linear regression. If the causal effect of gene A on gene B in the control samples is different from the causal effect in the non-control samples the dce will differ from zero. We regularize the dce computation by the inclusion of prior network information from pathway databases such as KEGG.
Maintained by Kim Philipp Jablonski. Last updated 4 months ago.
software, statisticalmethod, graphandnetwork, regression, geneexpression, differentialexpression, networkenrichment, network, kegg, bioconductor, causality
13 stars 4.59 score 4 scripts
bioc
seqArchR:Identify Different Architectures of Sequence Elements
seqArchR enables unsupervised discovery of _de novo_ clusters with characteristic sequence architectures characterized by position-specific motifs or composition of stretches of nucleotides, e.g., CG-richness. seqArchR does _not_ require any specifications w.r.t. the number of clusters, the length of any individual motifs, or the distance between motifs if and when they occur in pairs/groups; it directly detects them from the data. seqArchR uses non-negative matrix factorization (NMF) as its backbone, and employs a chunking-based iterative procedure that enables processing of large sequence collections efficiently. Wrapper functions are provided for visualizing cluster architectures as sequence logos.
Maintained by Sarvesh Nikumbh. Last updated 5 months ago.
motifdiscovery, generegulation, mathematicalbiology, systemsbiology, transcriptomics, genetics, clustering, dimensionreduction, featureextraction, dnaseq, nmf, nonnegative-matrix-factorization, promoter-sequence-architectures, scikit-learn, sequence-analysis, sequence-architectures, unsupervised-machine-learning
1 star 4.48 score 9 scripts 1 dependent
bioc
MMUPHin:Meta-analysis Methods with Uniform Pipeline for Heterogeneity in Microbiome Studies
MMUPHin is an R package for meta-analysis tasks of microbiome cohorts. It has function interfaces for: a) covariate-controlled batch- and cohort effect adjustment, b) meta-analysis differential abundance testing, c) meta-analysis unsupervised discrete structure (clustering) discovery, and d) meta-analysis unsupervised continuous structure discovery.
Maintained by Siyuan MA. Last updated 5 months ago.
metagenomics, microbiome, batcheffect
4.44 score 46 scripts
acabassi
klic:Kernel Learning Integrative Clustering
Kernel Learning Integrative Clustering (KLIC) is an algorithm that combines multiple kernels, each representing a different measure of the similarity between a set of observations. The contribution of each kernel to the final clustering is weighted according to the amount of information carried by it. As well as providing the functions required to perform the kernel-based clustering, this package also allows the user to simply give the data as input: the kernels are then built using consensus clustering. Different strategies to choose the best number of clusters are also available. For further details please see Cabassi and Kirk (2020) <doi:10.1093/bioinformatics/btaa593>.
Maintained by Alessandra Cabassi. Last updated 5 years ago.
cluster-analysis, clustering, coca, genomics, integrative-clustering, kernel-methods, multi-omics
5 stars 4.40 score 10 scripts
ocbe-uio
DIscBIO:A User-Friendly Pipeline for Biomarker Discovery in Single-Cell Transcriptomics
An open, multi-algorithmic pipeline for easy, fast and efficient analysis of cellular sub-populations and the molecular signatures that characterize them. The pipeline consists of four successive steps: data pre-processing, cellular clustering with pseudo-temporal ordering, defining differentially expressed genes and biomarker identification. More details in Ghannoum et al. (2021) <doi:10.3390/ijms22031399>. This package implements extensions of the work published by Ghannoum et al. (2019) <doi:10.1101/700989>.
Maintained by Waldir Leoncio. Last updated 1 year ago.
biomarker-discovery, jupyter-notebook, scrna-seq, single-cell-analysis, transcriptomics, openjdk
12 stars 4.38 score 5 scripts
jafarilab
NIMAA:Nominal Data Mining Analysis
Functions for nominal data mining based on bipartite graphs, which build a pipeline for analysis and missing values imputation. Methods are mainly from the paper: Jafari, Mohieddin, et al. (2021) <doi:10.1101/2021.03.18.436040>, some new ones are also included.
Maintained by Mohieddin Jafari. Last updated 2 years ago.
4 stars 4.30 score 7 scripts
arliph
SPARTAAS:Statistical Pattern Recognition and daTing using Archaeological Artefacts assemblageS
Statistical pattern recognition and dating using archaeological artefacts assemblages. Package of statistical tools for archaeology. hclustcompro(perioclust): Bellanger Lise, Coulon Arthur, Husi Philippe (2021, ISBN:978-3-030-60103-4). mapclust: Bellanger Lise, Coulon Arthur, Husi Philippe (2021) <doi:10.1016/j.jas.2021.105431>. seriograph: Desachy Bruno (2004) <doi:10.3406/pica.2004.2396>. cerardat: Bellanger Lise, Husi Philippe (2012) <doi:10.1016/j.jas.2011.06.031>.
Maintained by Arthur Coulon. Last updated 11 months ago.
6 stars 4.14 score 46 scripts
mhahsler
streamConnect:Connecting Stream Mining Components Using Sockets and Web Services
Adds functionality to connect stream mining components from package stream using sockets and Web services. The package can be used to create distributed workflows and plumber-based Web services which can be deployed on most common cloud services.
Maintained by Michael Hahsler. Last updated 7 months ago.
4 stars 4.08 score 1 script
berndbischl
tspmeta:Instance Feature Calculation and Evolutionary Instance Generation for the Traveling Salesman Problem
Instance feature calculation and evolutionary instance generation for the traveling salesman problem. Also contains code to "morph" two TSP instances into each other and to conveniently run a couple of solvers on TSP instances.
Maintained by Bernd Bischl. Last updated 9 years ago.
5 stars 4.08 score 24 scripts
bioc
seqArchRplus:Downstream analyses of promoter sequence architectures and HTML report generation
seqArchRplus facilitates downstream analyses of promoter sequence architectures/clusters identified by seqArchR (or any other tool/method). With additional available information such as the TPM values and interquantile widths (IQWs) of the CAGE tag clusters, seqArchRplus can order the input promoter clusters by their shape (IQWs), and write the cluster information as browser/IGV track files. Provided visualizations are of two kinds: per sample/stage and per cluster visualizations. Those of the first kind include plot panels for each sample showing per cluster shape, TPM and other score distributions, sequence logos, and peak annotations. Those of the second kind include per cluster chromosome-wise and strand distributions, motif occurrence heatmaps and GO term enrichments. Additionally, seqArchRplus can generate HTML reports for easy viewing and comparison of promoter architectures between samples/stages.
Maintained by Sarvesh Nikumbh. Last updated 5 months ago.
annotation, visualization, reportwriting, go, motifannotation, clustering
1 star 4.00 score 2 scripts
bioc
adductomicsR:Processing of adductomic mass spectral datasets
Processes MS2 data to identify potentially adducted peptides from spectra that have been corrected for mass drift and retention time drift and quantifies MS1 level mass spectral peaks.
Maintained by Josie Hayes. Last updated 5 months ago.
massspectrometry, metabolomics, software, thirdpartyclient, dataimport, gui
1 star 4.00 score 5 scripts
bioc
OMICsPCA:An R package for quantitative integration and analysis of multiple omics assays from heterogeneous samples
OMICsPCA is an analysis pipeline designed to integrate multi-OMICs experiments done on various subjects (e.g. cell lines, individuals), treatments (e.g. disease/control) or time points and to analyse such integrated data from various angles and perspectives. At its core, OMICsPCA uses Principal Component Analysis (PCA) to integrate multiomics experiments from various sources and can thus overcome data insufficiency issues by using the integrated data as representatives. OMICsPCA can be used in various applications, including analysis of the overall distribution of OMICs assays across various samples/individuals/time points; grouping assays by user-defined conditions; and identification of sources of variation and of similarity/dissimilarity between assays, variables or individuals.
Maintained by Subhadeep Das. Last updated 5 months ago.
immunooncology, multiplecomparison, principalcomponent, datarepresentation, workflow, visualization, dimensionreduction, clustering, biologicalquestion, epigenetics, workflow, transcription, geneticvariability, gui, biomedicalinformatics, epigenetics, functionalgenomics, singlecell
4.00 score 1 script
mhahsler
rEMM:Extensible Markov Model for Modelling Temporal Relationships Between Clusters
Implements TRACDS (Temporal Relationships between Clusters for Data Streams), a generalization of Extensible Markov Model (EMM). TRACDS adds a temporal or order model to data stream clustering by superimposing a dynamically adapting Markov Chain. Also provides an implementation of EMM (TRACDS on top of tNN data stream clustering). Development of this package was supported in part by NSF IIS-0948893 and R21HG005912 from the National Human Genome Research Institute. Hahsler and Dunham (2010) <doi:10.18637/jss.v035.i05>.
Maintained by Michael Hahsler. Last updated 7 months ago.
clustering, data-stream, sequence-analysis
2 stars 3.79 score 31 scripts
gmcmacran
dann:Discriminant Adaptive Nearest Neighbor Classification
Discriminant Adaptive Nearest Neighbor Classification is a variation of k nearest neighbors where the shape of the neighborhood is data driven. This package implements dann and sub_dann from Hastie (1996) <https://web.stanford.edu/~hastie/Papers/dann_IEEE.pdf>.
Maintained by Greg McMahan. Last updated 8 months ago.
3.74 score 37 scripts
bioc
ClustAll:ClustAll: Data driven strategy to robustly identify stratification of patients within complex diseases
Data driven strategy to find hidden groups of patients with complex diseases using clinical data. ClustAll facilitates the unsupervised identification of multiple robust stratifications. ClustAll is able to overcome the most common limitations found when dealing with clinical data (missing values, correlated data, mixed data types).
Maintained by Asier Ortega-Legarreta. Last updated 5 months ago.
software, statisticalmethod, clustering, dimensionreduction, principalcomponent
3.70 score 1 script
natesmith07
truh:An R package for Two-Sample Nonparametric Testing Under Heterogeneity
This R package implements the TRUH test statistic for two sample testing under heterogeneity. TRUH incorporates the underlying heterogeneity and imbalance in the samples, and provides a conservative test for the composite null hypothesis that the two samples arise from the same mixture distribution but may differ with respect to the mixing weights. See Trambak Banerjee, Bhaswar B. Bhattacharya, Gourab Mukherjee, Ann. Appl. Stat. 14(4): 1777-1805 (December 2020) <DOI:10.1214/20-AOAS1362> for more details.
Maintained by Nathan Smith. Last updated 4 years ago.
3.70 score 6 scripts
bioc
omada:Machine learning tools for automated transcriptome clustering analysis
Symptomatic heterogeneity in complex diseases reveals differences in molecular states that need to be investigated. However, selecting the numerous parameters of an exploratory clustering analysis in RNA profiling studies requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent and further gene association analyses need to be performed independently. We have developed a suite of tools to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning based functions. The efficiency of each tool was tested with four datasets characterised by different expression signal strengths. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Even in datasets with less clear biological distinctions, stable subgroups with different expression profiles and clinical associations were found.
Maintained by Sokratis Kariotis. Last updated 5 months ago.
software, clustering, rnaseq, geneexpression
3.60 score 5 scripts
bioc
uSORT:uSORT: A self-refining ordering pipeline for gene selection
This package is designed to uncover the intrinsic cell progression path from single-cell RNA-seq data. It incorporates data pre-processing, preliminary PCA gene selection, preliminary cell ordering, feature selection, refined cell ordering, and post-analysis interpretation and visualization.
Maintained by Hao Chen. Last updated 5 months ago.
immunooncology, rnaseq, gui, cellbiology, dnaseq
3.30 score
auroreaa
ICSClust:Tandem Clustering with Invariant Coordinate Selection
Implementation of tandem clustering with invariant coordinate selection with different scatter matrices and several choices for the selection of components as described in Alfons, A., Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A. (2022) <arXiv:2212.06108>.
Maintained by Aurore Archimbaud. Last updated 2 years ago.
3.04 score 11 scripts
liukf10
DDPNA:Disease-Drived Differential Proteins Co-Expression Network Analysis
Functions designed to connect disease-related differential proteins and co-expression networks. It provides basic statistical analyses, including t-tests and ANOVA. Network construction is not offered by the package; you can use the 'WGCNA' package, described in Peter et al. (2008) <doi:10.1186/1471-2105-9-559>. It also provides module analysis, including PCA, two enrichment analyses, planar maximally filtered graph extraction and hub analysis.
Maintained by Kefu Liu. Last updated 4 years ago.
2 stars 3.00 score 4 scripts
bioc
SigCheck:Check a gene signature's prognostic performance against random signatures, known signatures, and permuted data/metadata
While gene signatures are frequently used to predict phenotypes (e.g. predict prognosis of cancer patients), it is not always clear how optimal or meaningful they are (cf. David Venet, Jacques E. Dumont, and Vincent Detours' paper "Most Random Gene Expression Signatures Are Significantly Associated with Breast Cancer Outcome"). Based on suggestions in that paper, SigCheck accepts a data set (as an ExpressionSet) and a gene signature, and compares its performance on survival and/or classification tasks against a) random gene signatures of the same length; b) known, related and unrelated gene signatures; and c) permuted data and/or metadata.
Maintained by Rory Stark. Last updated 2 months ago.
geneexpression, classification, genesetenrichment
3.00 score 1 script
vmoprojs
clusEvol:A Procedure for Cluster Evolution Analytics
Cluster Evolution Analytics allows us to use exploratory what if questions in the sense that the present information of an object is plugged-in a dataset in a previous time frame so that we can explore its evolution (and of its neighbors) to the present. See the URL for the papers associated with this package, as for instance, Morales-Oñate and Morales-Oñate (2024) <https://mpra.ub.uni-muenchen.de/120220>.
Maintained by Víctor Morales-Oñate. Last updated 8 months ago.
2.70 score 1 script
rmarko
semiArtificial:Generator of Semi-Artificial Data
Contains methods to generate and evaluate semi-artificial data sets. Based on a given data set, different methods learn data properties using machine learning algorithms and generate new data with the same properties. The package currently includes the following data generators: i) an RBF network based generator using rbfDDA() from package 'RSNNS', ii) a Random Forest based generator for both classification and regression problems, iii) a density forest based generator for unsupervised data. Data evaluation support tools include: a) single attribute based statistical evaluation: mean, median, standard deviation, skewness, kurtosis, medcouple, L/RMC, KS test, Hellinger distance, b) evaluation based on clustering using Adjusted Rand Index (ARI) and FM, c) evaluation based on classification performance with various learning models, e.g., random forests.
Maintained by Marko Robnik-Sikonja. Last updated 4 years ago.
1.86 score 24 scripts 1 dependent
cran
conjoint:An Implementation of Conjoint Analysis Method
This is a simple R package for measuring stated preferences using the traditional conjoint analysis method.
Maintained by Tomasz Bartlomowicz. Last updated 7 years ago.
1 star 1.50 score
reviewburner
AnimalSequences:Analyse Animal Sequential Behaviour and Communication
All animal behaviour occurs sequentially. The package has a number of functions to format sequence data from different sources, to analyse sequential behaviour and communication in animals. It also has functions to plot the data and to calculate the entropy of sequences.
Maintained by Alex Mielke. Last updated 6 months ago.
1.00 score
buybnb
CMMs:Compositional Mediation Model
A compositional mediation model for continuous and binary outcomes to deal with mediators that are compositional data. Lin, Ziqiang et al. (2022) <doi:10.1016/j.jad.2021.12.019>.
Maintained by Ziqiang Lin. Last updated 2 years ago.
1.00 score
cran
dGAselID:Genetic Algorithm with Incomplete Dominance for Feature Selection
Feature selection from high dimensional data using a diploid genetic algorithm with Incomplete Dominance for genotype to phenotype mapping and Random Assortment of chromosomes approach to recombination.
Maintained by Nicolae Teodor Melita. Last updated 8 years ago.
1 star 1.00 score
cran
FADPclust:Functional Data Clustering Using Adaptive Density Peak Detection
An implementation of a clustering algorithm for functional data based on an adaptive density peak detection technique, in which the density is estimated by functional k-nearest neighbor density estimation based on a proposed semi-metric between functions. The proposed functional data clustering algorithm is computationally fast since it does not need an iterative process. (Alex Rodriguez and Alessandro Laio (2014) <doi:10.1126/science.1242072>; Xiao-Feng Wang and Yifan Xu (2016) <doi:10.1177/0962280215609948>).
Maintained by Rui Ren. Last updated 2 years ago.
1 star 1.00 score