R-universe search: som

cran

som:Self-Organizing Map

Self-Organizing Map (with application in gene clustering).

Maintained by Jun Yan. Last updated 6 months ago.

cpp

75.0 match 4.35 score 13 dependents

andreasdominik

som.nn:Topological k-NN Classifier Based on Self-Organising Maps

A topological version of k-NN: An abstract model is built as a 2-dimensional self-organising map. Samples of unknown class are predicted by mapping them onto the SOM and analysing class membership of neurons in the neighbourhood.

Maintained by Andreas Dominik. Last updated 12 months ago.

74.9 match 2.40 score 28 scripts

exaexa

EmbedSOM:Fast Embedding Guided by Self-Organizing Map

Provides a smooth mapping of multidimensional points into low-dimensional space defined by a self-organizing map. Designed to work with 'FlowSOM' and flow-cytometry use-cases. See Kratochvil et al. (2019) <doi:10.12688/f1000research.21642.1>.

Maintained by Mirek Kratochvil. Last updated 1 month ago.

cytometry som visualization cpp

17.4 match 26 stars 6.02 score 8 scripts

rwehrens

kohonen:Supervised and Unsupervised Self-Organising Maps

Functions to train self-organising maps (SOMs). Also interrogation of the maps and prediction using trained maps are supported. The name of the package refers to Teuvo Kohonen, the inventor of the SOM.

Maintained by Ron Wehrens. Last updated 2 years ago.

cpp openmp

11.4 match 9 stars 6.91 score 724 scripts 14 dependents

e-sensing

sits:Satellite Image Time Series Analysis for Earth Observation Data Cubes

An end-to-end toolkit for land use and land cover classification using big Earth observation data, based on machine learning methods applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>. Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, Brazil Data Cube, Copernicus Data Space Environment (CDSE), Digital Earth Africa, Digital Earth Australia, NASA HLS using the Spatio-temporal Asset Catalog (STAC) protocol (<https://stacspec.org/>) and the 'gdalcubes' R package developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>. Supports visualization methods for images and time series and smoothing filters for dealing with noisy time series. Includes functions for quality assessment of training samples using self-organized maps as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. Includes methods to reduce training samples imbalance proposed by Chawla et al (2002) <doi:10.1613/jair.953>. Provides machine learning methods including support vector machines, random forests, extreme gradient boosting, multi-layer perceptrons, temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, and temporal attention encoders by Garnot and Landrieu (2020) <doi:10.48550/arXiv.2007.00586>. Supports GPU processing of deep learning models using torch <https://torch.mlverse.org/>. Performs efficient classification of big Earth observation data cubes and includes functions for post-classification smoothing based on Bayesian inference as described by Camara et al (2024) <doi:10.3390/rs16234572>, and methods for active learning and uncertainty assessment. Supports region-based time series analysis using package supercells <https://jakubnowosad.com/supercells/>. Enables best practices for estimating area and assessing accuracy of land change as recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>. 
Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.

Maintained by Gilberto Camara. Last updated 1 month ago.

big-earth-data cbers earth-observation eo-datacubes geospatial image-time-series land-cover-classification landsat planetary-computer r-spatial remote-sensing rspatial satellite-image-time-series satellite-imagery sentinel-2 stac-api stac-catalog cpp

7.3 match 494 stars 9.50 score 384 scripts

cran

aweSOM:Interactive Self-Organizing Maps

Self-organizing maps (also known as SOM, see Kohonen (2001) <doi:10.1007/978-3-642-56927-2>) are a method for dimensionality reduction and clustering of continuous data. This package introduces interactive (html) graphics for easier analysis of SOM results. It also features an interactive interface, for push-button training and visualization of SOM on numeric, categorical or mixed data, as well as tools to evaluate the quality of SOM.

Maintained by Julien Boelaert. Last updated 3 years ago.

21.2 match 3 stars 2.95 score 1 dependents

cran

SOMbrero:SOM Bound to Realize Euclidean and Relational Outputs

The stochastic (also called on-line) version of the Self-Organising Map (SOM) algorithm is provided. Different versions of the algorithm are implemented, for numeric and relational data and for contingency tables as described, respectively, in Kohonen (2001) <isbn:3-540-67921-9>, Olteanu & Villa-Vialaneix (2015) <doi:10.1016/j.neucom.2013.11.047> and Cottrell et al (2004) <doi:10.1016/j.neunet.2004.07.010>. The package also contains many plotting features (to help the user interpret the results), can handle (and impute) missing values and is delivered with a graphical user interface based on 'shiny'.

Maintained by Nathalie Vialaneix. Last updated 1 year ago.

12.4 match 1 stars 4.32 score 115 scripts 1 dependents

somenv

SOMEnv:SOM Algorithm for the Analysis of Multivariate Environmental Data

Analysis of multivariate environmental high-frequency data by Self-Organizing Map and k-means clustering algorithms. Its graphical user interface provides a convenient way to process rather large datasets (txt files up to 100 MB) obtained from high-frequency environmental monitoring by sensors/instruments with the self-organizing map algorithm. The functions in the package are based on the 'kohonen' and 'openair' packages and embed the heuristic rules of Vesanto et al. (2001) <http://www.cis.hut.fi/projects/somtoolbox/package/papers/techrep.pdf> for map initialization parameters, the k-means clustering algorithm and map feature visualization. Cluster profile visualization as well as graphs dedicated to the visualization of time-dependent variables, Licen et al. (2020) <doi:10.4209/aaqr.2019.08.0414>, are provided.

Maintained by Sabina Licen. Last updated 4 years ago.

19.4 match 1 stars 2.70 score

pik-piam

magpie4:MAgPIE outputs R package for MAgPIE version 4.x

Common output routines for extracting results from the MAgPIE framework (versions 4.x).

Maintained by Benjamin Leon Bodirsky. Last updated 2 days ago.

6.6 match 2 stars 7.87 score 254 scripts 9 dependents

cran

class:Functions for Classification

Various functions for classification, including k-nearest neighbour, Learning Vector Quantization and Self-Organizing Maps.

Maintained by Brian Ripley. Last updated 2 months ago.

5.3 match 1 stars 7.83 score 2.2k dependents

bioc

oposSOM:Comprehensive analysis of transcriptome data

This package translates microarray expression data into metadata of reduced dimension. It provides various sample-centered and group-centered visualizations, sample similarity analyses and functional enrichment analyses. The underlying SOM algorithm combines feature clustering, multidimensional scaling and dimension reduction, along with strong visualization capabilities. It enables extraction and description of functional expression modules inherent in the data.

Maintained by Henry Loeffler-Wirth. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment datarepresentation visualization cpp

8.6 match 4.48 score 7 scripts

cbergmeir

RSNNS:Neural Networks using the Stuttgart Neural Network Simulator (SNNS)

The Stuttgart Neural Network Simulator (SNNS) is a library containing many standard implementations of neural networks. This package wraps the SNNS functionality to make it available from within R. Using the 'RSNNS' low-level interface, all of the algorithmic functionality and flexibility of SNNS can be accessed. Furthermore, the package contains a convenient high-level interface, so that the most common neural network topologies and learning algorithms integrate seamlessly into R.

Maintained by Christoph Bergmeir. Last updated 1 year ago.

cpp

3.3 match 26 stars 8.90 score 426 scripts 9 dependents

imarkonis

somspace:Spatial Analysis with Self-Organizing Maps

Application of the Self-Organizing Maps technique for spatial classification of time series. The package uses spatial data, point or gridded, to create clusters with similar characteristics. The clusters can be further refined to a smaller number of regions by hierarchical clustering and their spatial dependencies can be presented as complex networks. Thus, meaningful maps can be created, representing the regional heterogeneity of a single variable. More information and an example of implementation can be found in Markonis and Strnad (2020, <doi:10.1177/0959683620913924>).

Maintained by Yannis Markonis. Last updated 2 years ago.

10.6 match 2.70 score 4 scripts

gianluca-pastorelli

somhca:Self-Organising Maps Coupled with Hierarchical Cluster Analysis

Implements self-organising maps combined with hierarchical cluster analysis (SOM-HCA) for clustering and visualization of high-dimensional data. The package includes functions to estimate the optimal map size based on various quality measures and to generate a model using the selected dimensions. It also performs hierarchical clustering on the map nodes to group similar units. Documentation about the SOM-HCA method is provided in Pastorelli et al. (2024) <doi:10.1002/xrs.3388>.

Maintained by Gianluca Pastorelli. Last updated 2 months ago.

8.2 match 3.18 score

bioc

FlowSOM:Using self-organizing maps for visualization and interpretation of cytometry data

FlowSOM offers visualization options for cytometry data, by using Self-Organizing Map clustering and Minimal Spanning Trees.

Maintained by Sofie Van Gassen. Last updated 5 months ago.

cellbiology flowcytometry clustering visualization software cellbasedassays

3.3 match 7.71 score 468 scripts 10 dependents

bioc

CATALYST:Cytometry dATa anALYSis Tools

CATALYST provides tools for preprocessing of and differential discovery in cytometry data such as FACS, CyTOF, and IMC. Preprocessing includes i) normalization using bead standards, ii) single-cell deconvolution, and iii) bead-based compensation. For differential discovery, the package provides a number of convenient functions for data processing (e.g., clustering, dimension reduction), as well as a suite of visualizations for exploratory data analysis and exploration of results from differential abundance (DA) and state (DS) analysis in order to identify differences in composition and expression profiles at the subpopulation-level, respectively.

Maintained by Helena L. Crowell. Last updated 4 months ago.

clustering dataimport differentialexpression experimentaldesign flowcytometry immunooncology massspectrometry normalization preprocessing singlecell software statisticalmethod visualization

1.9 match 67 stars 11.06 score 362 scripts 2 dependents

mottastefano

SOMMD:Self Organising Maps for the Analysis of Molecular Dynamics Data

Processes data from Molecular Dynamics simulations using Self Organising Maps. Features include the ability to read different input formats. Trajectories can be analysed to identify groups of important frames. Output visualisation can be generated for maps and pathways. Methodological details can be found in Motta S et al (2022) <doi:10.1021/acs.jctc.1c01163>. I/O functions for xtc format files were implemented using the 'xdrfile' library available under open source license. The relevant information can be found in inst/COPYRIGHT.

Maintained by Stefano Motta. Last updated 6 months ago.

9.8 match 1.70 score 4 scripts

bioc

cola:A Framework for Consensus Partitioning

Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement with known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionality to compare results straightforwardly. 4. It provides a new method to extract features which are more efficient at separating subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.

Maintained by Zuguang Gu. Last updated 1 month ago.

clustering geneexpression classification software consensus-clustering cpp

1.9 match 61 stars 7.49 score 112 scripts

cran

Mercator:Clustering and Visualizing Distance Matrices

Defines the classes used to explore, cluster and visualize distance matrices, especially those arising from binary data. See Abrams and colleagues, 2021, <doi:10.1093/bioinformatics/btab037>.

Maintained by Kevin R. Coombes. Last updated 5 months ago.

clustering

3.1 match 4.33 score 12 scripts 1 dependents

filzmoserp

chemometrics:Multivariate Statistical Analysis in Chemometrics

R companion to the book "Introduction to Multivariate Statistical Analysis in Chemometrics" written by K. Varmuza and P. Filzmoser (2009).

Maintained by Peter Filzmoser. Last updated 2 years ago.

2.0 match 4 stars 6.72 score 213 scripts 4 dependents

ambarish-chattopadhyay

FSM:Finite Selection Model

Randomized and balanced allocation of units to treatment groups using the Finite Selection Model (FSM). The FSM was originally proposed and developed at the RAND corporation by Carl Morris to enhance the experimental design for the now famous Health Insurance Experiment. See Morris (1979) <doi:10.1016/0304-4076(79)90053-8> for details on the original version of the FSM.

Maintained by Ambarish Chattopadhyay. Last updated 4 years ago.

5.2 match 2.00 score 3 scripts

oldlipe

ggsom:ggsom

Tool for visualization of SOM objects.

Maintained by Felipe Carvalho. Last updated 5 years ago.

ggplot-extension plot visualization

2.4 match 13 stars 3.81 score 5 scripts

agrocares

OBIC:Calculate the Open Bodem Index (OBI) Score

The Open Bodem Index (OBI) is a method to evaluate the quality of soils of agricultural fields in The Netherlands and the sustainability of the current agricultural practices. The OBI score is based on four main criteria: chemical, physical, biological and management, which consist of more than 21 indicators. By providing results of a soil analysis and management info the 'OBIC' package can be used to calculate the scores, indicators and derivatives that are used by the OBI. More information about the Open Bodem Index can be found at <https://openbodemindex.nl/>.

Maintained by Sven Verweij. Last updated 6 months ago.

agriculture soil

1.3 match 11 stars 6.82 score 20 scripts

blansche

fdm2id:Data Mining and R Programming for Beginners

Contains functions to simplify the use of data mining methods (classification, regression, clustering, etc.), for students and beginners in R programming. Various R packages are used and wrappers are built around the main functions, to standardize the use of data mining methods (input/output): it brings a certain loss of flexibility, but also a gain of simplicity. The package name came from the French "Fouille de Données en Master 2 Informatique Décisionnelle".

Maintained by Alexandre Blansché. Last updated 2 years ago.

5.2 match 1 stars 1.62 score 42 scripts

dicook

mulgar:Functions for Pre-Processing Data for Multivariate Data Visualisation using Tours

This is a companion to the book Cook, D. and Laa, U. (2023) <https://dicook.github.io/mulgar_book/> "Interactively exploring high-dimensional data and models in R". It contains useful functions for processing data in preparation for visualising with a tour. There are also several sample data sets.

Maintained by Dianne Cook. Last updated 2 months ago.

1.8 match 4 stars 4.50 score 79 scripts

bafuentes

rassta:Raster-Based Spatial Stratification Algorithms

Algorithms for the spatial stratification of landscapes, sampling and modeling of spatially-varying phenomena. These algorithms offer a simple framework for the stratification of geographic space based on raster layers representing landscape factors and/or factor scales. The stratification process follows a hierarchical approach, which is based on first-level units (i.e., classification units) and second-level units (i.e., stratification units). Nonparametric techniques allow measuring the correspondence between the geographic space and the landscape configuration represented by the units. These correspondence metrics are useful to define sampling schemes and to model the spatial variability of environmental phenomena. The theoretical background of the algorithms and code examples are presented in Fuentes, Dorantes, and Tipton (2021) <doi:10.31223/X50S57>.

Maintained by Bryan A. Fuentes. Last updated 3 years ago.

ecology geoinformatics hierarchical modeling sampling spatial

1.3 match 16 stars 5.96 score 19 scripts

lutzhamel

popsom7:A Fast, User-Friendly Implementation of Self-Organizing Maps (SOMs)

Methods for building self-organizing maps (SOMs) with a number of distinguishing features such as automatic centroid detection and cluster visualization using starbursts. For more details see the paper "Improved Interpretability of the Unified Distance Matrix with Connected Components" by Hamel and Brown (2011) in <ISBN:1-60132-168-6>. The package provides user-friendly access to two models we construct: (a) a SOM model and (b) a centroid based clustering model. The package also exposes a number of quality metrics for the quantitative evaluation of the map, Hamel (2016) <doi:10.1007/978-3-319-28518-4_4>. Finally, we reintroduced our fast, vectorized training algorithm for SOM with substantial improvements. It is about an order of magnitude faster than the canonical, stochastic C implementation <doi:10.1007/978-3-030-01057-7_60>.

Maintained by Lutz Hamel. Last updated 14 days ago.

fortran

5.7 match 1.30 score 2 scripts

rejebsara

missSOM:Self-Organizing Maps with Built-in Missing Data Imputation

Self-organizing maps with built-in missing data imputation. Missing values are imputed and regularly updated during the online Kohonen algorithm. The method can be used for data visualisation, clustering or imputation of missing data. It is an extension of the online algorithm of the 'kohonen' package. The method is described in the article "Self-Organizing Maps for Exploration of Partially Observed Data and Imputation of Missing Values" by S. Rejeb, C. Duveau, T. Rebafka (2022) <arXiv:2202.07963>.

Maintained by Sara Rejeb. Last updated 3 years ago.

cpp

7.3 match 1.00 score

cran

simqi:Simulate Quantities of Interest from Regression Models

This is an all-encompassing suite to facilitate the simulation of so-called quantities of interest by way of a multivariate normal distribution of the regression model's coefficients and variance-covariance matrix.

Maintained by Steve Miller. Last updated 27 days ago.

3.5 match 1.70 score 2 scripts

lark-max

DSAM:Data Splitting Algorithms for Model Developments

Provides six different algorithms that can be used to split the available data into training, test and validation subsets with similar distributions for hydrological model development. The dataSplit() function divides the data according to specific requirements, and the par.default() function can be used to set the parameters for data splitting. The getAUC() function measures the similarity of distribution features between the data subsets. For more information about the data splitting algorithms, please refer to: Chen et al. (2022) <doi:10.1016/j.jhydrol.2022.128340>, Zheng et al. (2022) <doi:10.1029/2021WR031818>.

Maintained by Junyi Chen. Last updated 2 months ago.

1.8 match 2 stars 3.00 score 2 scripts

cran

Numero:Statistical Framework to Define Subgroups in Complex Datasets

High-dimensional datasets that do not exhibit a clear intrinsic clustered structure pose a challenge to conventional clustering algorithms. For this reason, we developed an unsupervised framework that helps scientists to better subgroup their datasets based on visual cues; please see Gao S, Mutter S, Casey A, Makinen V-P (2019) Numero: a statistical framework to define multivariable subgroups in complex population-based datasets, Int J Epidemiology, 48:369-37, <doi:10.1093/ije/dyy113>. The framework includes the necessary functions to construct a self-organizing map of the data, to evaluate the statistical significance of the observed data patterns, and to visualize the results.

Maintained by Ville-Petteri Makinen. Last updated 6 months ago.

cpp

1.9 match 2.30 score 3 scripts

sarra25

multisom:Clustering a Data Set using Multi-SOM Algorithm

Implements two versions of the algorithm, namely stochastic and batch. The package also determines the best number of clusters and offers the user the best clustering scheme from different results.

Maintained by Sarra Chair. Last updated 8 years ago.

2.9 match 1.00 score 3 scripts

mthrun

ProjectionBasedClustering:Projection Based Clustering

A clustering approach applicable to every projection method is proposed here. The two-dimensional scatter plot of any projection method can construct a topographic map which displays unapparent data structures by using distance and density information of the data. The generalized U*-matrix renders this visualization in the form of a topographic map, which can be used to automatically define the clusters of high-dimensional data. The whole system is based on Thrun and Ultsch, "Using Projection based Clustering to Find Distance and Density based Clusters in High-Dimensional Data" <DOI:10.1007/s00357-020-09373-2>. Selecting the correct projection method will result in a visualization in which mountains surround each cluster. The number of clusters can be determined by counting valleys on the topographic map. Most projection methods are wrappers for already available methods in R. By contrast, the neighbor retrieval visualizer (NeRV) is based on C++ source code of the 'dredviz' software package, and the Curvilinear Component Analysis (CCA) is translated from 'MATLAB' ('SOM Toolbox' 2.0) to R.

Maintained by Michael Thrun. Last updated 3 years ago.

cpp

0.5 match 7 stars 5.33 score 34 scripts 3 dependents

cran

Umatrix:Visualization of Structures in High-Dimensional Data

By gaining the property of emergence through self-organization, the enhancement of SOMs (self-organizing maps) is called Emergent SOM (ESOM). The result of the projection by ESOM is a grid of neurons which can be visualised as a three-dimensional landscape in the form of the Umatrix. Further details can be found in the referenced publications (see url). This package offers tools for calculating and visualising the ESOM as well as Umatrix, Pmatrix and UStarMatrix. All the functionality is also available through graphical user interfaces implemented in 'shiny'. Based on the recognized data structures, the method can be used to generate new data.

Maintained by Jorn Lotsch. Last updated 7 months ago.

cpp

0.8 match 1 stars 2.16 score 12 scripts