Showing 3 of 3 results
gagolews
genieclust: Fast and Robust Hierarchical Clustering with Noise Points Detection
A retake on the Genie algorithm (Gagolewski, 2021 <DOI:10.1016/j.softx.2021.100722>), which is a robust hierarchical clustering method (Gagolewski, Bartoszuk, Cena, 2016 <DOI:10.1016/j.ins.2016.05.003>). It is now faster and more memory efficient; determining the whole cluster hierarchy for datasets of 10M points in low-dimensional Euclidean spaces or 100K points in high-dimensional ones takes only a minute or so. Allows clustering with respect to mutual reachability distances so that it can act as a noise point detector or a robustified version of 'HDBSCAN*' (that is able to detect a predefined number of clusters and hence it does not depend on the somewhat fragile 'eps' parameter). The package also features an implementation of inequality indices (e.g., Gini and Bonferroni), external cluster validity measures (e.g., the normalised clustering accuracy, the adjusted Rand index, the Fowlkes-Mallows index, and normalised mutual information), and internal cluster validity indices (e.g., the Calinski-Harabasz, Davies-Bouldin, Ball-Hall, Silhouette, and generalised Dunn indices). See also the 'Python' version of 'genieclust' available on 'PyPI', which supports sparse data, more metrics, and even larger datasets.
Maintained by Marek Gagolewski. Last updated 5 days ago.
cluster-analysis clustering clustering-algorithm data-analysis data-mining data-science genie hdbscan hierarchical-clustering hierarchical-clustering-algorithm machine-learning machine-learning-algorithms mlpack nmslib python python3 sparse cpp openmp
11.0 match 61 stars 7.29 score 13 scripts 5 dependents
virgesmith
humanleague: Synthetic Population Generator
Generates high-entropy integer synthetic populations from marginal and (optionally) seed data using quasirandom sampling, in arbitrary dimensionality (Smith, Lovelace and Birkin (2017) <doi:10.18564/jasss.3550>). The package also provides an implementation of the Iterative Proportional Fitting (IPF) algorithm (Zaloznik (2011) <doi:10.13140/2.1.2480.9923>).
Maintained by Andrew Smith. Last updated 5 months ago.
c-plus-plus-11 microsynthesis nodejs python3 quasirandom sampling-methods cpp
11.0 match 18 stars 4.99 score 12 scripts
jimduggan
pysd2r: API to 'Python' Library 'pysd'
Using the R package 'reticulate', this package creates an interface to the 'pysd' toolset. The package provides an R interface to a number of 'pysd' functions, and can read files in 'Vensim' 'mdl' format, and 'xmile' format. The resulting simulations are returned as a 'tibble', and from that the results can be processed using 'dplyr' and 'ggplot2'. The package has been tested using 'python3'.
Maintained by Jim Duggan. Last updated 7 years ago.
0.5 match 12 stars 5.32 score 35 scripts