clusterMI:Cluster Analysis with Missing Values by Multiple Imputation
Allows clustering of incomplete observations by addressing missing values using multiple imputation. For achieving this
goal, the methodology consists in three steps, following
Audigier and Niang 2022 <doi:10.1007/s11634-022-00519-1>. I)
Missing data imputation using dedicated models. Four multiple
imputation methods are proposed, two are based on joint
modelling and two are fully sequential methods, as discussed in
Audigier et al. (2021) <doi:10.48550/arXiv.2106.04424>. II)
cluster analysis of imputed data sets. Six clustering methods
are available (distances-based or model-based), but custom
methods can also be easily used. III) Partition pooling. The
set of partitions is aggregated using Non-negative Matrix
Factorization based method. An associated instability measure
is computed by bootstrap (see Fang, Y. and Wang, J., 2012
<doi:10.1016/j.csda.2011.09.003>). Among applications, this
instability measure can be used to choose a number of clusters
with missing values. The package also proposes several
diagnostic tools to tune the number of imputed data sets, to
tune the number of iterations in fully sequential imputation,
to check the fit of imputation models, etc.