In practice, model-based hierarchical clustering with Gaussian models often gives good but suboptimal partitions. The EM algorithm can refine these partitions when it is started sufficiently close to the optimal value. The number of clusters can therefore be determined with good results by using the partitions produced by model-based hierarchical agglomeration as starting values for the EM algorithm with constant-shape Gaussian models, and then applying the BIC (Bayesian information criterion) to determine the number of clusters (Malsiner-Walli, Pauger & Wagner, 2018).
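For reference, the two quantities involved can be written as follows. The notation is assumed here for illustration and is not taken from the cited paper: K clusters, mixing weights \tau_k, means \mu_k, covariances \Sigma_k, p free parameters, and n observations. The BIC is written with the sign convention in which larger values are better.

```latex
\[
f(x \mid \theta) = \sum_{k=1}^{K} \tau_k \, \phi(x \mid \mu_k, \Sigma_k),
\qquad \tau_k \ge 0, \quad \sum_{k=1}^{K} \tau_k = 1
\]
\[
\mathrm{BIC} = 2 \log L(\hat{\theta}) - p \log n
\]
```

Here \phi is the Gaussian density and L(\hat{\theta}) is the likelihood at the parameters estimated by EM. The constraints placed on the \Sigma_k (for example, constant shape) are what define a parameterization.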
The approach above is the foundation of a more general model-based clustering strategy. First, choose a set of candidate parameterizations of the Gaussian model and a maximum number of clusters, M, which should be as small as possible. Next, run hierarchical agglomerative clustering for the unconstrained Gaussian model and obtain the corresponding classifications for up to M groups. Then, for each parameterization and each number of clusters from 2 to M, run the EM algorithm, starting from the classification produced by the hierarchical clustering (Malsiner-Walli et al., 2018).
The next step is to calculate the BIC for each parameterization, using the mixture likelihood with the optimal parameters from EM for 2 to M clusters. The result is a matrix of BIC values, one for each possible combination of parameterization and number of clusters. Finally, the BIC values are plotted for each model. A clear first local maximum is strong evidence for a model (number of clusters + parameterization).
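The steps above can be sketched in code. The following is a minimal illustration under several stated assumptions, not the cited authors' implementation: the data are one-dimensional and synthetic, a Ward-style merge of neighbouring clusters stands in for model-based hierarchical agglomeration, the two candidate parameterizations are equal variance ("E") and varying variance ("V"), a one-cluster model is included as an extra baseline, and the BIC is taken as 2 log L - p log n so that larger is better. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def ward_partitions(xs, max_k):
    """Merge neighbouring clusters of sorted 1-D data; return {k: clusters} for k <= max_k."""
    def cost(a, b):  # Ward increase in the within-cluster sum of squares
        gap = sum(a) / len(a) - sum(b) / len(b)
        return len(a) * len(b) / (len(a) + len(b)) * gap * gap

    clusters = [[x] for x in sorted(xs)]
    parts = {}
    while True:
        if len(clusters) <= max_k:
            parts[len(clusters)] = list(clusters)
        if len(clusters) == 1:
            return parts
        i = min(range(len(clusters) - 1),
                key=lambda j: cost(clusters[j], clusters[j + 1]))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]


def e_step(xs, w, mu, var):
    """Return the log-likelihood and the per-point component responsibilities."""
    loglik, resp = 0.0, []
    for x in xs:
        logp = [math.log(wj) - 0.5 * math.log(2 * math.pi * vj) - (x - mj) ** 2 / (2 * vj)
                for wj, mj, vj in zip(w, mu, var)]
        top = max(logp)  # log-sum-exp for numerical stability
        lse = top + math.log(sum(math.exp(lp - top) for lp in logp))
        loglik += lse
        resp.append([math.exp(lp - lse) for lp in logp])
    return loglik, resp


def em_bic(xs, init, equal_var, n_iter=200, tol=1e-8, floor=1e-6):
    """Run EM starting from the partition `init`; return BIC = 2 log L - p log n."""
    n, k = len(xs), len(init)
    nj = [float(len(c)) for c in init]
    mu = [sum(c) / len(c) for c in init]
    ss = [sum((x - m) ** 2 for x in c) for c, m in zip(init, mu)]
    prev = -math.inf
    for _ in range(n_iter):
        # Turn the current sufficient statistics into weights and variances
        w = [v / n for v in nj]
        if equal_var:
            var = [max(sum(ss) / n, floor)] * k
        else:
            var = [max(s / v, floor) for s, v in zip(ss, nj)]
        loglik, resp = e_step(xs, w, mu, var)
        if loglik - prev < tol:
            break
        prev = loglik
        # M-step: update the sufficient statistics from the responsibilities
        nj = [max(sum(r[j] for r in resp), 1e-12) for j in range(k)]
        mu = [sum(r[j] * x for r, x in zip(resp, xs)) / nj[j] for j in range(k)]
        ss = [sum(r[j] * (x - mu[j]) ** 2 for r, x in zip(resp, xs)) for j in range(k)]
    n_params = (k - 1) + k + (1 if equal_var else k)
    return 2 * loglik - n_params * math.log(n)


def bic_table(xs, max_k):
    """BIC for every (parameterization, number of clusters) combination."""
    parts = ward_partitions(xs, max_k)
    return {(name, k): em_bic(xs, parts[k], equal_var)
            for name, equal_var in (("E", True), ("V", False))
            for k in range(1, max_k + 1)}


def first_local_max(bics):
    """Index of the first value that is not exceeded by its right-hand neighbour."""
    for i in range(len(bics) - 1):
        if bics[i] >= bics[i + 1]:
            return i
    return len(bics) - 1


# Synthetic data (an assumption for illustration): two well-separated groups
quantiles = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
data = quantiles + [q + 10.0 for q in quantiles]

M = 4
table = bic_table(data, M)
chosen = {name: first_local_max([table[(name, k)] for k in range(1, M + 1)]) + 1
          for name in ("E", "V")}
for key in sorted(table):
    print(key, round(table[key], 1))
print("first local maximum per parameterization:", chosen)
```

The printed table plays the role of the BIC matrix, with one entry per combination of parameterization and number of clusters. In practice the values would be plotted and inspected rather than selected mechanically.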
Malsiner-Walli, G., Pauger, D., & Wagner, H. (2018). Effect fusion using model-based clustering. Statistical Modelling, 18(2), 175-196.
The number of clusters is not determined automatically in the following two cases:
- If the clustering is used to reduce the data to a specific size, the number of clusters must be known in advance.
- Hierarchical clustering also presents a problem, because subclusters are ignored.
Cluster analysis. (2018, November 11). Retrieved from https://en.wikipedia.org/wiki/Cluster_analysis