Exploratory tools in model-based clustering

Augustin Mayo-Iscar

Universidad de Valladolid, Spain


Choosing the number of groups and measuring the strength of group assignments are two frequent requirements in cluster analysis. The answers to these two questions clearly depend on the assumed formal "cluster structure". For instance, we should specify in advance which types of group scatter we allow. Moreover, a clear distinction should be made between what is understood by a "proper" group and what is understood by a (small) fraction of outlying data. Methods have recently been introduced that can handle different types of constraints on the group scatter matrices and deal with the presence of contaminating data. The TCLUST method, for instance, was proposed with these ideas in mind. This general framework for robust model-based clustering can be used to design exploratory tools that are very useful when trying to answer the two key questions stated above. Two graphical exploratory displays will be presented, together with some justification for their practical use. These graphical approaches rest on monitoring the optimal values of the target functions that define the robust clustering problem, together with the use of some Bayes factors.