In mixture model-based clustering applications, it is common to fit several
models from a family and report clustering results from only the `best' one. The
best model is usually selected using a model selection criterion, most often the
Bayesian information criterion (BIC). Rather than
throw away all but the best model, we average multiple models that are in some
sense close to the best one, thereby producing a weighted average of clustering
results. Two (weighted) averaging approaches are considered: averaging the
component membership probabilities and averaging models. In both cases, Occam's
window is used to determine closeness to the best model and weights are
computed within a Bayesian model averaging paradigm. In some cases, we need to
merge components before averaging; we introduce a method for merging mixture
components based on the adjusted Rand index. The effectiveness of our
model averaging approaches for model-based clustering is illustrated using a
family of Gaussian mixture models on real and simulated data.
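
To make the weighting idea concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the sign convention in which a smaller value of the Bayesian information criterion is better, equal prior probabilities across models, the common approximation of posterior model probability by exp(-BIC/2), and a hypothetical Occam's window ratio of c = 20. The function names and the toy inputs are illustrative only. The membership averaging step further assumes that all retained models have the same number of components with matching labels, which is the situation that merging is meant to produce.

```python
import math


def occams_window_weights(bic, c=20.0):
    """Return normalized weights for the models inside Occam's window.

    bic: dict mapping a model name to its criterion value, assuming
         smaller is better (-2 log-likelihood plus penalty).
    c:   assumed window ratio; a model is kept if its approximate
         posterior probability is at least 1/c times that of the best.
    """
    best = min(bic.values())
    # Approximate posterior probability relative to the best model,
    # assuming equal priors: exp(-(BIC - BIC_best) / 2).
    relative = {m: math.exp(-0.5 * (b - best)) for m, b in bic.items()}
    kept = {m: r for m, r in relative.items() if r >= 1.0 / c}
    total = sum(kept.values())
    return {m: r / total for m, r in kept.items()}


def average_memberships(memberships, weights):
    """Weighted average of component membership probabilities.

    memberships: dict mapping a model name to an n-by-G list of rows,
                 one row of membership probabilities per observation.
                 All models are assumed to share G and the labelling.
    weights:     dict of normalized weights, e.g. from the function above.
    """
    models = list(weights)
    n = len(memberships[models[0]])
    g = len(memberships[models[0]][0])
    averaged = [[0.0] * g for _ in range(n)]
    for m in models:
        for i, row in enumerate(memberships[m]):
            for k, z in enumerate(row):
                averaged[i][k] += weights[m] * z
    return averaged


# Toy values, invented purely for illustration.
toy_bic = {"A": 100.0, "B": 102.0, "C": 120.0}
toy_weights = occams_window_weights(toy_bic)
```

In this toy example, model C falls outside the window and receives no weight, while A and B share the weight in proportion to their approximate posterior probabilities. Software that reports the criterion with the opposite sign (larger is better) would need the sign in the exponent flipped.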