Abstract
In this paper we address the problem of clustering interval data using a model-based approach. To this end, we use parametric models for interval-valued variables, with configurations of the variance-covariance matrix that take the nature of interval data directly into account. Results on both synthetic and empirical data show that the proposed approach is well founded. The method succeeds in finding parsimonious heteroscedastic models, which is a critical feature in many applications. Furthermore, the analysis of the different data sets makes clear the need to explicitly consider the intrinsic variability present in interval data.
Original language | English |
---|---|
Pages (from-to) | 293-313 |
Number of pages | 21 |
Journal | Intelligent Data Analysis |
Volume | 19 |
Issue number | 2 |
Publication status | Published - 2015 |
Keywords
- Clustering methods
- Finite mixture models
- Interval-valued variable
- Intrinsic variability
- Symbolic data