Approximation Clustering Models and Methods For Various Data Formats
Abstract
Clustering is a discipline devoted to finding and describing homogeneous
groups of data entities. In contrast to conventional clustering, which processes
data in terms of either entities or variables, approximation
clustering aims to process data matrices as they are. The principal
idea is to approximate a given data table by a "cleaned" model matrix
corresponding to a cluster structure. We consider three types of data tables
((dis)similarity, object-to-variable, and contingency tables) and
three types of cluster structures (single clusters, partitions, and hierarchies).
This places a considerable part of the existing clustering techniques
within a unified mathematical framework and yields advanced computational
procedures and interpretation aids.
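
As a rough illustration of the principal idea, the sketch below takes one of the nine combinations named in the abstract (an object-to-variable table and a partition) and builds a model matrix in which every row is replaced by the mean of its cluster. The least-squares residual used to compare two partitions is an assumption made for this example; the abstract does not state a fitting criterion. The function names and the toy data are hypothetical.

```python
def partition_model(data, labels):
    """Build the model matrix for a partition: each row of `data`
    is replaced by the centroid (column-wise mean) of its cluster."""
    n_cols = len(data[0])
    sums, counts = {}, {}
    for row, label in zip(data, labels):
        acc = sums.setdefault(label, [0.0] * n_cols)
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }
    return [list(centroids[label]) for label in labels]


def squared_residual(data, model):
    """Sum of squared differences between the data table and the model
    matrix (assumed least-squares criterion, not taken from the abstract)."""
    return sum(
        (x - m) ** 2
        for data_row, model_row in zip(data, model)
        for x, m in zip(data_row, model_row)
    )


# Hypothetical object-to-variable table: 4 entities, 2 variables.
data = [[1.0, 2.0], [1.2, 1.8], [5.0, 6.0], [5.2, 6.2]]

# Two candidate partitions of the four entities.
good_labels = [0, 0, 1, 1]
bad_labels = [0, 1, 0, 1]

good_residual = squared_residual(data, partition_model(data, good_labels))
bad_residual = squared_residual(data, partition_model(data, bad_labels))

# The partition whose model matrix lies closer to the data is preferred.
print(good_residual < bad_residual)
```

Under this assumed criterion, the partition that groups similar rows together leaves a much smaller residual, which is the sense in which the model matrix is a "cleaned" version of the data table.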
Luis Gravano
gravano@cs.columbia.edu