Using Schematically Heterogeneous Structures in Data Warehouses
Wednesday, April 1, 1998
9:30-10:45am
Interschool Lab, 7th floor, Schapiro CEPSR Bldg.
Abstract
Schematic heterogeneity arises when information that is represented as
data under one schema is represented within the schema (as metadata) in
another. Schematic heterogeneity is an important class of heterogeneity
that arises frequently in integrating legacy data for data warehousing
applications. Traditional query languages and view mechanisms are insufficient
for reconciling and translating data between schematically heterogeneous
schemas. Higher-order query languages, which permit quantification over
schema labels, have been proposed to support querying and restructuring
of data between schematically disparate schemas. We extend this work by
considering how these languages can be used in practice with minimal extensions
to existing query processing engines. Specifically, we consider the problem
of using higher-order views to answer queries in a heterogeneous environment.
The talk will also give an overview of important applications of the proposed solutions
beyond data integration. Specifically, our solutions permit schema browsing
and new forms of data independence that are important for global information
systems. In addition, the solutions permit the integration of semi-structured
and unstructured query operators (such as keyword searches) into structured
query optimizers.
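
As a small illustration of the definition above, the sketch below shows the same information under two schemas: in one, stock ticker symbols are data values; in the other, they are column labels (metadata). The table contents, field names, and helper functions are hypothetical and are not taken from the talk. The point is only that translating between the two layouts requires iterating over schema labels, which a fixed first-order query cannot express without naming every label in advance.

```python
# Hypothetical data: ticker symbols appear as DATA values.
prices_long = [
    {"date": "1998-03-30", "ticker": "IBM", "price": 101.0},
    {"date": "1998-03-30", "ticker": "MSFT", "price": 89.5},
    {"date": "1998-03-31", "ticker": "IBM", "price": 102.25},
    {"date": "1998-03-31", "ticker": "MSFT", "price": 90.0},
]


def data_to_labels(rows, key, label_field, value_field):
    """Restructure so that values of `label_field` become column labels."""
    grouped = {}
    for row in rows:
        # One output row per distinct key value, created on first sight.
        target = grouped.setdefault(row[key], {key: row[key]})
        target[row[label_field]] = row[value_field]
    return list(grouped.values())


def labels_to_data(rows, key, label_field, value_field):
    """Restructure so that column labels (other than `key`) become data."""
    out = []
    for row in rows:
        # Quantify over the schema labels of each row.
        for label, value in row.items():
            if label != key:
                out.append({key: row[key], label_field: label, value_field: value})
    return out


# Same information, with ticker symbols now in the SCHEMA.
prices_wide = data_to_labels(prices_long, "date", "ticker", "price")
round_trip = labels_to_data(prices_wide, "date", "ticker", "price")
```

Here `prices_wide` holds one row per date with a column per ticker, and `round_trip` recovers the original layout. Adding a new ticker changes only the data in the first schema but changes the set of columns in the second, which is the situation the abstract describes.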
Luis Gravano
gravano@cs.columbia.edu