Using Schematically Heterogeneous Structures in Data Warehouses

Renee J. Miller
Computer Science Department
Ohio State University

Wednesday, April 1, 1998 
9:30-10:45am
Interschool Lab, 7th floor, Schapiro CEPSR Bldg.

Abstract

Schematic heterogeneity arises when information that is represented as data under one schema is represented within the schema itself (as metadata) under another. It is an important class of heterogeneity that arises frequently when integrating legacy data for data warehousing applications. Traditional query languages and view mechanisms are insufficient for reconciling and translating data between schematically heterogeneous schemas. Higher-order query languages, which permit quantification over schema labels, have been proposed to support querying and restructuring data across schematically disparate schemas. We extend this work by considering how these languages can be used in practice with minimal extensions to existing query processing engines. Specifically, we consider the problem of using higher-order views to answer queries in a heterogeneous environment. The talk will also give an overview of important applications of the proposed solutions beyond data integration. In particular, our solutions permit schema browsing and new forms of data independence that are important for global information systems. In addition, the solutions permit the integration of semi-structured and unstructured query operators (such as keyword searches) into structured query optimizers.
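
As a small illustration of the definition above (not taken from the talk), the following Python sketch stores the same facts under two schemas: in one, ticker symbols are data values; in the other, they are column labels. The stock data, the column names, and the helper functions are all hypothetical, and plain dictionaries stand in for relations. The point is only that a query or restructuring over the second schema has to iterate over schema labels, which is what a higher-order query language expresses directly.

```python
# Hypothetical stock data, invented for illustration only.
# Schema A: ticker symbols appear as data values in the "ticker" column.
rows_a = [
    {"date": "d1", "ticker": "ibm", "price": 100},
    {"date": "d1", "ticker": "msft", "price": 80},
    {"date": "d2", "ticker": "ibm", "price": 105},
    {"date": "d2", "ticker": "msft", "price": 90},
]


def data_to_metadata(rows):
    """Restructure schema A into schema B: one column per ticker symbol."""
    by_date = {}
    for r in rows:
        record = by_date.setdefault(r["date"], {"date": r["date"]})
        record[r["ticker"]] = r["price"]  # a data value becomes a column label
    return [by_date[d] for d in sorted(by_date)]


def metadata_to_data(rows):
    """Restructure schema B back into schema A by ranging over column labels."""
    out = []
    for r in rows:
        for label, value in r.items():
            if label != "date":
                out.append({"date": r["date"], "ticker": label, "price": value})
    return out


def tickers_above(rows_b, threshold):
    """Query over schema B: which column labels ever hold a value above threshold?"""
    return sorted(
        {
            label
            for r in rows_b
            for label, value in r.items()
            if label != "date" and value > threshold
        }
    )


rows_b = data_to_metadata(rows_a)
print(rows_b)
print(tickers_above(rows_b, 100))
```

In schema A the last query is an ordinary selection on the price column. In schema B it must range over the column names themselves, which a fixed first-order query written against a known set of columns cannot do once new tickers are added.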



Luis Gravano
gravano@cs.columbia.edu