Managing Change in Autonomous Databases

Sudarshan S. Chawathe
Computer Science Department
Stanford University

Monday, April 20, 1998
11am-12:15pm
Interschool Lab, 7th floor, Schapiro CEPSR Bldg.

Host: Luis Gravano

Abstract

We are witnessing a rapid growth in the number and size of heterogeneous collections of autonomous databases. Individual databases in such collections are owned and managed by independent, and often competing, entities that cooperate to only a limited extent. For example, the collection of databases used in the construction of a building includes databases owned by the architect, the construction company, the electrical contractor, and so on. Such autonomous database collections are also common on the Internet. (For example, the collection of Web databases with information about San Francisco consists of databases operated by several competing entities.) Making effective use of such collections of autonomous databases presents several challenges due to the absence of traditional database facilities such as locks, transactions, and standard query languages. In particular, understanding and controlling how such databases evolve is an important problem that traditional database techniques are ill-equipped to address.

In this presentation, I first motivate the need for managing change in autonomous databases and discuss the main challenges it presents. I then describe a method for detecting changes in autonomous databases by comparing snapshots of data. This method is based on novel algorithms for computing a minimum-cost edit script between two trees. I also briefly present a data model for storing changes in autonomous databases, and a query language over data and history stored in this model. A key feature of this model and language is that changes are represented and queried directly, rather than derived as the difference between two states. I conclude by describing the implementation of a change management system that incorporates these ideas.
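
To make the idea of snapshot comparison concrete, the following is a minimal sketch in Python of deriving an edit script from two tree-structured snapshots. It is not the algorithm presented in the talk: it assumes every node carries a label that is unique among its siblings, so nodes can be matched by path, whereas the minimum-cost edit script problem must also find the matching. The names Node, diff, and insert_subtree, and the sample building data, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A tree node; children are keyed by a label unique among siblings (an assumption)."""
    value: Optional[str] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


# One edit: (operation, path of the affected node, new value or None).
Edit = Tuple[str, str, Optional[str]]


def insert_subtree(node: Node, path: str) -> List[Edit]:
    """Emit one insert per node of a subtree, parents before children."""
    edits: List[Edit] = [("insert", path, node.value)]
    for key in sorted(node.children):
        edits.extend(insert_subtree(node.children[key], f"{path}/{key}"))
    return edits


def diff(old: Node, new: Node, path: str = "") -> List[Edit]:
    """Return edits that turn snapshot `old` into snapshot `new`."""
    edits: List[Edit] = []
    if old.value != new.value:
        edits.append(("update", path or "/", new.value))
    # Children present only in the old snapshot were deleted (whole subtree).
    for key in sorted(old.children.keys() - new.children.keys()):
        edits.append(("delete", f"{path}/{key}", None))
    # Children present only in the new snapshot were inserted.
    for key in sorted(new.children.keys() - old.children.keys()):
        edits.extend(insert_subtree(new.children[key], f"{path}/{key}"))
    # Children present in both snapshots are compared recursively.
    for key in sorted(old.children.keys() & new.children.keys()):
        edits.extend(diff(old.children[key], new.children[key], f"{path}/{key}"))
    return edits


# Hypothetical snapshots of a building-project database taken at two times.
old_snapshot = Node(children={
    "architect": Node(children={"floors": Node("6")}),
    "electrical": Node(children={"panels": Node("12")}),
})
new_snapshot = Node(children={
    "architect": Node(children={"floors": Node("7")}),
    "plumbing": Node(children={"risers": Node("4")}),
})

for edit in diff(old_snapshot, new_snapshot):
    print(edit)
```

Because matching here is by label, a subtree that is moved or renamed shows up as a delete plus inserts. Recognizing such cases as a single cheaper operation is the kind of problem a minimum-cost formulation has to address.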

This work is part of The C3 Project at Stanford. Further information, including a system overview and recent publications, is available at http://www-db.stanford.edu/c3/.



Luis Gravano
gravano@cs.columbia.edu