A Relational Query Processing Approach to 
Compiling Sparse Matrix Codes

Vladimir Kotlyar
Department of Computer Science
Cornell University

Monday, March 9, 1998 
11am-12:15n
 Interschool Lab, 7th floor, Schapiro CEPSR Bldg.

Host: Ken Ross

Abstract

Sparse matrices are at the core of many important applications in computational science and information retrieval. These matrices are stored in complex compressed storage formats to reduce space and time requirements. This complexity makes program development tedious and error-prone. Furthermore, since a large variety of formats are used in practice to exploit properties of the application or machine architecture, it is nearly impossible to develop comprehensive libraries.

For my dissertation, I have implemented the Bernoulli compiler which automatically generates sparse matrix codes, given the corresponding dense codes and a specification of the data formats. The main problems are the restructuring of computations to produce efficient simultaneous enumeration over multiple data structures, and the development of a proper interface for the specification of sparse matrix formats.

The key insight to solving these problems is that we can view sparse and dense arrays as relations, and the execution of nested loops as the evaluation of certain queries over these relations. Storage formats are specified to the compiler through search and enumeration access methods. Code restructuring is then formulated as a search for the most efficient plan for the query. This software architecture not only allows for a clean design of the compiler, but it also exposes additional opportunities for code optimization and has led us to more general transformation algorithms than previously reported in the literature.

Experiments on the IBM SP-2 show that the performance of the code generated by the Bernoulli compiler for common numerical kernels is within 2-4% of the performance of hand-written libraries. The compiler is being used in a Grand Challenge project at Cornell on fracture mechanics simulation.



Luis Gravano
gravano@cs.columbia.edu