Database Query Processing in Main Memory

Kenneth A. Ross
Department of Computer Science
Columbia University

Contact Information

1214 Amsterdam Avenue, Mail Code 0401
New York, NY 10027
Phone: (212) 939-7058
Fax: (212) 666-0140
Email: kar@cs.columbia.edu

WWW Pages

Kenneth Ross: http://www.cs.columbia.edu/~kar
The Columbia Fast Query Project: http://www.cs.columbia.edu/~kar/fastqueryproj.html

Project Award Information

Keywords

Query processing, main memory databases, cache performance

Project Summary

The goal of this research project is to develop new query evaluation and optimization techniques for processing relational queries in main-memory databases. Recent improvements in main memory size and cost suggest that large classes of applications may obtain the performance benefits of having the data in main memory. The project focuses on issues in computer architecture that influence main memory performance, including cache miss latency and branch misprediction penalties. Algorithms for database operations will be developed that are sensitive to these and other issues, and thus perform well on modern architectures. Broader questions such as how to design a comprehensive architecture-sensitive query processing framework will be studied. The algorithms and techniques resulting from this project could have application in commercial and experimental database systems, where they could improve the speed of query processing. Results from the project will be disseminated as research papers and as freely available prototype software.

Publications and Products

Goals, Objectives, and Targeted Activities

Our project focuses on applications and data sets for which in-memory performance, both CPU-related and memory-related, is a significant component of the query response time. Recent reports have suggested that even disk-based database systems are often CPU-bound or memory-bound for some workloads. We aim to make better use of the architectural features of commodity processors and memory in order to speed up query execution. Examples of architectural characteristics with significant impact on performance are cache miss latency and branch misprediction penalties. We aim to study these issues as they affect query performance, and to develop new algorithms for query processing. Earlier work on query processing in main memory is described in the report for project IIS-9812014.

A PODS 2002 paper shows how to perform conjunctive selections in main memory in a way that avoids branch misprediction penalties.
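To illustrate the general idea, here is a minimal sketch in C of one well-known way to evaluate a two-condition conjunctive selection without a data-dependent branch. It is not taken from the paper. The function names, the column layout (two int32_t arrays), and the predicate (a[i] > lo AND b[i] < hi) are assumptions made for illustration only. The first function uses the usual short-circuit && and an if statement. The second combines the conditions with bitwise & and advances the output position by the 0/1 result instead of branching.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: select row ids i with a[i] > lo AND b[i] < hi.
 * Both functions write the qualifying row ids to out and return how many
 * were written. out must have room for n entries. */

/* Conventional version: && and if compile to conditional branches whose
 * outcome depends on the data, so they can be mispredicted. */
size_t select_branching(const int32_t *a, const int32_t *b, size_t n,
                        int32_t lo, int32_t hi, size_t *out)
{
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > lo && b[i] < hi)
            out[j++] = i;
    }
    return j;
}

/* Branch-avoiding version: both comparisons are always evaluated and
 * combined with bitwise &. The row id is always stored, and the output
 * position advances only when the combined result is 1. */
size_t select_no_branch(const int32_t *a, const int32_t *b, size_t n,
                        int32_t lo, int32_t hi, size_t *out)
{
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        out[j] = i;  /* overwritten on the next pass if row i fails */
        j += (size_t)((a[i] > lo) & (b[i] < hi));
    }
    return j;
}
```

The second version does more work per row, since it always evaluates both conditions and always stores, but its cost does not depend on how predictable the conditions are. Which version is faster depends on the selectivity of the conditions, and a compiler is free to generate branches or conditional moves for either form, so any such choice should be checked by measurement.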

A SIGMOD 2002 paper describes implementation techniques for database operations using SIMD instruction sets available on most modern commodity processors.
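As a rough illustration of what a SIMD-based database operation can look like, the sketch below counts the values in an integer column that exceed a threshold, comparing four values per instruction. It is not taken from the paper. The function name count_greater_than and the choice of the SSE2 intrinsics from emmintrin.h are assumptions made for this example. On compilers or targets that do not define __SSE2__, the code falls back to the scalar loop alone.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical example: count the values in col[0..n) that are greater
 * than threshold. */
size_t count_greater_than(const int32_t *col, size_t n, int32_t threshold)
{
    size_t count = 0;
    size_t i = 0;
#if defined(__SSE2__)
    /* Number of set bits in each possible 4-bit comparison mask. */
    static const unsigned char bits_set[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };
    const __m128i limit = _mm_set1_epi32(threshold);
    for (; i + 4 <= n; i += 4) {
        /* Load four 32-bit values; the pointer need not be aligned. */
        __m128i v = _mm_loadu_si128((const __m128i *)(col + i));
        /* Each lane becomes all ones if v > limit, else all zeros. */
        __m128i gt = _mm_cmpgt_epi32(v, limit);
        /* Collect the sign bit of each lane into a 4-bit mask. */
        int mask = _mm_movemask_ps(_mm_castsi128_ps(gt));
        count += bits_set[mask];
    }
#endif
    /* Scalar loop for the remaining values (or for all values when
     * SSE2 is not available). */
    for (; i < n; i++)
        count += (size_t)(col[i] > threshold);
    return count;
}
```

Besides processing several values per instruction, the SIMD loop replaces four data-dependent comparisons with one comparison and a table lookup, so it contains no branch that depends on the column values.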

Project References

See "Publications and Products" above.

Area Background

Query optimization and evaluation algorithms allow a database system to choose and run faster execution plans, resulting in better response times for queries.

Area References

"Database Management Systems" by Raghu Ramakrishnan and Johannes Gehrke. McGraw-Hill.
"Principles of Database and Knowledge-Base Systems" by Jeffrey Ullman. Computer Science Press.

Potential Related Projects

I would be interested in hearing from others about the kinds of complex queries (on small or large datasets) that they have encountered in their work. The longer the query the better!