Database Query Processing in Main Memory

Award Number IIS-0120939


Principal Investigator

Kenneth A. Ross
Department of Computer Science
Columbia University

1214 Amsterdam Avenue, Mail Code 0401
New York, NY 10027
Phone: (212) 939-7058
Fax : (212) 666-0140
Email: kar@cs.columbia.edu

URL: http://www.cs.columbia.edu/~kar

Keywords

Query processing, main memory databases, cache performance

Project Summary

The goal of this research project is to develop new query evaluation and optimization techniques for processing relational queries in main-memory databases. Recent improvements in main memory size and cost suggest that large classes of applications may obtain the performance benefits of having the data in main memory. The project focuses on issues in computer architecture that influence main memory performance, including cache miss latency and branch misprediction penalties. Algorithms for database operations will be developed that are sensitive to these and other issues, and thus perform well on modern architectures. Broader questions such as how to design a comprehensive architecture-sensitive query processing framework will be studied. The algorithms and techniques resulting from this project could have application in commercial and experimental database systems, where they could improve the speed of query processing. Results from the project will be disseminated as research papers and as freely available prototype software.

Publications and Products

  • “Conjunctive Selection Conditions in Main Memory,” K. A. Ross, Proceedings of ACM PODS Conference, June 2002.
  • “Implementing Database Operations using SIMD Instructions,” J. Zhou and K. A. Ross, Proceedings of ACM SIGMOD Conference, June 2002.
  • “Buffering Accesses to Memory-Resident Index Structures.”  J. Zhou and K. A. Ross, Proceedings of the 2003 VLDB Conference, September 2003.
  • “A Multi-Resolution Block Storage Model for Database Design”  J. Zhou and K. A. Ross, Proceedings of the 2003 IDEAS Conference, July 2003.

Project Impact

As modern computer systems become more and more sensitive to memory and CPU performance, we expect database system designers to implement techniques like those we propose in order to improve performance.

Goals, Objectives and Targeted Activities

Our project focuses on applications and data sets for which in-memory CPU and memory-related performance is a significant component of the query response time.  Recent reports have suggested that even disk-based database systems are often CPU or memory-bound for some workloads.  We aim to better utilize architectural features of commodity processors and memory in order to speed up query execution.  Examples of architectural characteristics with significant impact on performance are:

  • memory latency that occurs as a result of cache misses.
  • branch misprediction.
  • the availability of SIMD instructions for parallel execution of an instruction on multiple arguments.

We aim to study these issues as they affect query performance.  New algorithms for query processing will be developed.

 

A PODS 2002 paper shows how to perform conjunctive selections in main memory in a way that avoids branch misprediction penalties.

 

A SIGMOD 2002 paper describes implementation techniques for database operations using SIMD instruction sets available on most modern commodity processors.

 

An IDEAS 2003 paper describes ways to lay out tables on disk to get both good I/O and CPU-cache performance for operations such as scans in which some but not all attributes are required.

 

A VLDB 2003 paper shows how to optimise tree-based indexes for cache performance under workloads consisting of bulk search operations, such as would be found in a stream processing application or within an index-nested-loops join.

Area Background

Query optimization and evaluation algorithms enable faster execution plans to be used, resulting in better response times for queries.

Area References

"Database Management Systems" by Raghu Ramakrishnan and Johannes Gehrke.  McGraw Hill.
"Principles of Database and Knowledge-Base Systems" by Jeffrey Ullman.  Computer Science Press.

Project Websites

Project URL
The project web page gives an overview of the kinds of research being done, and provides links to software and other related projects.