Database Query Processing in Main Memory
Kenneth A. Ross
Department of Computer Science
Columbia University
Contact Information
1214 Amsterdam Avenue, Mail Code 0401
New York, NY 10027
Phone: (212) 939-7058
Fax : (212) 666-0140
Email: kar@cs.columbia.edu
WWW PAGES
Kenneth Ross: http://www.cs.columbia.edu/~kar
The Columbia Fast Query Project: http://www.cs.columbia.edu/~kar/fastqueryproj.html
Project Award Information
-
Award Number: IIS-0120939
-
Duration: 10/1/01-9/30/04; 1st year.
-
Title: Database Query Processing in Main Memory
Keywords
Query processing, main memory databases, cache performance
Project Summary
The goal of this research project is to develop new query evaluation and
optimization techniques for processing relational queries in main-memory
databases. Recent improvements in main memory size and cost suggest that
large classes of applications may obtain the performance benefits of having
the data in main memory. The project focuses on issues in computer architecture
that influence main memory performance, including cache miss latency and
branch misprediction penalties. Algorithms for database operations will
be developed that are sensitive to these and other issues, and thus perform
well on modern architectures. Broader questions such as how to design a
comprehensive architecture-sensitive query processing framework will be
studied. The algorithms and techniques resulting from this project could
have application in commercial and experimental database systems, where
they could improve the speed of query processing. Results from the project
will be disseminated as research papers and as freely available prototype
software.
Publications and Products
-
"Conjunctive Selection Conditions in Main Memory," K. A.
Ross, Proceedings of ACM PODS Conference, June 2002.
-
"Implementing Database Operations using SIMD Instructions,"
J. Zhou and K. A. Ross, Proceedings of ACM SIGMOD Conference, June 2002.
Goals, Objectives, and Targeted Activities
Our project focuses on applications and data sets for which in-memory CPU
and memory-related performance is a significant component of the query
response time. Recent reports have suggested that even disk-based
database systems are often CPU or memory-bound for some workloads.
We aim to better utilize architectural features of commodity processors
and memory in order to speed up query execution. Examples of architectural
characteristics with significant impact on performance are:
-
memory latency that occurs as a result of cache misses.
-
branch misprediction.
-
the availability of SIMD instructions for parallel execution of an instruction
on multiple arguments.
We aim to study these issues as they affect query performance. New
algorithms for query processing will be developed. Earlier work in
query processing in main memory is described in the
report for project IIS-9812014.
A PODS 2002 paper shows how to perform conjunctive selections in main
memory in a way that avoids branch misprediction penalties.
A SIGMOD 2002 paper describes implementation techniques for database
operations using SIMD instruction sets available on most modern commodity
processors.
Project References
See "Publications and Products" above.
Area Background
Query optimization and evaluation algorithms enable faster execution
plans to be used, resulting in better response times for queries.
Area References
"Database Management Systems" by Raghu Ramakrishnan and Johannes
Gehrke. McGraw Hill.
"Principles of Database and Knowledge-Base Systems" by
Jeffrey Ullman. Computer Science Press.
Potential Related Projects
I would be interested in finding out from others about the kinds of complex
queries (on small or large datasets) that they have encountered during
their work. The longer the query the better!