Reusing Invariants: A New Strategy for Correlated Queries
Jun Rao (Columbia University)
Kenneth A. Ross (Columbia University)
Correlated queries are very common and important in decision support systems.
Traditional nested iteration evaluation methods for such queries can be very
time consuming. When they apply, query rewriting techniques have been shown to
be much more efficient. But query rewriting is not always possible. When query
rewriting does not apply, can we do something better than the traditional
nested iteration methods? In this paper, we propose a new invariant technique
to evaluate correlated queries efficiently. The basic idea is to recognize the
part of the subquery that is not related to the outer references and cache the
result of that part after its first execution. Later, we can reuse the result
and combine it with the result of the rest of the subquery that is changing
for each iteration. Our technique applies to arbitrary correlated subqueries.
This paper introduces algorithms to recognize the invariant part of a data
flow tree, and to restructure the evaluation plan to reuse the stored
intermediate result. We also propose an efficient method to teach an existing
join optimizer to understand the invariant feature and thus allow it to be
able to generate better join plans in the new context. Some other related
optimization techniques are also discussed. The proposed techniques were
implemented within three months on an existing real commercial database
system.
We also experimentally evaluate our proposed technique. Our evaluation
indicates that, when query rewriting is not possible, the invariant technique
is significantly better than the traditional nested iteration method. Even
when query rewriting applies, the invariant technique is sometimes better than
the query rewriting technique. Our conclusion is that the invariant technique
should be considered as one of the alternatives in evaluating correlated
queries since it fills the gap left by rewriting techniques.