The Complexity of Transformation-Based Join Enumeration.

Arjan Pellenkoft, César A. Galindo-Legaria, Martin L. Kersten: The Complexity of Transformation-Based Join Enumeration. VLDB 1997: 306-315
  author    = {Arjan Pellenkoft and
               C{\'e}sar A. Galindo-Legaria and
               Martin L. Kersten},
  editor    = {Matthias Jarke and
               Michael J. Carey and
               Klaus R. Dittrich and
               Frederick H. Lochovsky and
               Pericles Loucopoulos and
               Manfred A. Jeusfeld},
  title     = {The Complexity of Transformation-Based Join Enumeration},
  booktitle = {VLDB'97, Proceedings of 23rd International Conference on Very
               Large Data Bases, August 25-29, 1997, Athens, Greece},
  publisher = {Morgan Kaufmann},
  year      = {1997},
  isbn      = {1-55860-470-7},
  pages     = {306-315},
  ee        = {db/conf/vldb/PellenkoftLK97.html},
  crossref  = {DBLP:conf/vldb/97},
  bibsource = {DBLP,}


Query optimizers that explore a search space exhaustively using transformation rules usually apply all possible rules on each alternative, and stop when no new information is produced. A memoizing structure was proposed in [McK93] to improve the re-use of common subexpression, thus improving the efficiency of the search considerably. However, a question that remained open is, what is the complexity of the transformation-based enumeration process? In particular, with n the number of relations, does it achieve the O(3^n) lower bound established by [OL90]?

In this paper we examine the problem of duplicates, in transformation-based enumeration. In general, different sequences of transformation rules may end up deriving the same element, and the optimizer must detect and discard these duplicate elements generated by multiple paths. We show that the usual commutativity/associativity rules for joins generate O(4^n) duplicate operators. We then propose a scheme - within the generic transformation-based framework - to avoid the generation of duplicates, which does achieve the O(3^n) lower bound on join enumeration. Our experiments show an improvement of up to a factor of 5 in the optimization of a query with 8 tables, when duplicates are avoided rather than detected.

Copyright © 1997 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Matthias Jarke, Michael J. Carey, Klaus R. Dittrich, Frederick H. Lochovsky, Pericles Loucopoulos, Manfred A. Jeusfeld (Eds.): VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases, August 25-29, 1997, Athens, Greece. Morgan Kaufmann 1997, ISBN 1-55860-470-7
Contents BibTeX

Electronic Edition

From CS Dept., University Trier (Germany)


José A. Blakeley, William J. McKenna, Goetz Graefe: Experiences Building the Open OODB Query Optimizer. SIGMOD Conference 1993: 287-296 BibTeX
Surajit Chaudhuri, Kyuseok Shim: Optimizing Queries with Aggregate Views. EDBT 1996: 167-182 BibTeX
Goetz Graefe, David J. DeWitt: The EXODUS Optimizer Generator. SIGMOD Conference 1987: 160-172 BibTeX
César A. Galindo-Legaria, Arjan Pellenkoft, Martin L. Kersten: Uniformly-Distributed Random Generation of Join Orders. ICDT 1995: 280-293 BibTeX
César A. Galindo-Legaria, Arnon Rosenthal: Outerjoin Simplification and Reordering for Query Optimization. ACM Trans. Database Syst. 22(1): 43-73(1997) BibTeX
Goetz Graefe, William J. McKenna: The Volcano Optimizer Generator: Extensibility and Efficient Search. ICDE 1993: 209-218 BibTeX
Joseph M. Hellerstein, Jeffrey F. Naughton: Query Execution Techniques for Caching Expensive Methods. SIGMOD Conference 1996: 423-434 BibTeX
Yannis E. Ioannidis, Younkyung Cha Kang: Left-Deep vs. Bushy Trees: An Analysis of Strategy Spaces and its Implications for Query Optimization. SIGMOD Conference 1991: 168-177 BibTeX
Yannis E. Ioannidis, Eugene Wong: Query Optimization by Simulated Annealing. SIGMOD Conference 1987: 9-22 BibTeX
Younkyung Cha Kang: Randomized Algorithms for Query Optimization. Ph.D. thesis, Univ. of Wisconsin-Madison 1991
Rosana S. G. Lanzelotte, Patrick Valduriez, Mohamed Zaït: On the Effectiveness of Optimization Search Strategies for Parallel Execution Spaces. VLDB 1993: 493-504 BibTeX
William J. McKenna: Efficient Search in Extensible Query Optimization: The Volcano Optimizer Generator. Ph.D. thesis, University of Colorado-Boulder 1993
Kiyoshi Ono, Guy M. Lohman: Measuring the Complexity of Join Enumeration in Query Optimization. VLDB 1990: 314-325 BibTeX
Arun N. Swami, Anoop Gupta: Optimization of Large Join Queries. SIGMOD Conference 1988: 8-17 BibTeX
Praveen Seshadri, Joseph M. Hellerstein, Hamid Pirahesh, T. Y. Cliff Leung, Raghu Ramakrishnan, Divesh Srivastava, Peter J. Stuckey, S. Sudarshan: Cost-Based Optimization for Magic: Algebra and Implementation. SIGMOD Conference 1996: 435-446 BibTeX
Bennet Vance, David Maier: Rapid Bushy Join-order Optimization with Cartesian Products. SIGMOD Conference 1996: 35-46 BibTeX

Referenced by

  1. Florian Waas, César A. Galindo-Legaria: Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer. SIGMOD Conference 2000: 499-509
  2. Peter A. Boncz, Martin L. Kersten: MIL Primitives for Querying a Fragmented World. VLDB J. 8(2): 101-119(1999)
  3. Daniela Florescu, Alon Y. Levy, Ioana Manolescu, Dan Suciu: Query Optimization in the Presence of Limited Access Patterns. SIGMOD Conference 1999: 311-322
  4. Ramana Yerneni, Chen Li, Jeffrey D. Ullman, Hector Garcia-Molina: Optimizing Large Join Queries in Mediation Systems. ICDT 1999: 348-364
  5. Clara Nippl, Bernhard Mitschang: TOPAZ: a Cost-Based, Rule-Driven, Multi-Phase Parallelizer. VLDB 1998: 251-262
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (, Corrections:
DBLP: Copyright © by Michael Ley (, last change: Sat May 16 23:46:16 2009