What can Hierarchies do for Data Warehouses?

H. V. Jagadish, Laks V. S. Lakshmanan, Divesh Srivastava: What can Hierarchies do for Data Warehouses? VLDB 1999: 530-541
  author    = {H. V. Jagadish and
               Laks V. S. Lakshmanan and
               Divesh Srivastava},
  editor    = {Malcolm P. Atkinson and
               Maria E. Orlowska and
               Patrick Valduriez and
               Stanley B. Zdonik and
               Michael L. Brodie},
  title     = {What can Hierarchies do for Data Warehouses?},
  booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
               Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
  publisher = {Morgan Kaufmann},
  year      = {1999},
  isbn      = {1-55860-615-7},
  pages     = {530-541},
  ee        = {db/conf/vldb/JagadishLS99.html},
  crossref  = {DBLP:conf/vldb/99},
  bibsource = {DBLP,}


Data in a warehouse typically has multiple dimensions of interest, such as location, time, and product. It is well-recognized that these dimensions have hierarchies defined on them, such as ``store-city-state-region'' for location. The standard way to model such data is with a star/snowflake schema. However, current approaches do not give a first-class status to dimensions. Consequently, a substantial class of interesting queries involving dimension hierarchies and their interaction with the fact tables are quite verbose to write, hard to read, and difficult to optimize.

We propose the SQL(H) model and a natural extension to the SQL query language, that gives a first-class status to dimensions, and we pin down its semantics. Our model permits structural and schematic heterogeneity in dimension hierarchies, situations often arising in practice that cannot be modeled satisfactorily using the star/snowflake approach. We show using examples that sophisticated queries involving dimension hierarchies and their interplay with aggregation can be expressed concisely in SQL(H). By comparison, expressing such queries in SQL would involve a union of numerous complex sequences of joins. Finally, we develop an efficient implementation strategy for computing SQL queries, based on an algorithm for hierarchical joins, and the use of dimension indexes.

Copyright © 1999 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Online Paper

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, Michael L. Brodie (Eds.): VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK. Morgan Kaufmann 1999, ISBN 1-55860-615-7
Contents BibTeX


Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi: On the Computation of Multidimensional Aggregates. VLDB 1996: 506-521 BibTeX
Elena Baralis, Stefano Paraboschi, Ernest Teniente: Materialized Views Selection in a Multidimensional Database. VLDB 1997: 156-165 BibTeX
Luca Cabibbo, Riccardo Torlone: Querying Multidimensional Databases. DBPL 1997: 319-335 BibTeX
Chee Yong Chan, Yannis E. Ioannidis: Bitmap Index Design and Evaluation. SIGMOD Conference 1998: 355-366 BibTeX
Damianos Chatziantoniou, Kenneth A. Ross: Querying Multiple Features of Groups in Relational Databases. VLDB 1996: 295-306 BibTeX
Surajit Chaudhuri, Umeshwar Dayal: An Overview of Data Warehousing and OLAP Technology. SIGMOD Record 26(1): 65-74(1997) BibTeX
Douglas Comer: The Ubiquitous B-Tree. ACM Comput. Surv. 11(2): 121-137(1979) BibTeX
Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE 1996: 152-159 BibTeX
Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216 BibTeX
Carlos A. Hurtado, Alberto O. Mendelzon, Alejandro A. Vaisman: Maintaining Data Cubes under Dimension Updates. ICDE 1999: 346-355 BibTeX
H. V. Jagadish, Laks V. S. Lakshmanan, Tova Milo, Divesh Srivastava, Dimitra Vista: Querying Network Directories. SIGMOD Conference 1999: 133-144 BibTeX
Ralph Kimball: The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. John Wiley 1996, ISBN 0-471-15337-0
Laks V. S. Lakshmanan, Fereidoon Sadri, Iyer N. Subramanian: SchemaSQL - A Language for Interoperability in Relational Multi-Database Systems. VLDB 1996: 239-250 BibTeX
Wolfgang Lehner: Modelling Large Scale OLAP Scenarios. EDBT 1998: 153-167 BibTeX
Witold Litwin: Linear Hashing: A New Tool for File and Table Addressing. VLDB 1980: 212-223 BibTeX
Patrick E. O'Neil: Model 204 Architecture and Performance. HPTS 1987: 40-59 BibTeX
Patrick E. O'Neil, Goetz Graefe: Multi-Table Joins Through Bitmapped Join Indices. SIGMOD Record 24(3): 8-11(1995) BibTeX
Patrick E. O'Neil, Dallan Quass: Improved Query Performance with Variant Indexes. SIGMOD Conference 1997: 38-49 BibTeX
Kenneth A. Ross, Divesh Srivastava: Fast Computation of Sparse Datacubes. VLDB 1997: 116-125 BibTeX
Kenneth A. Ross, Divesh Srivastava, Damianos Chatziantoniou: Complex Aggregation at Multiple Granularities. EDBT 1998: 263-277 BibTeX
Patrick Valduriez: Join Indices. ACM Trans. Database Syst. 12(2): 218-246(1987) BibTeX
Jennifer Widom: Research Problems in Data Warehousing. CIKM 1995: 25-30 BibTeX
Yihong Zhao, Prasad Deshpande, Jeffrey F. Naughton: An Array-Based Algorithm for Simultaneous Multidimensional Aggregates. SIGMOD Conference 1997: 159-170 BibTeX

Referenced by

  1. Carlos A. Hurtado, Alberto O. Mendelzon: Reasoning about Summarizability in Heterogeneous Multidimensional Schemas. ICDT 2001: 375-389
  2. Theodore Johnson, Laks V. S. Lakshmanan, Raymond T. Ng: The 3W Model and Algebra for Unified Data Mining. VLDB 2000: 21-32
  3. Nick Koudas, S. Muthukrishnan, Divesh Srivastava: Optimal Histograms for Hierarchical Range Queries. PODS 2000: 196-204
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (, Corrections:
DBLP: Copyright © by Michael Ley (, last change: Sat May 16 23:46:28 2009