Fast Incremental Maintenance of Approximate Histograms.

Phillip B. Gibbons, Yossi Matias, Viswanath Poosala: Fast Incremental Maintenance of Approximate Histograms. VLDB 1997: 466-475
  author    = {Phillip B. Gibbons and
               Yossi Matias and
               Viswanath Poosala},
  editor    = {Matthias Jarke and
               Michael J. Carey and
               Klaus R. Dittrich and
               Frederick H. Lochovsky and
               Pericles Loucopoulos and
               Manfred A. Jeusfeld},
  title     = {Fast Incremental Maintenance of Approximate Histograms},
  booktitle = {VLDB'97, Proceedings of 23rd International Conference on Very
               Large Data Bases, August 25-29, 1997, Athens, Greece},
  publisher = {Morgan Kaufmann},
  year      = {1997},
  isbn      = {1-55860-470-7},
  pages     = {466-475},
  ee        = {db/conf/vldb/GibbonsMP97.html},
  crossref  = {DBLP:conf/vldb/97},
  bibsource = {DBLP,}


Many commercial database systems maintain histograms to summarize the contents of large relations and permit efficient estimation of query result sizes for use in query optimizers. Delaying the propagation of database updates to the histogram often introduces errors in the estimation. This paper presents new sampling-based approaches for incremental maintenance of approximate histograms. By scheduling updates to the histogram based on the updates to the database, our techniques are the first to maintain histograms effectively up-to-date at all times and avoid computing overheads when unnecessary. Our techniques provide highly-accurate approximate histograms belonging to the equi- depth and Compressed classes. Experimental results show that our new approaches provide orders of magnitude more accurate estimation than previous approaches.

An important aspect employed by these new approaches is a backing sample, an up-to-date random sample of the tuples currently in a relation. We provide efficient solutions for maintaining a uniformly random sample of a relation in the presence of updates to the relation.

Copyright © 1997 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Matthias Jarke, Michael J. Carey, Klaus R. Dittrich, Frederick H. Lochovsky, Pericles Loucopoulos, Manfred A. Jeusfeld (Eds.): VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases, August 25-29, 1997, Athens, Greece. Morgan Kaufmann 1997, ISBN 1-55860-470-7
Contents BibTeX

Electronic Edition

From CS Dept., University Trier (Germany)


Stavros Christodoulakis: Implications of Certain Assumptions in Database Performance Evaluation. ACM Trans. Database Syst. 9(2): 163-186(1984) BibTeX
Yannis E. Ioannidis, Stavros Christodoulakis: On the Propagation of Errors in the Size of Join Results. SIGMOD Conference 1991: 268-277 BibTeX
Robert Kooi: The Optimization of Queries in Relational Databases. Ph.D. thesis, Case Western Reserve University 1980
Viswanath Poosala: Histogram-Based Estimation Techniques in Database Systems. Ph.D. thesis, Univ. of Wisconsin-Madison 1997
Viswanath Poosala, Yannis E. Ioannidis: Estimation of Query-Result Distribution and its Application in Parallel-Join Load Balancing. VLDB 1996: 448-459 BibTeX
Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita: Improved Histograms for Selectivity Estimation of Range Predicates. SIGMOD Conference 1996: 294-305 BibTeX
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34 BibTeX
Jeffrey Scott Vitter: Random Sampling with a Reservoir. ACM Trans. Math. Softw. 11(1): 37-57(1985) BibTeX
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949

Referenced by

  1. Francesco Buccafurri, Filippo Furfaro, Domenico Saccà: Estimating Range Queries Using Aggregate Data with Integrity Constraints: A Probabilistic Approach. ICDT 2001: 390-404
  2. Yossi Matias, Jeffrey Scott Vitter, Min Wang: Dynamic Maintenance of Wavelet-Based Histograms. VLDB 2000: 101-110
  3. Dimitrios Gunopulos, George Kollios, Vassilis J. Tsotras, Carlotta Domeniconi: Approximating Multi-Dimensional Aggregate Range Queries over Real Attributes. SIGMOD Conference 2000: 463-474
  4. Viswanath Poosala, Venkatesh Ganti, Yannis E. Ioannidis: Approximate Query Answering using Histograms. IEEE Data Eng. Bull. 22(4): 5-14(1999)
  5. H. V. Jagadish, Nick Koudas, S. Muthukrishnan: Mining Deviants in a Time Series Database. VLDB 1999: 102-113
  6. Arnd Christian König, Gerhard Weikum: Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-size Estimation. VLDB 1999: 423-434
  7. Yannis E. Ioannidis, Viswanath Poosala: Histogram-Based Approximation of Set-Valued Query-Answers. VLDB 1999: 174-185
  8. Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala: Aqua: A Fast Decision Support Systems Using Approximate Query Answers. VLDB 1999: 754-757
  9. Viswanath Poosala, Venkatesh Ganti: Fast Approximate Answers to Aggregate Queries on a Data Cube. SSDBM 1999: 24-33
  10. Peter J. Haas: Techniques for Online Exploration of Large Object-Relational Datasets. SSDBM 1999: 4-12
  11. Gurmeet Singh Manku, Sridhar Rajagopalan, Bruce G. Lindsay: Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets. SIGMOD Conference 1999: 251-262
  12. Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy: Join Synopses for Approximate Query Answering. SIGMOD Conference 1999: 275-286
  13. Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy: The Aqua Approximate Query Answering System. SIGMOD Conference 1999: 574-576
  14. Ashraf Aboulnaga, Surajit Chaudhuri: Self-tuning Histograms: Building Histograms Without Looking at Data. SIGMOD Conference 1999: 181-192
  15. Noga Alon, Phillip B. Gibbons, Yossi Matias, Mario Szegedy: Tracking Join and Self-Join Sizes in Limited Storage. PODS 1999: 10-20
  16. Yossi Matias, Jeffrey Scott Vitter, Min Wang: Wavelet-Based Histograms for Selectivity Estimation. SIGMOD Conference 1998: 448-459
  17. Phillip B. Gibbons, Yossi Matias: New Sampling-Based Summary Statistics for Improving Approximate Query Answers. SIGMOD Conference 1998: 331-342
  18. Surajit Chaudhuri, Rajeev Motwani, Vivek R. Narasayya: Random Sampling for Histogram Construction: How much is enough? SIGMOD Conference 1998: 436-447
  19. Surajit Chaudhuri: An Overview of Query Optimization in Relational Systems. PODS 1998: 34-43
  20. Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik: The New Jersey Data Reduction Report. IEEE Data Eng. Bull. 20(4): 3-45(1997)
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (, Corrections:
DBLP: Copyright © by Michael Ley (, last change: Sat May 16 23:46:18 2009