Random Sampling from Pseudo-Ranked B+ Trees.

Gennady Antoshenkov: Random Sampling from Pseudo-Ranked B+ Trees. VLDB 1992: 375-382
  author    = {Gennady Antoshenkov},
  editor    = {Li-Yan Yuan},
  title     = {Random Sampling from Pseudo-Ranked B+ Trees},
  booktitle = {18th International Conference on Very Large Data Bases, August
               23-27, 1992, Vancouver, Canada, Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1992},
  isbn      = {1-55860-151-1},
  pages     = {375-382},
  ee        = {db/conf/vldb/Antoshenkov92.html},
  crossref  = {DBLP:conf/vldb/92},
  bibsource = {DBLP,}


In the past, two basic approaches for sampling from B+ trees have been suggested: sampling from the ranked trees and acceptance/rejection sampling from non-ranked trees. The first approach requires the entire root-to-leaf path to be updated with each insertion and deletion. The second has no update overhead, but incurs a high rejection rate for the compressed-key B+ trees commonly used in practice. In this paper we introduce a new sampling method based on pseudo-ranked B+ trees, which are B+ trees supplemented with information loosely describing the estimated rank limits. This new method exhibits a very small rejection rate while paying only a marginal cost of the tree update overhead. We also present comparative efficiency measurements of different methods that were run on production databases and on several prototype workload simulations.

Copyright © 1992 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

Li-Yan Yuan (Ed.): 18th International Conference on Very Large Data Bases, August 23-27, 1992, Vancouver, Canada, Proceedings. Morgan Kaufmann 1992, ISBN 1-55860-151-1
Contents BibTeX


Michael J. Carey, David J. DeWitt, Joel E. Richardson, Eugene J. Shekita: Object and File Management in the EXODUS Extensible Database System. VLDB 1986: 91-100 BibTeX
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Statistical Estimators for Relational Algebra Expressions. PODS 1988: 276-287 BibTeX
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Processing Aggregate Relational Queries with Hard Time Constraints. SIGMOD Conference 1989: 68-77 BibTeX
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
Richard J. Lipton, Jeffrey F. Naughton: Estimating the Size of Generalized Transitive Closures. VLDB 1989: 165-171 BibTeX
Richard J. Lipton, Jeffrey F. Naughton: Query Size Estimation by Adaptive Sampling. PODS 1990: 40-46 BibTeX
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 BibTeX
M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 1988: 28-36 BibTeX
Frank Olken, Doron Rotem: Simple Random Sampling from Relational Databases. VLDB 1986: 160-169 BibTeX
Frank Olken, Doron Rotem: Random Sampling from B+ Trees. VLDB 1989: 269-277 BibTeX
Gregory Piatetsky-Shapiro, Charles Connell: Accurate Estimation of the Number of Tuples Satisfying a Condition. SIGMOD Conference 1984: 256-276 BibTeX
Jaideep Srivastava, Vincent Y. Lum: A Tree Based Access Method (TBSAM) for Fast Processing of Aggregate Queries. ICDE 1988: 504-510 BibTeX

Referenced by

  1. Swarup Acharya, Viswanath Poosala, Sridhar Ramaswamy: Selectivity Estimation in Spatial Databases. SIGMOD Conference 1999: 13-24
  2. Phillip B. Gibbons, Yossi Matias: New Sampling-Based Summary Statistics for Improving Approximate Query Answers. SIGMOD Conference 1998: 331-342
  3. Paul M. Aoki: Generalizing ``Search'' in Generalized Search Trees (Extended Abstract). ICDE 1998: 380-389
  4. Joseph M. Hellerstein: Online Processing Redux. IEEE Data Eng. Bull. 20(3): 20-29(1997)
  5. Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik: The New Jersey Data Reduction Report. IEEE Data Eng. Bull. 20(4): 3-45(1997)
  6. Gennady Antoshenkov, Mohamed Ziauddin: Query Processing and Optimization in Oracle Rdb. VLDB J. 5(4): 229-237(1996)
  7. Gennady Antoshenkov: Query Processing in DEC Rdb: Major Issues and Future Challenges. IEEE Data Eng. Bull. 16(4): 42-52(1993)
  8. Frank Olken, Doron Rotem: Sampling from Spatial Databases. ICDE 1993: 199-208
  9. Gennady Antoshenkov: Dynamic Query Optimization in Rdb/VMS. ICDE 1993: 538-547
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (, Corrections:
DBLP: Copyright © by Michael Ley (, last change: Sat May 16 23:45:53 2009