ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Computation of Partial Query Results Using An Adaptive Stratified Sampling Technique.

Augustine C. Ikeji, Farshad Fotouhi: Computation of Partial Query Results Using An Adaptive Stratified Sampling Technique. CIKM 1995: 145-149
@inproceedings{DBLP:conf/cikm/IkejiF95,
  author    = {Augustine C. Ikeji and
               Farshad Fotouhi},
  title     = {Computation of Partial Query Results Using An Adaptive Stratified
               Sampling Technique},
  booktitle = {CIKM '95, Proceedings of the 1995 International Conference on
               Information and Knowledge Management, November 28 - December
               2, 1995, Baltimore, Maryland, USA},
  publisher = {ACM},
  year      = {1995},
  pages     = {145-149},
  ee        = {db/conf/cikm/IkejiF95.html, http://doi.acm.org/10.1145/221270.221363},
  crossref  = {DBLP:conf/cikm/95},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

The proliferation of databases into many areas has resulted in the need for fast and efficient techniques for query processing. Large databases take too much time to process, and time constrained queries force the query processor to render a result in the available time. For applications such as these, partial query results (PQR) may be satisfactory. Sampling techniques have been used to estimate aggregate query results [3], [5], [6], [7], [13], however a direct application of the sampling algorithms to obtain partial results in non-aggregate queries is not optimal because the goals are different.

In this paper, we propose an algorithm for obtaining PQR for non-aggregate queries. Our algorithm is based on stratification of the target relation and sampling of the strata over several stages. The results obtained from the preceding stages are used to guide the choice of which strata to sample in the following stages. In other words, strata that possess more of the desired quality (rich) get sampled more at the beginning of the query processing while those that are poor may be sampled at the later stages if time permits. The experimental performance of our algorithm is presented and compared with other techniques for obtaining partial or entire query results. The results show that our algorithm out-performs the other algorithms by finding more tuples that belong to the result set in the early stages of the query processing especially when the query selectivity is high or the sample size is at least five percent of the population size.

Copyright © 1995 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 4, CIKM, DOLAP, GIS, SIGFIDET, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

CIKM '95, Proceedings of the 1995 International Conference on Information and Knowledge Management, November 28 - December 2, 1995, Baltimore, Maryland, USA. ACM 1995
Contents BibTeX

Online Edition

Citation Page BibTeX

References

[1]
Arie Shoshani, Harry K. T. Wong: Statistical and Scientific Database Issues. IEEE Trans. Software Eng. 11(10): 1040-1047(1985) BibTeX
[2]
Neil C. Rowe: Antisampling for Estimation: An Overview. IEEE Trans. Software Eng. 11(10): 1081-1091(1985) BibTeX
[3]
Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350 BibTeX
[4]
Y. C. Tay: On the Optimality of Strategies for Multiple Joins. J. ACM 40(5): 1067-1086(1993) BibTeX
[5]
Frank Olken, Doron Rotem: Simple Random Sampling from Relational Databases. VLDB 1986: 160-169 BibTeX
[6]
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 BibTeX
[7]
Stavros Christodoulakis: Estimating record selectivities. Inf. Syst. 8(2): 105-115(1983) BibTeX
[8]
...
[9]
...
[10]
...
[11]
...
[12]
...
[13]
Frank Olken, Doron Rotem: Random Sampling from B+ Trees. VLDB 1989: 269-277 BibTeX
[14]
Yibei Ling, Wei Sun: A Suppletment to Sampling-Based Methods for Query Size Estimation in a Database System. SIGMOD Record 21(4): 12-15(1992) BibTeX
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
CIKM 1995 Proceedings, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:01:48 2009