Welcome to D
SIGMOD'00
 = SIGMOD'00 We
 = Plenary Talk
<<< = SIGMOD'00 Pa>>>
PODS'00
SIGMOD Recor
CIKM 2000/CI
COMAD 2000
Data Enginee
DL 2000
DPDJ
EDBT 2000
Hypertext 20
ICDE 2000
KDD 2000
KDD Explorat
KRDB 2000
SBBD 2000
SIGIR 2000
SIGIR Forum
SSDBM 2000
TODS
VLDB'00
VLDBJ

Approximating Multi-Dimensional Aggregate Range Queries over Real Attributes


Dimitrios Gunopulos, George Kollios, Vassilis J. Tsotras, and Carlotta Domeniconi

  View Paper (PDF)  

Return to Research Sessions


Abstract

Finding approximate answers to multi-dimensional range queries over real valued attributes has significant applications in data exploration and database query optimization. In this paper we consider the following problem: given a table of d attributes whose domain is the real numbers, and a query that specifies a range in each dimension, find a good approximation of the number of records in the table that satisfy the query.

We present a new histogram technique that is designed to approximate the density of multi-dimensional datasets with real attributes. Our technique finds buckets of variable size, and allows the buckets to overlap. Overlapping buckets allow more efficient approximation of the density. The size of the cells is based on the local density of the data. This technique leads to a faster and more compact approximation of the data distribution. We also show how to generalize kernel density estimators, and how to apply them on the multi-dimensional query approximation problem.

Finally, we compare the accuracy of the proposed techniques with existing techniques using real and synthetic datasets.


References


Note: References link to DBLP on the Web.

[1]
Ashraf Aboulnaga , Surajit Chaudhuri : Self-tuning Histograms: Building Histograms Without Looking at Data. SIGMOD Conference 1999 : 181-192
[2]
Swarup Acharya , Phillip B. Gibbons , Viswanath Poosala , Sridhar Ramaswamy : Join Synopses for Approximate Query Answering. SIGMOD Conference 1999 : 275-286
[3]
Swarup Acharya , Viswanath Poosala , Sridhar Ramaswamy : Selectivity Estimation in Spatial Databases. SIGMOD Conference 1999 : 13-24
[4]
Bjorn Blohsfeld , Dieter Korus , Bernhard Seeger : A Comparison of Selectivity Estimators for Range Queries on Metric Attributes. SIGMOD Conference 1999 : 239-250
[5]
Surajit Chaudhuri , Luis Gravano : Evaluating Top- k Selection Queries. VLDB 1999 : 397-410
[6]
Surajit Chaudhuri , Rajeev Motwani , Vivek R. Narasayya : Random Sampling for Histogram Construction: How much is enough? SIGMOD Conference 1998 : 436-447
[7]
...
[8]
...
[9]
Donko Donjerkovic , Raghu Ramakrishnan : Probabilistic Optimization of Top N Queries. VLDB 1999 : 411-422
[10]
Phillip B. Gibbons , Yossi Matias : New Sampling-Based Summary Statistics for Improving Approximate Query Answers. SIGMOD Conference 1998 : 331-342
[11]
Phillip B. Gibbons , Yossi Matias , Viswanath Poosala : Fast Incremental Maintenance of Approximate Histograms. VLDB 1997 : 466-475
[12]
...
[13]
Peter J. Haas , Arun N. Swami : Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992 : 341-350
[14]
Joseph M. Hellerstein , Peter J. Haas , Helen Wang : Online Aggregation. SIGMOD Conference 1997 : 171-182
[15]
...
[16]
Yannis E. Ioannidis , Viswanath Poosala : Histogram-Based Approximation of Set-Valued Query-Answers. VLDB 1999 : 174-185
[17]
H. V. Jagadish , Nick Koudas , S. Muthukrishnan , Viswanath Poosala , Kenneth C. Sevcik , Torsten Suel : Optimal Histograms with Quality Guarantees. VLDB 1998 : 275-286
[18]
Sanjeev Khanna , S. Muthukrishnan , Mike Paterson : On Approximating Rectangle Tiling and Packing. SODA 1998 : 384-393
[19]
Arnd Christian König , Gerhard Weikum : Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-size Estimation. VLDB 1999 : 423-434
[20]
Flip Korn , Theodore Johnson , H. V. Jagadish : Range Selectivity Estimation for Continuous Attributes. SSDBM 1999 : 244-253
[21]
Richard J. Lipton , Jeffrey F. Naughton , Donovan A. Schneider : Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990 : 1-11
[22]
Ju-Hong Lee , Deok-Hwan Kim , Chin-Wan Chung : Multi-dimensional Selectivity Estimation Using Compressed Histogram Information. SIGMOD Conference 1999 : 205-214
[23]
Yossi Matias , Jeffrey Scott Vitter , Min Wang : Wavelet-Based Histograms for Selectivity Estimation. SIGMOD Conference 1998 : 448-459
[24]
M. Muralikrishna , David J. DeWitt : Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 1988 : 28-36
[25]
Frank Olken , Doron Rotem : Random Sampling from Database Files: A Survey. SSDBM 1990 : 92-111
[26]
Viswanath Poosala , Venkatesh Ganti : Fast Approximate Answers to Aggregate Queries on a Data Cube. SSDBM 1999 : 24-33
[27]
Viswanath Poosala , Yannis E. Ioannidis : Selectivity Estimation Without the Attribute Value Independence Assumption. VLDB 1997 : 486-495
[28]
Viswanath Poosala , Yannis E. Ioannidis , Peter J. Haas , Eugene J. Shekita : Improved Histograms for Selectivity Estimation of Range Predicates. SIGMOD Conf. 1996 : 294-305
[29]
...
[30]
Patricia G. Selinger , Morton M. Astrahan , Donald D. Chamberlin , Raymond A. Lorie , Thomas G. Price : Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979 : 23-34
[31]
Jayavel Shanmugasundaram , Usama M. Fayyad , P. S. Bradley : Compressed Data Cubes for OLAP Aggregate Query Approximation on Continuous Dimensions. KDD 1999 : 223-232
[32]
...
[33]
Yannis Theodoridis , Timos K. Sellis : A Model for the Prediction of R-tree Performance. PODS 1996 : 161-171
[34]
Jeffrey Scott Vitter , Min Wang : Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets. SIGMOD Conference 1999 : 193-204
[35]
Jeffrey Scott Vitter , Min Wang , Balakrishna R. Iyer : Data Cube Approximation and Histograms via Wavelets. CIKM 1998 : 96-104
[36]
...
[37]
Roger Weber , Hans-Jörg Schek , Stephen Blott : A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces. VLDB 1998 : 194-205

BIBTEX


@inproceedings{DBLP:conf/sigmod/GunopulosKTD00,
  author    = {Dimitrios Gunopulos and
                George Kollios and
                Vassilis J. Tsotras and
                Carlotta Domeniconi},
   editor    = {Weidong Chen and
                Jeffrey F. Naughton and
                Philip A. Bernstein},
   title     = {Approximating Multi-Dimensional Aggregate Range Queries over
                Real Attributes},
   booktitle = {Proceedings of the 2000 ACM SIGMOD International Conference on
                Management of Data, May 16-18, 2000, Dallas, Texas, USA},
   journal   = {SIGMOD Record},
   publisher = {ACM},
   volume    = {29},
   number    = {2},
   year      = {2000},
   isbn      = {1-58113-218-2},
   pages     = {463-474},
   crossref  = {DBLP:conf/sigmod/2000},
   bibsource = {DBLP, http://dblp.uni-trier.de} } },




DiSC'01 Copyright ©2002 ACM Inc.