CMD: A Multidimensional Declustering Method for Parallel Data Systems.

Jianzhong Li, Jaideep Srivastava, Doron Rotem: CMD: A Multidimensional Declustering Method for Parallel Data Systems. VLDB 1992: 3-14
I/O parallelism appears to be a promising approach to achieving high performance in parallel database systems. In such systems, it is essential to decluster database files into fragments and spread them across multiple disks so that the DBMS software can exploit the I/O bandwidth by reading and writing the disks in parallel. In this paper, we consider the problem of declustering multidimensional data on a parallel disk system. Since the multidimensional range query is the main workhorse for applications accessing such data, our aim is to provide efficient support for it. A new declustering method for parallel disk systems, called coordinate modulo distribution (CMD), is proposed. Our analysis shows that the method achieves optimum parallelism for a very high percentage of range queries on multidimensional data, if the distribution of data on each dimension is stationary. We have derived the exact conditions under which optimality is achieved. Also provided are the worst and average case bounds on multidimensional range query performance. Experimental results show that the method achieves near-optimum performance in almost all cases, even when the stationarity assumption does not hold. Details of the parallel algorithms for range query processing and data maintenance are also provided.
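
The abstract does not spell out the mapping itself. As a rough illustration only, the sketch below assumes the commonly cited formulation of coordinate modulo distribution, in which a grid cell is placed on the disk given by the sum of its cell coordinates modulo the number of disks. The function names (`cmd_disk`, `range_query_load`, `is_optimal`) and the inclusive cell-range representation of a query are hypothetical, chosen for this example and not taken from the paper. A query is treated here as optimally parallel when the busiest disk holds no more than the ceiling of (cells touched / number of disks).

```python
from collections import Counter
from itertools import product
from math import ceil


def cmd_disk(cell, num_disks):
    """Assumed CMD mapping: sum of the cell's grid coordinates, modulo the disk count."""
    return sum(cell) % num_disks


def range_query_load(lo, hi, num_disks):
    """Count how many cells of the inclusive cell range [lo, hi] fall on each disk."""
    load = Counter()
    per_dimension = (range(a, b + 1) for a, b in zip(lo, hi))
    for cell in product(*per_dimension):
        load[cmd_disk(cell, num_disks)] += 1
    return load


def is_optimal(lo, hi, num_disks):
    """True if the busiest disk reads no more than ceil(cells / disks) cells."""
    load = range_query_load(lo, hi, num_disks)
    total_cells = sum(load.values())
    return max(load.values()) == ceil(total_cells / num_disks)
```

Under this assumed mapping, a range query whose side length in some dimension is a multiple of the number of disks is spread evenly, for example `is_optimal((0, 0), (3, 3), 4)` holds, whereas a small query such as the 2 x 2 range from `(0, 0)` to `(1, 1)` on four disks places two cells on the same disk and so falls short of the optimum.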

Copyright © 1992 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

