@inproceedings{DBLP:conf/vldb/MorimotoIM97,
  author    = {Yasuhiko Morimoto and Hiromu Ishii and Shinichi Morishita},
  editor    = {Matthias Jarke and Michael J. Carey and Klaus R. Dittrich and Frederick H. Lochovsky and Pericles Loucopoulos and Manfred A. Jeusfeld},
  title     = {Efficient Construction of Regression Trees with Range and Region Splitting},
  booktitle = {VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases, August 25-29, 1997, Athens, Greece},
  publisher = {Morgan Kaufmann},
  year      = {1997},
  isbn      = {1-55860-470-7},
  pages     = {166-175},
  ee        = {db/conf/vldb/MorimotoIM97.html},
  crossref  = {DBLP:conf/vldb/97},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

We propose an efficient way of constructing regression trees in order to predict the objective numeric attribute values of given tuples. A regression tree is a rooted binary tree such that each internal node contains a test, which can be expressed as an RDB query, for splitting tuples into two disjoint classes and passing the data in each class down to the left or right subtree. The predicted value of a tuple is the mean of the objective attribute values at the leaf that the tuple reaches.
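The tree structure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class names `Leaf` and `Node`, the function `predict`, and the attribute `age` are all hypothetical, and a test is modeled as a plain Python predicate rather than an RDB query.

```python
from dataclasses import dataclass
from typing import Callable, Union

Row = dict  # one tuple, keyed by attribute name (assumed representation)


@dataclass
class Leaf:
    # Mean of the objective attribute over the training tuples at this leaf.
    mean: float


@dataclass
class Node:
    # The test sends a tuple to the left subtree (True) or the right one (False).
    test: Callable[[Row], bool]
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def predict(tree: Tree, row: Row) -> float:
    """Follow the tests from the root down to a leaf and return its mean."""
    while isinstance(tree, Node):
        tree = tree.left if tree.test(row) else tree.right
    return tree.mean


# Hypothetical one-level tree: tuples with age < 30 go left.
tree = Node(test=lambda r: r["age"] < 30, left=Leaf(10.0), right=Leaf(20.0))
print(predict(tree, {"age": 25}))  # 10.0
```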

To test a numeric attribute, traditional approaches use a guillotine-cut splitting that classifies data into those below a given value and others. Instead, we consider a family R of grid-regions in the plane associated with two given numeric attributes. We propose to use a test that splits data into those that lie inside a region R and those that lie outside.
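The difference between the two kinds of test can be illustrated with a small sketch. It makes several assumptions not found in the abstract: a region is simplified to an axis-parallel rectangle, tuples are dictionaries, and the helper names (`guillotine_test`, `rectangle_test`, `split_mse`) and attribute names (`x`, `y`, `t`) are hypothetical.

```python
def guillotine_test(attr, threshold):
    """Traditional test: is one numeric attribute below a given value?"""
    return lambda row: row[attr] < threshold


def rectangle_test(attr_x, attr_y, x_lo, x_hi, y_lo, y_hi):
    """Region test, with an axis-parallel rectangle standing in for a region R."""
    return lambda row: (x_lo <= row[attr_x] <= x_hi) and (y_lo <= row[attr_y] <= y_hi)


def split_mse(rows, target, test):
    """Mean squared error when each side of the split predicts its own mean."""
    sse = 0.0
    for side in (True, False):
        values = [row[target] for row in rows if bool(test(row)) == side]
        if values:
            mean = sum(values) / len(values)
            sse += sum((v - mean) ** 2 for v in values)
    return sse / len(rows)


# Toy data: the objective attribute "t" is high only in the middle of the plane.
rows = [
    {"x": 1, "y": 1, "t": 0.0},
    {"x": 5, "y": 5, "t": 10.0},
    {"x": 5, "y": 6, "t": 10.0},
    {"x": 9, "y": 9, "t": 0.0},
]
print(split_mse(rows, "t", guillotine_test("x", 3)))
print(split_mse(rows, "t", rectangle_test("x", "y", 4, 6, 4, 7)))  # 0.0
```

On this toy data no single threshold on `x` separates the high values from the low ones, whereas one rectangle does, which is the kind of situation a region test is meant to handle.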

The contributions of this paper are as follows. We present an efficient algorithm for computing the region R in the family of grid-regions that minimizes the mean squared error after the test with R is introduced. Experiments confirmed that region splitting yields regression trees with a smaller mean squared error. Our approach can also generate smaller regression trees.
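To make the optimization target concrete, here is a brute-force sketch. It is not the efficient algorithm the paper presents. It assumes the plane has already been bucketed into a grid with a tuple count and a sum of objective values per cell, and it restricts regions to axis-parallel rectangles of cells. The function name `best_rectangle` and its inputs are hypothetical. It relies on the standard identity that the squared error after a split equals the fixed sum of squared values minus S_in²/n_in + S_out²/n_out, so minimizing the mean squared error is the same as maximizing that gain.

```python
def best_rectangle(cell_count, cell_sum):
    """Exhaustively search all rectangles of grid cells.

    cell_count[i][j]: number of tuples in cell (i, j)
    cell_sum[i][j]:   sum of their objective attribute values
    Returns ((i1, i2, j1, j2), gain) with inclusive index ranges, where
    gain = S_in^2 / n_in + S_out^2 / n_out is maximal (MSE is minimal).
    """
    n_rows, n_cols = len(cell_count), len(cell_count[0])

    def prefix(grid):
        # 2D prefix sums, so any rectangle total costs O(1) to look up.
        p = [[0.0] * (n_cols + 1) for _ in range(n_rows + 1)]
        for i in range(n_rows):
            for j in range(n_cols):
                p[i + 1][j + 1] = grid[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
        return p

    def box(p, i1, i2, j1, j2):
        return p[i2 + 1][j2 + 1] - p[i1][j2 + 1] - p[i2 + 1][j1] + p[i1][j1]

    pc, ps = prefix(cell_count), prefix(cell_sum)
    total_n, total_s = pc[n_rows][n_cols], ps[n_rows][n_cols]

    best, best_gain = None, float("-inf")
    for i1 in range(n_rows):
        for i2 in range(i1, n_rows):
            for j1 in range(n_cols):
                for j2 in range(j1, n_cols):
                    n_in = box(pc, i1, i2, j1, j2)
                    n_out = total_n - n_in
                    if n_in == 0 or n_out == 0:
                        continue  # not a real split: one side is empty
                    s_in = box(ps, i1, i2, j1, j2)
                    s_out = total_s - s_in
                    gain = s_in ** 2 / n_in + s_out ** 2 / n_out
                    if gain > best_gain:
                        best, best_gain = (i1, i2, j1, j2), gain
    return best, best_gain
```

For a G × G grid this enumerates on the order of G⁴ rectangles, which is only practical for coarse grids and is exactly the kind of cost an efficient algorithm is meant to avoid.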

*Copyright © 1997 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.*

- [ACKT96]
- Tetsuo Asano, Danny Z. Chen, Naoki Katoh, Takeshi Tokuyama: Polynomial-Time Solutions to Image Segmentation. SODA 1996: 104-113
- [AIS93]
- Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216
- [AS94]
- Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
- [BFOS84]
- Leo Breiman, J. H. Friedman, R. A. Olshen, C. J. Stone: Classification and Regression Trees. Wadsworth 1984, ISBN 0-534-98053-8
- [FMMT96a]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Mining Optimized Association Rules for Numeric Attributes. PODS 1996: 182-191
- [FMMT96b]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization. SIGMOD Conference 1996: 13-23
- [FMMT96c]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Constructing Efficient Decision Trees by Using Optimized Numeric Association Rules. VLDB 1996: 146-155
- [HF95]
- Jiawei Han, Yongjian Fu: Discovery of Multiple-Level Association Rules from Large Databases. VLDB 1995: 420-431
- [MAR96]
- Manish Mehta, Rakesh Agrawal, Jorma Rissanen: SLIQ: A Fast Scalable Classifier for Data Mining. EDBT 1996: 18-32
- [PCY95]
- Jong Soo Park, Ming-Syan Chen, Philip S. Yu: An Effective Hash Based Algorithm for Mining Association Rules. SIGMOD Conference 1995: 175-186
- [PS91]
- Gregory Piatetsky-Shapiro, William J. Frawley (Eds.): Knowledge Discovery in Databases. AAAI/MIT Press 1991, ISBN 0-262-62080-4
- [PSF91]
- Gregory Piatetsky-Shapiro: Discovery, Analysis, and Presentation of Strong Rules. Knowledge Discovery in Databases 1991: 229-248
- [Qui86]
- J. Ross Quinlan: Induction of Decision Trees. Machine Learning 1(1): 81-106 (1986)
- [Qui93]
- J. Ross Quinlan: C4.5: Programs for Machine Learning. Morgan Kaufmann 1993, ISBN 1-55860-238-0
- [SA96]
- Ramakrishnan Srikant, Rakesh Agrawal: Mining Quantitative Association Rules in Large Relational Tables. SIGMOD Conference 1996: 1-12
- [YFM+97]
- Kunikazu Yoda, Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Computing Optimized Rectilinear Regions for Association Rules. KDD 1997: 96-103

- Shinichi Morishita, Jun Sese: Traversing Itemset Lattice with Statistical Metric Pruning. PODS 2000: 226-236
- Takeshi Fukuda, Hirofumi Matsuzawa: Parallel Processing of Multiple Aggregate Queries on Shared-Nothing Multiprocessors. EDBT 1998: 278-292
