An Interval Classifier for Database Mining Applications.

Rakesh Agrawal, Sakti P. Ghosh, Tomasz Imielinski, Balakrishna R. Iyer, Arun N. Swami: An Interval Classifier for Database Mining Applications. VLDB 1992: 560-573
  author    = {Rakesh Agrawal and
               Sakti P. Ghosh and
               Tomasz Imielinski and
               Balakrishna R. Iyer and
               Arun N. Swami},
  editor    = {Li-Yan Yuan},
  title     = {An Interval Classifier for Database Mining Applications},
  booktitle = {18th International Conference on Very Large Data Bases, August
               23-27, 1992, Vancouver, Canada, Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1992},
  isbn      = {1-55860-151-1},
  pages     = {560-573},
  ee        = {db/conf/vldb/AgrawalGIIS92.html},
  crossref  = {DBLP:conf/vldb/92},
  bibsource = {DBLP,}


We are given a large population database that contains information about population instances. The population is known to comprise m groups, but the population instances are not labeled with the group identification. Also given is a population sample (much smaller than the population but representative of it) in which the group labels of the instances are known. We present an interval classifier (IC) which generates a classification function for each group that can be used to efficiently retrieve all instances of the specified group from the population database. To allow IC to be embedded in interactive loops to answer ad hoc queries about attributes with missing values, IC has been designed to be efficient in the generation of classification functions. Preliminary experimental results indicate that IC not only has retrieval and classifier generation efficiency advantages, but also compares favorably in classification accuracy with current tree classifiers, such as ID3, which were primarily designed for minimizing classification errors. We also describe some new applications that arise from encapsulating the classification capability in database systems and discuss extensions to IC for it to be used in these new application domains.

Copyright © 1992 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Printed Edition

Li-Yan Yuan (Ed.): 18th International Conference on Very Large Data Bases, August 23-27, 1992, Vancouver, Canada, Proceedings. Morgan Kaufmann 1992, ISBN 1-55860-151-1


References

Dina Bitton, David J. DeWitt, Carolyn Turbyfill: Benchmarking Database Systems A Systematic Approach. VLDB 1983: 8-19
Leo Breiman, J. H. Friedman, R. A. Olshen, C. J. Stone: Classification and Regression Trees. Wadsworth 1984, ISBN 0-534-98053-8
Laurent Hyafil, Ronald L. Rivest: Constructing Optimal Binary Decision Trees is NP-Complete. Inf. Process. Lett. 5(1): 15-17(1976)
Anil K. Jain, Richard C. Dubes: Algorithms for Clustering Data. Prentice-Hall 1988
Ravi Krishnamurthy, Tomasz Imielinski: Research Directions in Knowledge Discovery. SIGMOD Record 20(3): 76-78(1991)
J. Ross Quinlan: Induction of Decision Trees. Machine Learning 1(1): 81-106(1986)
J. Ross Quinlan, Ronald L. Rivest: Inferring Decision Trees Using the Minimum Description Length Principle. Inf. Comput. 80(3): 227-248(1989)
Gregory Piatetsky-Shapiro, William J. Frawley (Eds.): Knowledge Discovery in Databases. AAAI/MIT Press 1991, ISBN 0-262-62080-4
Shalom Tsur: Data Dredging. IEEE Data Eng. Bull. 13(4): 58-63(1990)

Referenced by

  1. Sunil Choenni: Design and Implementation of a Genetic-Based Algorithm for Data Mining. VLDB 2000: 33-42
  2. Rakesh Agrawal, Ramakrishnan Srikant: Privacy-Preserving Data Mining. SIGMOD Conference 2000: 439-450
  3. Edwin M. Knorr, Raymond T. Ng: Finding Intensional Knowledge of Distance-Based Outliers. VLDB 1999: 211-222
  4. Wen-Chi Hou: A Framework for Statistical Data Mining with Summary Tables. SSDBM 1999: 14-23
  5. Johannes Gehrke, Venkatesh Ganti, Raghu Ramakrishnan, Wei-Yin Loh: BOAT-Optimistic Decision Tree Construction. SIGMOD Conference 1999: 169-180
  6. Mohammed Javeed Zaki, Ching-Tien Ho, Rakesh Agrawal: Parallel Classification for Data Mining on Shared-Memory Multiprocessors. ICDE 1999: 198-205
  7. Ming-Syan Chen, Jong Soo Park, Philip S. Yu: Efficient Data Mining for Path Traversal Patterns. IEEE Trans. Knowl. Data Eng. 10(2): 209-221(1998)
  8. Rajeev Rastogi, Kyuseok Shim: PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning. VLDB 1998: 404-415
  9. Edwin M. Knorr, Raymond T. Ng: Algorithms for Mining Distance-Based Outliers in Large Datasets. VLDB 1998: 392-403
  10. Johannes Gehrke, Raghu Ramakrishnan, Venkatesh Ganti: RainForest - A Framework for Fast Decision Tree Construction of Large Datasets. VLDB 1998: 416-427
  11. KianSing Ng, Huan Liu, HweeBong Kwah: A Data Mining Application: Customer Retention at the Port of Singapore Authority (PSA). SIGMOD Conference 1998: 522-525
  12. Sunil Choenni: On the Suitability of Genetic-Based Algorithms for Data Mining. ER Workshops 1998: 55-67
  13. Andrew K. C. Wong, Yang Wang: High-Order Pattern Discovery from Discrete-Valued Data. IEEE Trans. Knowl. Data Eng. 9(6): 877-893(1997)
  14. Jong Soo Park, Ming-Syan Chen, Philip S. Yu: Using a Hash-Based Method with Transaction Trimming for Mining Association Rules. IEEE Trans. Knowl. Data Eng. 9(5): 813-825(1997)
  15. Brian Lent, Arun N. Swami, Jennifer Widom: Clustering Association Rules. ICDE 1997: 220-231
  16. David Wai-Lok Cheung, Sau Dan Lee, Ben Kao: A General Incremental Technique for Maintaining Discovered Association Rules. DASFAA 1997: 185-194
  17. Edwin M. Knorr, Raymond T. Ng: Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining. IEEE Trans. Knowl. Data Eng. 8(6): 884-897(1996)
  18. Wen-Chi Hou: Extraction and Applications of Statistical Relationships in Relational Databases. IEEE Trans. Knowl. Data Eng. 8(6): 939-945(1996)
  19. David Wai-Lok Cheung, Vincent T. Y. Ng, Ada Wai-Chee Fu, Yongjian Fu: Efficient Mining of Association Rules in Distributed Databases. IEEE Trans. Knowl. Data Eng. 8(6): 911-922(1996)
  20. Ming-Syan Chen, Jiawei Han, Philip S. Yu: Data Mining: An Overview from a Database Perspective. IEEE Trans. Knowl. Data Eng. 8(6): 866-883(1996)
  21. John C. Shafer, Rakesh Agrawal, Manish Mehta: SPRINT: A Scalable Parallel Classifier for Data Mining. VLDB 1996: 544-555
  22. Rosa Meo, Giuseppe Psaila, Stefano Ceri: A New SQL-like Operator for Mining Association Rules. VLDB 1996: 122-133
  23. Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Constructing Efficient Decision Trees by Using Optimized Numeric Association Rules. VLDB 1996: 146-155
  24. Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization. SIGMOD Conference 1996: 13-23
  25. Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Mining Optimized Association Rules for Numeric Attributes. PODS 1996: 182-191
  26. I-Min A. Chen: Query Answering Using Discovered Rules. ICDE 1996: 402-411
  27. Manish Mehta, Rakesh Agrawal, Jorma Rissanen: SLIQ: A Fast Scalable Classifier for Data Mining. EDBT 1996: 18-32
  28. Hongjun Lu, Rudy Setiono, Huan Liu: NeuroRule: A Connectionist Approach to Data Mining. VLDB 1995: 478-489
  29. Jong Soo Park, Ming-Syan Chen, Philip S. Yu: An Effective Hash Based Algorithm for Mining Association Rules. SIGMOD Conference 1995: 175-186
  30. Maurice A. W. Houtsma, Arun N. Swami: Set-Oriented Mining for Association Rules in Relational Databases. ICDE 1995: 25-33
  31. Show-Jane Yen, Arbee L. P. Chen: An Efficient Algorithm for Deriving Compact Rules from Databases. DASFAA 1995: 364-371
  32. Jong Soo Park, Ming-Syan Chen, Philip S. Yu: Efficient Parallel Data Mining for Association Rules. CIKM 1995: 31-36
  33. Raymond T. Ng, Jiawei Han: Efficient and Effective Clustering Methods for Spatial Data Mining. VLDB 1994: 144-155
  34. Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
  35. Jason Tsong-Li Wang, Gung-Wei Chirn, Thomas G. Marr, Bruce A. Shapiro, Dennis Shasha, Kaizhong Zhang: Combinatorial Pattern Discovery for Scientific Data: Some Preliminary Results. SIGMOD Conference 1994: 115-125
  36. Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos: Fast Subsequence Matching in Time-Series Databases. SIGMOD Conference 1994: 419-429
  37. Rakesh Agrawal, Michael J. Carey, Christos Faloutsos, Sakti P. Ghosh, Maurice A. W. Houtsma, Tomasz Imielinski, Balakrishna R. Iyer, A. Mahboob, H. Miranda, Ramakrishnan Srikant, Arun N. Swami: Quest: A Project on Database Mining. SIGMOD Conference 1994: 514
  38. Rakesh Agrawal: Tutorial Database Mining. PODS 1994: 75-76
  39. Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Database Mining: A Performance Perspective. IEEE Trans. Knowl. Data Eng. 5(6): 914-925(1993)
  40. Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216