An Extensible Classifier for Semi-Structured Documents.
Markus Tresch, Allen Luniewski:
CIKM 1995: 226-233
@inproceedings{DBLP:conf/cikm/TreschL95,
author = {Markus Tresch and
Allen Luniewski},
title = {An Extensible Classifier for Semi-Structured Documents},
booktitle = {CIKM '95, Proceedings of the 1995 International Conference on
Information and Knowledge Management, November 28 - December
2, 1995, Baltimore, Maryland, USA},
publisher = {ACM},
year = {1995},
pages = {226-233},
ee = {db/conf/cikm/TreschL95.html, http://doi.acm.org/10.1145/221270.221575},
crossref = {DBLP:conf/cikm/95},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
In this paper, we present a vector space classifier for determining the type of semi-structured documents.
Our goal was to design a high-performance classifier in terms of accuracy (recall and precision), speed, and flexibility.
The ability to dynamically extend a classifier with user-specific classes is crucial for many applications.
Unfortunately, the training data of existing classes is often not available, so the extended classifier becomes imprecise.
We focus on this issue. First, we evaluate how to create class abstracts that can be used as a replacement for training data. Second, we introduce relevance feedback learning strategies to overcome the remaining classifier flaws.
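To make the ideas in the abstract concrete, the following is a minimal sketch of a vector space classifier that can be extended with a new class and corrected through feedback. It is not the authors' implementation. The function names, the whitespace tokenizer, the use of a stored centroid as a stand-in for a "class abstract", and the Rocchio-style update with weights `beta` and `gamma` are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_vector(text):
    """Term-frequency vector over lowercase whitespace tokens (assumed tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(docs):
    """Average term-frequency vector of a class's training documents.
    Assumption: this stored vector plays the role of a class abstract,
    so the original training documents need not be kept."""
    total = Counter()
    for d in docs:
        total.update(tf_vector(d))
    return {t: w / len(docs) for t, w in total.items()}

def classify(doc, centroids):
    """Return the class whose centroid is most similar to the document."""
    v = tf_vector(doc)
    return max(centroids, key=lambda c: cosine(v, centroids[c]))

def feedback(centroids, doc, correct, predicted, beta=0.5, gamma=0.25):
    """Rocchio-style correction (hypothetical weights): move the correct
    class toward the document and the wrongly predicted class away from it."""
    for t, w in tf_vector(doc).items():
        centroids[correct][t] = centroids[correct].get(t, 0.0) + beta * w
        if predicted != correct:
            old = centroids[predicted].get(t, 0.0)
            centroids[predicted][t] = max(0.0, old - gamma * w)

# Existing classes, represented only by their stored centroids.
centroids = {
    "mail": centroid(["from to subject date", "from subject reply to"]),
    "c_source": centroid(["include int main return", "include void return int"]),
}

# Extend the classifier with a user-specific class; no old training data is needed.
centroids["latex"] = centroid(["documentclass begin document end"])

doc = "begin end return"
before = classify(doc, centroids)
# The user reports that the document actually belongs to "c_source".
feedback(centroids, doc, correct="c_source", predicted=before)
after = classify(doc, centroids)
```

In this toy setup the new class initially attracts a document that belongs to an existing class, and a single feedback step is enough to change the decision. Real collections would need term weighting and many more feedback examples.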
Copyright © 1995 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Printed Edition
CIKM '95, Proceedings of the 1995 International Conference on Information and Knowledge Management, November 28 - December 2, 1995, Baltimore, Maryland, USA.
ACM 1995
References
- [BFOS84]
- Leo Breiman, J. H. Friedman, R. A. Olshen, C. J. Stone:
Classification and Regression Trees.
Wadsworth 1984, ISBN 0-534-98053-8
- [GRW84]
- ...
- [Har92]
- Donna Harman:
Relevance Feedback Revisited.
SIGIR 1992: 1-10
- [Hoc94]
- Rainer Hoch:
Using IR Techniques for Text Classification in Document Analysis.
SIGIR 1994: 31-40
- [Hon94]
- ...
- [Ide71]
- ...
- [Jam85]
- Mike James:
Classification Algorithms.
John Wiley 1985, ISBN 0-471-84799-2
- [Jon71]
- ...
- [MBK91]
- Yoëlle S. Maarek, Daniel M. Berry, Gail E. Kaiser:
An Information Retrieval Approach For Automatically Constructing Software Libraries.
IEEE Trans. Software Eng. 17(8): 800-813 (1991)
- [ODL93]
- Katia Obraczka, Peter B. Danzig, Shih-Hao Li:
Internet Resource Discovery Services.
IEEE Computer 26(9): 8-22 (1993)
- [Qui93]
- J. Ross Quinlan:
C4.5: Programs for Machine Learning.
Morgan Kaufmann 1993, ISBN 1-55860-238-0
- [Roc71]
- ...
- [SB90]
- ...
- [SLS+93]
- Kurt A. Shoens, Allen Luniewski, Peter M. Schwarz, James W. Stamos, Joachim Thomas II:
The Rufus System: Information Organization for Semi-Structured Data.
VLDB 1993: 97-107
- [SWY75]
- Gerard Salton, A. Wong, C. S. Yang:
A Vector Space Model for Automatic Indexing.
Commun. ACM 18(11): 613-620 (1975)
- [TPL94]
- Markus Tresch, Neal Palmer, Allen Luniewski:
Type Classification of Semi-Structured Documents.
VLDB 1995: 263-274
- [vR79]
- C. J. van Rijsbergen:
Information Retrieval.
Butterworth 1979, ISBN 0-408-70929-4
CIKM 1995 Proceedings, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:01:49 2009