Multiway-Tree Retrieval Based on Treegrams.
Hans Argenton, Ulrich Güntzer:
Multiway-Tree Retrieval Based on Treegrams.
ADBIS 1997: 189-195
@inproceedings{DBLP:conf/adbis/ArgentonG97,
author = {Hans Argenton and
Ulrich G{\"u}ntzer},
title = {Multiway-Tree Retrieval Based on Treegrams},
booktitle = {Proceedings of the First East-European Symposium on Advances
in Databases and Information Systems (ADBIS'97), St.-Petersburg,
September 2-5, 1997. Volume 1: Regular Papers},
publisher = {Nevsky Dialect},
year = {1997},
pages = {189-195},
ee = {db/conf/adbis/ArgentonG97.html},
crossref = {DBLP:conf/adbis/97},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Large tree databases are becoming increasingly important as knowledge repositories;
prominent examples are the treebanks of computational linguistics:
text corpora of up to five million words tagged with syntactic
information. Such large amounts of structured data pose the
problem of fast tree retrieval: given a database T of labeled multiway trees and
a query tree q, efficiently find all trees t in T that contain q as a
subtree. This paper presents a generalization of the classical n-gram
indexing technique that supports fast retrieval of multiway tree structures:
Treegram indexing covers database trees with subtrees of fixed height;
each entry of the resulting index represents such a subtree together with the
database trees that contain this subtree. Evaluating a query q first
preselects the database trees that contain all of q's cover trees and
then tests these candidates rigorously for containment of q.
As an application of treegram indexing, we describe the Venona retrieval system,
which handles the BHt treebank of 508,650 phrase structure
trees found in the morphosyntactic analysis of the Old Testament, with
3.3 million word forms altogether. The treebank is the result of a
computational-linguistics project at the Ludwig Maximilian University of Munich.
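
The filter-and-verify idea described in the abstract can be illustrated with a minimal Python sketch. It is not the Venona implementation. The abstract does not specify how cover trees are built or which notion of containment the paper uses, so the sketch makes its own simplifying assumptions: a tree is a pair (label, ordered children); a treegram is the truncation of a node's subtree to exactly h levels, and nodes whose subtree is shallower than h yield none; q is contained in t if q matches top-down at some node of t, where every inner query node must have exactly the same ordered children as its image. All function names are hypothetical.

```python
from collections import defaultdict

# Assumption: a tree is (label, [children]) with ordered children.

def nodes(tree):
    """Yield every node of the tree in preorder."""
    yield tree
    for child in tree[1]:
        yield from nodes(child)

def treegram(node, h):
    """Canonical tuple for the top h levels below node,
    or None if a leaf is reached before depth h."""
    label, children = node
    if h == 0:
        return (label,)
    if not children:
        return None
    parts = [treegram(c, h - 1) for c in children]
    if any(p is None for p in parts):
        return None
    return (label, *parts)

def treegrams(tree, h):
    """Set of all height-h treegrams occurring in the tree."""
    grams = (treegram(n, h) for n in nodes(tree))
    return {g for g in grams if g is not None}

def build_index(db, h):
    """Map each treegram to the ids of the database trees containing it."""
    index = defaultdict(set)
    for tid, tree in db.items():
        for g in treegrams(tree, h):
            index[g].add(tid)
    return index

def matches_at(q, t):
    """Top-down match: equal labels; a query leaf matches any node with
    that label; an inner query node needs identical ordered children."""
    if q[0] != t[0]:
        return False
    if not q[1]:
        return True
    return len(q[1]) == len(t[1]) and all(
        matches_at(a, b) for a, b in zip(q[1], t[1]))

def contains(t, q):
    """Rigorous containment test under the assumed definition."""
    return any(matches_at(q, n) for n in nodes(t))

def retrieve(db, index, q, h):
    """Preselect trees holding all of q's treegrams, then verify each."""
    candidates = set(db)
    for g in treegrams(q, h):
        candidates &= index.get(g, set())
    return sorted(tid for tid in candidates if contains(db[tid], q))

# Tiny hypothetical database of phrase structure trees.
db = {
    1: ("S", [("NP", [("N", [])]),
              ("VP", [("V", []), ("NP", [("N", [])])])]),
    2: ("S", [("NP", [("Det", []), ("N", [])]), ("VP", [("V", [])])]),
    3: ("NP", [("N", [])]),
}
index = build_index(db, 1)
query = ("VP", [("V", []), ("NP", [])])
print(retrieve(db, index, query, 1))
```

Under these assumptions the preselection never discards a true match, because every treegram of the query reappears unchanged at the matching position in the database tree. A query shallower than h has no treegrams, so every database tree remains a candidate and only the rigorous test decides.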
Copyright © 1997 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.