RIDE 2001

Towards Self-Validating Knowledge-Based Archives


Bertram Ludäscher, Richard Marciano, and Reagan Moore




Abstract

Digital archives are dedicated to the long-term preservation of electronic information and have the mandate to enable sustained access despite a rapidly changing information infrastructure. Current archival approaches build upon standardized data formats and simple metadata mechanisms for collection management, but do not involve high-level conceptual models and knowledge representations. This results in serious limitations, not only for expressing various kinds of information and knowledge about the archived data, but also for creating infrastructure-independent, self-validating, and self-instantiating archives. To overcome these limitations, we first propose a scalable XML-based archival infrastructure, based on standard tools, and subsequently show how this architecture can be extended to a model-based framework, where higher-level knowledge representations become an integral part of the archive and of the ingestion/migration processes. This allows us to maximize infrastructure independence by archiving generic, executable specifications of (i) archival constraints (i.e., model validators) and (ii) archival transformations that are part of the ingestion process. The proposed architecture facilitates the construction of self-validating and self-instantiating knowledge-based archives. We illustrate our overall approach and report on first experiences using a sample collection from a collaboration with the National Archives and Records Administration (NARA).
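
To make the idea of a self-validating archive concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical XML layout in which the constraints are stored alongside the collection, and a hypothetical two-rule constraint vocabulary (`required`, `unique`) that is far simpler than a full knowledge representation. The element names, attribute names, and the `validate` function are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive layout (an assumption for illustration): the
# constraints travel with the data they describe, so a future system
# needs only a generic rule interpreter to re-check the collection.
ARCHIVE_XML = """
<archive>
  <constraints>
    <required element="record" child="title"/>
    <unique element="record" attribute="id"/>
  </constraints>
  <collection>
    <record id="r1"><title>Sample record A</title></record>
    <record id="r2"><title>Sample record B</title></record>
  </collection>
</archive>
"""


def validate(archive):
    """Check the collection against the constraints archived with it.

    Returns a list of human-readable violation messages (empty if valid).
    """
    violations = []
    collection = archive.find("collection")
    for rule in archive.find("constraints"):
        elements = collection.findall(rule.get("element"))
        if rule.tag == "required":
            # Every matching element must contain the named child.
            child = rule.get("child")
            for el in elements:
                if el.find(child) is None:
                    violations.append(f"{el.tag} missing <{child}>")
        elif rule.tag == "unique":
            # The named attribute must not repeat across matching elements.
            attr = rule.get("attribute")
            seen = set()
            for el in elements:
                value = el.get(attr)
                if value in seen:
                    violations.append(f"duplicate {attr}={value!r}")
                seen.add(value)
    return violations


archive = ET.fromstring(ARCHIVE_XML)
violations = validate(archive)
```

Because the rules are data rather than code tied to one platform, the same archive can in principle be re-validated after migration by any system that can interpret the rule vocabulary, which is the sense of infrastructure independence the abstract describes.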


DiSC'02 © 2003 Association for Computing Machinery